Runtime API
These endpoints expose or mutate daemon-wide runtime state. They are operational controls, not per-run settings.
Endpoint inventory
Inspection:
- GET /v1/status
- GET /v1/capabilities
- GET /v1/runtime
- GET /v1/runtime/learning-policy
- GET /v1/runtime/run-memory-policy
- GET /v1/runtime/tool-limits
- GET /v1/runtime/secrets
- GET /v1/runtime/secrets/{secret_ref}
- GET /v1/runtime/auth/accounts
- GET /v1/runtime/auth/accounts/{slot_id}
- GET /v1/runtime/auth/subjects/{subject_id}
- GET /v1/runtime/auth/leases/{lease_id}
- GET /v1/runtime/connectors
- GET /v1/runtime/connectors/external/metrics
- GET /v1/runtime/connectors/{kind}/{name}
- GET /v1/runtime/hooks
- GET /v1/runtime/revisions
- GET /v1/skills
- GET /v1/skills/{skill_name}
Mutation:
- POST /v1/runtime/model
- POST /v1/runtime/learning-policy
- POST /v1/runtime/run-memory-policy
- POST /v1/runtime/tool-limits
- POST /v1/runtime/secrets
- POST /v1/runtime/auth/accounts
- POST /v1/runtime/auth/accounts/mcp-oauth
- POST /v1/runtime/auth/accounts/{slot_id}/refresh
- POST /v1/runtime/auth/accounts/{slot_id}/revoke
- POST /v1/runtime/auth/subjects/{subject_id}/revoke
- POST /v1/runtime/auth/leases/{lease_id}/revoke
- DELETE /v1/runtime/auth/accounts/{slot_id}
- DELETE /v1/runtime/secrets/{secret_ref}
- POST /v1/runtime/permission-mode
- POST /v1/runtime/permissions/check
- POST /v1/runtime/system-prompt
- POST /v1/runtime/hooks
- POST /v1/runtime/debug-level
- POST /v1/runtime/rollback
- PUT /v1/runtime/connectors/{kind}/{name}
- DELETE /v1/runtime/connectors/{kind}/{name}
Streaming:
- GET /v1/events/stream
GET /v1/status and GET /v1/runtime keep hook shape but redact executor secrets on read-only surfaces.
POST /v1/runtime/permissions/check
Dry-runs one permission decision without running hooks, creating approvals, executing a tool, or persisting audit records.
Request body:
The response is a PermissionExplanation with the effective mode, decision (allow, ask, or deny), optional base_decision, optional mode_applied, optional mode_effect, optional matched_rule, approval_required, scope, optional reason, audit_preview, hooks_evaluated=false, and hook_policy=not_executed_in_dry_run. base_decision is the static rule decision before permission mode transformations, and matched_rule is selected before mode application. base_decision and mode_applied are optional for mixed-version CLI/daemon compatibility; current daemons include them. If session_id is supplied and no such session exists, the endpoint returns 404 session_not_found.
GET /v1/status
This is the daemon-owned operator snapshot used by kheish-daemon status and kheish-daemon doctor.
The response preserves the original top-level readiness contract:
- status: ready or draining
- ready: boolean readiness flag
- capabilities: the same feature advertisement returned by /v1/capabilities
The snapshot also includes:
- runtime: current route, model, permission mode, redacted hook settings, MCP, skills, tool runtime limits, and debug level
- run_memory: recovered run-memory policy, indexed counts, stale indexed count, and monotonic counters
- session_memory: prompt-visible learned-context counters, including final runtime prompt-budget omissions
- sessions.total
- runs: lifecycle counts, pending approval/question counts, max queue depth, oldest queued-run lag, oldest non-terminal run age, oldest non-terminal idle time, and a bounded sample of runs idle past the stale threshold
- schedules: lifecycle counts, due/backoff counts, queued fire count, in-flight count, and next due time
- delivery: output delivery queue counts, target circuit/backpressure counts, next retry timing, worker heartbeat, worker_lag_ms, plus status_error_count and status_error when persisted delivery ledgers or the derived status summary need operator attention
- agents: status counts, live runtime count, sidechain count, closed count, terminal snapshot count, and mailbox message count
- tasks: live background shell task count plus durable task counts for pending, in_progress, blocked, completed, failed, and cancelled
- storage: write-health probes for the daemon state root and workspace root
- provider_readiness: per-route provider/account readiness derived from route config and referenced auth slots
- control_plane: bind/auth/CORS posture, including effective admin/read-only token availability and redacted token-file load errors
- events: SSE replay-buffer capacity, retained count, oldest/newest/next event ids, subscriber count, replay cursor gap count, live stream lag count, and retained per-session/per-run eviction metadata counts
- health: aggregate route counts, scheduler lag, snapshot build duration, and operator warnings with stable severity, code, message, optional related_id, and optional action; inline provider credentials are reported as an info warning so supported inline-key route files remain healthy
storage.probes[] performs a bounded create/write/sync/delete probe off the async runtime. Each probe has a timeout; a failed or timed-out probe sets storage.ok=false, increments storage.write_error_count, and emits a storage_write_probe_failed health warning with an action that points at the affected startup flag.
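A bounded create/write/sync/delete probe of this kind can be sketched in Python. This is an illustrative sketch only, not the daemon's implementation: the function names, the result keys (root, ok, error), and the two-second default timeout are assumptions made for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as ProbeTimeout


def write_probe(root: str) -> dict:
    """Create, write, sync, and delete one small file under `root`."""
    path = None
    try:
        fd, path = tempfile.mkstemp(prefix=".write-probe-", dir=root)
        try:
            os.write(fd, b"probe")
            os.fsync(fd)  # force the bytes to stable storage
        finally:
            os.close(fd)
        os.remove(path)
        return {"root": root, "ok": True, "error": None}
    except OSError as exc:
        # Best-effort cleanup so a failed probe does not leave debris behind.
        if path is not None and os.path.exists(path):
            try:
                os.remove(path)
            except OSError:
                pass
        return {"root": root, "ok": False, "error": str(exc)}


def probe_with_timeout(root: str, timeout_s: float = 2.0) -> dict:
    """Run the probe on a worker thread and give up after `timeout_s` seconds."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(write_probe, root)
    try:
        return future.result(timeout=timeout_s)
    except ProbeTimeout:
        return {"root": root, "ok": False, "error": "timeout"}
    finally:
        pool.shutdown(wait=False)  # do not block the caller on a stuck probe
```

Running the probe on a worker thread keeps a slow or hung filesystem from stalling the caller, which is the same reason the text gives for running probes off the async runtime.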
storage.state_root_lock reports the daemon lock file, whether the current daemon owns it, and the platform mechanism such as flock. On older daemons this field can be absent, so Doctor treats absence as compatibility rather than a hard failure.
provider_readiness.routes[] reports each route id, provider, model, capability matrix, whether it is the active route, the auth reference, and readiness state:
- ok: the route has inline startup credentials, a matching healthy auth slot, or it is a non-authenticated internal/scripted provider
- warning: credentials are usable but close to expiry or the last refresh outcome was not successful
- error: the route references a missing slot, a mismatched provider, an expired credential, or an unreadable auth status
health.warnings[] includes queued_run_lag when the oldest queued run exceeds the status threshold, stale_non_terminal_runs when active/waiting work has been idle past the stale threshold, provider_active_route_not_ready when the active route is not usable even if the route inventory did not emit a more specific provider-readiness error, event_stream_lagged when live SSE consumers skipped broadcast events, delivery_status_unavailable when delivery ledgers cannot be summarized, and control-plane auth warnings for missing effective admin tokens, unavailable token files, or duplicate effective tokens.
The default run-warning thresholds are 30 minutes. Operators can tune them with KHEISH_QUEUED_RUN_LAG_WARNING_THRESHOLD_MS and KHEISH_STALE_NON_TERMINAL_RUN_THRESHOLD_MS; the effective values are echoed in status.runs.queued_run_lag_threshold_ms and status.runs.stale_non_terminal_run_threshold_ms.
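As a rough illustration of the threshold handling described above, the sketch below resolves a threshold from an environment mapping and applies the "exceeds" comparison. The helper names are hypothetical, and the fallback to the default for unset, non-numeric, or zero values is an assumption; the source does not say how the daemon treats invalid values.

```python
DEFAULT_RUN_WARNING_THRESHOLD_MS = 30 * 60 * 1000  # 30 minutes

QUEUED_LAG_ENV = "KHEISH_QUEUED_RUN_LAG_WARNING_THRESHOLD_MS"
STALE_RUN_ENV = "KHEISH_STALE_NON_TERMINAL_RUN_THRESHOLD_MS"


def threshold_ms(env: dict, name: str) -> int:
    """Return the configured threshold in milliseconds, or the 30-minute default."""
    raw = env.get(name, "").strip()
    if not raw.isdecimal() or int(raw) == 0:
        # Assumption: unset, non-numeric, or zero values fall back to the default.
        return DEFAULT_RUN_WARNING_THRESHOLD_MS
    return int(raw)


def exceeds(observed_ms: int, threshold: int) -> bool:
    """A warning applies only when the observed value exceeds the threshold."""
    return observed_ms > threshold
```

In real use the mapping would be os.environ; passing a plain dict keeps the sketch easy to test.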
Delivery terminal counts are served from a derived summary keyed by append-only ledger byte offsets, so /v1/status does not re-scan historical delivery ledgers. If those offsets fall behind the ledgers, delivery.status_error_count and delivery.status_error flag the summary as stale until the daemon repairs it on the next terminal delivery mutation or restart.
Status collection also expires due structured questions before counting pending user questions, so the snapshot matches the question detail endpoints instead of reporting stale waits.
kheish-daemon doctor
doctor is the CLI-side diagnostic wrapper around /v1/status. It prints the status payload plus structured checks for:
- /readyz
- /v1/events/stream, including a parseable SSE heartbeat/event frame
- event replay counters from status.events, including lagged live SSE consumers and replay cursor gaps
- control-plane auth posture, including effective token-file availability and duplicate effective tokens
- CORS policy, and an actual browser-style preflight for POST plus Authorization/Content-Type when --cors-origin <origin> is provided
- route inventory and provider readiness
- storage probes and state-root lock ownership
- static hook configuration problems plus HTTP hook DNS target checks for private/local address resolution
- server-side health.warnings, preserving each warning action and related_id
doctor exits with 0 when no error-severity check is present, 1 when Doctor found daemon health errors, 2 for invalid requests or oversized payloads (400/413), 3 when the daemon cannot be reached, 4 for auth failures (401/403), 5 for missing resources (404), 6 for conflicts (409), 7 for retryable transport/status failures such as timeouts, 429, or 503, and 8 for status schema mismatch.
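The code-to-condition mapping above can be captured in a small lookup table, for example in a wrapper script that reacts to doctor results. This is a sketch under stated assumptions: the constant and function names are hypothetical, and statuses the text does not mention are deliberately left unmapped rather than guessed.

```python
from typing import Optional

EXIT_OK = 0
EXIT_HEALTH_ERRORS = 1
EXIT_UNREACHABLE = 3
EXIT_SCHEMA_MISMATCH = 8

# HTTP statuses that the text ties to a specific exit code.
HTTP_STATUS_EXIT_CODES = {
    400: 2, 413: 2,  # invalid request or oversized payload
    401: 4, 403: 4,  # auth failure
    404: 5,          # missing resource
    409: 6,          # conflict
    429: 7, 503: 7,  # retryable status failure
}


def exit_code_for_status(status: int) -> Optional[int]:
    """Return the documented exit code for an HTTP status, or None if unlisted."""
    return HTTP_STATUS_EXIT_CODES.get(status)


def exit_code_for_checks(severities: list) -> int:
    """Return 1 when any check has error severity, otherwise 0."""
    return EXIT_HEALTH_ERRORS if "error" in severities else EXIT_OK
```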
doctor routes --routes-file <path> validates a route file without starting the daemon. It reports inline-key warnings, missing api_key_env, missing explicit OpenAI/Anthropic auth files, invalid base_url values, unsupported capability overrides, unknown fields, and default-route problems. OpenAI Codex account-auth routes that try to re-enable image/audio/transcription capabilities are rejected here, matching serve.
Use doctor routes --routes-file <path> --default-route <route_id> when you want the diagnostic to mirror a planned serve --routes-file <path> --default-route <route_id> command. The override must name an existing route, and multi-route files still need a valid file-level default_route.
doctor routes --check-auth additionally checks referenced daemon auth slots against the running daemon, including missing slots, provider mismatches, expired credentials, and refresh warnings. For route files, it also checks implicit account-auth slots such as route.<route_id> for openai_auth_source = "codex" and anthropic_auth_source = "claude_code", or verifies that a source credentials file is available for startup import.
doctor routes --check-references checks the running daemon for persisted route references that no longer exist in the runtime route inventory. It currently inspects session route policies, non-terminal schedules, and non-terminal runs, and reports stale_session_route_policy, stale_schedule_route, or stale_run_route diagnostics with the missing route id. The option requires a running daemon and is rejected with --routes-file.
doctor routes --canary submits a tiny real run against the selected running route inventory, using the normal session/run path with the route id and model pinned on the run. It is opt-in because it reaches the provider and may spend tokens. Use --route <route_id> to limit the check, and --canary-timeout-ms <ms> to bound each run. Canary rows report passed, failed, or timeout; failures include the canary session/run ids so operators can inspect the exact run. Canary mode is rejected with --routes-file because static TOML validation cannot prove that the currently running daemon can reach a provider/model/base URL.
Run submission also checks the selected route readiness after route/model resolution and before persisting the run. If the effective route is already in an error state, such as a missing, mismatched, or expired auth_ref, the daemon rejects the request with application/problem+json, domain routes, code route_not_ready, and does not enqueue provider work. Scheduled dispatch uses the same guard, and persisted schedules pointing at a route removed after restart fail before a run is persisted. The daemon rechecks the pinned route immediately before run execution as well, so queued or waiting runs restored after restart fail on the missing pinned route instead of drifting to the active route. Warning states, such as credentials expiring soon, remain allowed.
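The admission guard described above can be modelled as a check that runs before anything is enqueued. The sketch below is a simplified illustration: the function names and the in-memory queue are hypothetical, and only the domain and code values come from the text.

```python
from typing import Optional


def admission_problem(readiness_state: str) -> Optional[dict]:
    """Return a problem body when the route must be rejected, else None."""
    if readiness_state == "error":
        return {"domain": "routes", "code": "route_not_ready"}
    # "ok" and "warning" (for example, credentials expiring soon) stay allowed.
    return None


def submit_run(queue: list, run: dict, readiness_state: str) -> Optional[dict]:
    """Enqueue the run only when the selected route passes the guard."""
    problem = admission_problem(readiness_state)
    if problem is None:
        queue.append(run)  # rejected runs never reach the queue
    return problem
```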
GET /v1/capabilities
This is the coarse daemon feature advertisement.
Current fields include:
- control_plane_version
- approvals
- sidechains
- mailboxes
- session_events
- restart_restore
- live_events
- api_revision
- route_capability_matrix_version
- sse_replay
- typed_sse_heartbeat
- openapi
- problem_details
- cursor_pagination
- paginated_lists
- domain_errors
- agent_supervisor_audit
- spawn_policies
OpenAPI and Error Contract
GET /v1/openapi.json returns the daemon HTTP contract as OpenAPI 3.1. The spec covers the control-plane routes, connector ingress routes, observation ingress routes, and health probes, including path/query/header parameters, security scheme selection, SSE content type, pagination controls, and the shared ProblemDetails response. Connector ingress security is documented by mechanism, including replay-protection timestamp headers for HMAC-style signatures; individual connector configs can explicitly enable unauthenticated ingress, but the OpenAPI contract does not advertise unauthenticated access as the default.
Daemon HTTP surfaces normalize non-success API errors to application/problem+json with stable code and optional domain. Control-plane errors use feature domains such as sessions, runs, runtime, assets, or pagination; observation upload and capture heartbeat ingress errors use observation_ingress; connector ingress errors use connector_ingress with connector-specific codes for rate-limit responses. Connector-specific success acknowledgements, such as Slack ignores or external-protocol per-item rejections inside a 200 batch response, keep their protocol JSON shape.
Cursor pagination is opt-in with page=true or cursor. Supplying only limit preserves the legacy JSON array shape while applying the bounded limit and the same limit=0 validation. limit values above the daemon maximum are clamped to the advertised maximum.
The CLI mirrors this contract: --limit alone prints the legacy array shape, while --page or --cursor prints the paginated envelope.
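The shape-selection and clamping rules above reduce to two small decisions. The sketch below is illustrative: the function names and the "array"/"envelope" labels are hypothetical, and daemon_max stands in for the advertised maximum, whose value the text does not state.

```python
from typing import Optional


def response_shape(page: bool = False, cursor: Optional[str] = None) -> str:
    """Pagination is opt-in: page=true or a cursor selects the paginated envelope."""
    return "envelope" if page or cursor is not None else "array"


def effective_limit(requested: int, daemon_max: int) -> int:
    """Limits above the daemon maximum are clamped, not rejected."""
    return min(requested, daemon_max)
```

A client that only ever sends limit can therefore keep parsing a bare JSON array, while still benefiting from the bounded limit.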
GET /v1/runtime
This is the main operator snapshot for a live daemon.
Important fields:
- default_route: the daemon-wide fallback route when no session or run override applies
- route_id: the active default route identifier
- provider and model: compatibility fields derived from the active default route
- routes: the full daemon route inventory
- learning_policy
- run_memory_policy
- permission_mode
- system_prompt
- hooks
- debug_level
- debug_capture: effective capture policy loaded from environment, including TTL/GC intervals, artifact/run/store budgets, encryption status/key id, and redaction token source status
- mcp
- skills
- config: durable runtime-config metadata (revision, updated_at_ms, persisted, history_len, history_limit, store_path)
Each routes entry includes route_id, auth_ref, and capability flags:
- matrix_version
- multimodal_input
- native_web_search
- image_generation
- image_edit
- audio_generation
- transcription
matrix_version is currently 2 and should match /v1/capabilities.route_capability_matrix_version for current daemons. A legacy route payload without this field deserializes as version 0.
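A client can apply the same rule when it reads route payloads. In this sketch the helper names are hypothetical; the field names and the version-0 default for legacy payloads come from the text above.

```python
def matrix_version(route: dict) -> int:
    """A legacy route payload without matrix_version reads as version 0."""
    return int(route.get("matrix_version", 0))


def matrix_matches(route: dict, capabilities: dict) -> bool:
    """Compare a route's matrix version with the daemon's advertised version."""
    advertised = capabilities.get("route_capability_matrix_version")
    return matrix_version(route) == advertised
```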
Durable runtime config
The daemon serializes live mutations for the default route/model, permission mode, system prompt, hooks, debug level, learning policy, run-memory policy, and tool runtime limits into runtime-config.json under the daemon state root. Successful writes append a monotonic revision and publish runtime_updated.
Runtime transactions gate runtime reads and route pinning while the commit phase applies and persists the new revision. New runs should pin either the previous durable route or the next durable route, never a partially applied mutation.
Mutation payloads for /model, /permission-mode, /system-prompt, /debug-level, /learning-policy, /run-memory-policy, and /tool-limits accept optional expected_revision. If the current revision differs, the daemon returns 409 application/problem+json with:
GET /v1/runtime/revisions returns the current revision plus retained history in descending revision order. Revision reads use the same runtime-config visibility gate as GET /v1/runtime, so operators do not observe revision history while a mutation is in its apply/persist window.
The daemon retains at most config.history_limit historical revisions plus the current revision. Older historical revisions are pruned when the limit is exceeded, so rollback targets must be present in GET /v1/runtime/revisions.
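The compare-and-swap guard and the retention rule can be modelled with a small in-memory history. This is a toy model for illustration, not the daemon's storage code: the class and method names are hypothetical, the exception stands in for the 409 response, and starting at revision 1 is an assumption.

```python
class RevisionConflict(Exception):
    """Stands in for the daemon's 409 response on a stale expected_revision."""


class RuntimeConfigHistory:
    def __init__(self, history_limit: int):
        self.history_limit = history_limit
        # Oldest first; the last entry is the current revision.
        self.revisions = [{"revision": 1, "setting": "initial"}]

    @property
    def current_revision(self) -> int:
        return self.revisions[-1]["revision"]

    def mutate(self, setting: str, expected_revision=None) -> int:
        """Append a revision, honouring the optional compare-and-swap guard."""
        if expected_revision is not None and expected_revision != self.current_revision:
            raise RevisionConflict(
                f"expected {expected_revision}, current is {self.current_revision}"
            )
        self.revisions.append(
            {"revision": self.current_revision + 1, "setting": setting}
        )
        # Keep the current revision plus at most history_limit historical ones.
        overflow = len(self.revisions) - (self.history_limit + 1)
        if overflow > 0:
            del self.revisions[:overflow]
        return self.current_revision

    def list_revisions(self) -> list:
        """Revision numbers in descending order, as the revisions endpoint lists them."""
        return [entry["revision"] for entry in reversed(self.revisions)]
```

With history_limit=2, three mutations leave revisions 4, 3, and 2 retained; revision 1 has been pruned and is no longer a valid rollback target.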
POST /v1/runtime/rollback accepts:
target_revision is optional; when omitted, the daemon restores the previous retained revision. Rollback appends a new revision with setting: "rollback" and rollback_of_revision set to the restored revision.
skip_hooks defaults to false. Set it to true only for operator recovery when a persisted config_change hook blocks normal runtime remediation. Forced rollback still appends a normal runtime-config revision and records source: "runtime_api_force".
If a config_change hook blocks a mutation, no runtime-config revision is appended and the daemon returns 409 with runtime/runtime_change_blocked.
Rollback to an unknown retained revision returns 404 with runtime/runtime_revision_not_found.
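The rollback rules can be sketched over the same kind of retained history. This is a simplified illustration: the function name, the config key, and the use of LookupError in place of the 404 response are assumptions, and hook evaluation is left out entirely.

```python
from typing import Optional


def rollback(revisions: list, target_revision: Optional[int] = None) -> dict:
    """Append a rollback revision to `revisions` (oldest first, current last)."""
    current = revisions[-1]
    if target_revision is None:
        # Default target: the previous retained revision.
        if len(revisions) < 2:
            raise LookupError("runtime_revision_not_found")
        target = revisions[-2]
    else:
        matches = [r for r in revisions if r["revision"] == target_revision]
        if not matches:
            raise LookupError("runtime_revision_not_found")
        target = matches[0]
    restored = {
        "revision": current["revision"] + 1,  # rollback appends, it never rewinds
        "setting": "rollback",
        "rollback_of_revision": target["revision"],
        "config": dict(target["config"]),
    }
    revisions.append(restored)
    return restored
```

Because rollback appends a new revision instead of truncating history, the revision counter stays monotonic and the rollback itself can later be rolled back.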
Hook definitions inside GET /v1/runtime and GET /v1/runtime/revisions are redacted because these are runtime summary surfaces. Use the admin-only GET /v1/runtime/hooks endpoint to inspect the current raw hook settings before making hook changes.
MCP snapshot
When MCP is enabled, mcp includes:
- config_path: Codex-compatible MCP config path when one was loaded.
- selected_profiles: built-in MCP catalog profiles selected through --mcp-profile or KHEISH_MCP_PROFILES.
- servers: per-server snapshots.
- tool_names: daemon-global MCP helper and qualified MCP tool names before session/persona filtering.
Each per-server snapshot includes:
- server
- source: codex_config or built_in_catalog
- profiles: built-in profile names that selected the server when it came from the catalog
- catalog_entry_id: built-in catalog entry id when applicable
- transport
- uses_credentials
- credential_secret_refs: daemon auth-store refs used by the server, without secret values
- connected
- tools
- error
- instructions: truncated, warning-wrapped server-provided advisory text. Treat it as untrusted data, not an operator instruction.
Learning automation policy
GET /v1/runtime/learning-policy returns the daemon-owned LearningAutomationPolicyConfig.
POST /v1/runtime/learning-policy replaces the full learning automation policy, appends a durable runtime-config revision, and accepts optional expected_revision.
Wrapped payload:
expected_revision at the top level. A payload containing only expected_revision is rejected; use the wrapped policy object for an explicit reset.
Current top-level fields:
- mode
- capture
- publication
- judge
Supported mode values:
- manual_only
- shadow
- enabled
Defaults:
- mode defaults to shadow
- capture.run_summary_candidates defaults to true
- capture.semantic_candidates.enabled defaults to false
- capture.semantic_candidates.max_candidates_per_run defaults to 2
- publication.default_action defaults to manual_review
- publication.allow_api_origin_active_publication defaults to false
- judge.enabled defaults to false
Evidence note
- Code verified: crates/kheish-mcp/src/manager.rs, crates/kheish-daemon/src/state.rs, crates/kheish-daemon/src/api/types.rs, crates/kheish-daemon/src/api/handlers.rs, crates/kheish-auth/src/types.rs, crates/kheish-auth/src/backends/mcp_oauth.rs.
- CLI verified: runtime get, runtime auth accounts list/get/refresh/revoke, and mcp oauth status/login/refresh/logout.
- Daemon live tested: yes, using a fresh daemon with --mcp-profile docs and the generic MCP OAuth true-binary protocol harness.
- Provider-specific tested: no provider-specific model behavior is required for this control-plane snapshot.
Mutation semantics:
- POST /v1/runtime/learning-policy replaces the full policy
- send mode explicitly when mutating policy, because omitting that field is not the same thing as applying the effective runtime default
capture fields:
- run_summary_candidates
- semantic_candidates
semantic_candidates currently contains:
- enabled
- model
- timeout_ms
- max_candidates_per_run
Validation:
- timeout_ms must be greater than zero when provided
- max_candidates_per_run must be between 1 and 8
- semantic capture rejects secret-like fact, preference, and decision content before candidate persistence
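The two numeric rules for semantic_candidates can be checked client-side before a policy is posted. This sketch is illustrative: the function name and error strings are hypothetical, and an omitted max_candidates_per_run is treated as the documented default of 2.

```python
def validate_semantic_candidates(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    timeout_ms = cfg.get("timeout_ms")
    if timeout_ms is not None and timeout_ms <= 0:
        errors.append("timeout_ms must be greater than zero when provided")
    max_candidates = cfg.get("max_candidates_per_run", 2)
    if not 1 <= max_candidates <= 8:
        errors.append("max_candidates_per_run must be between 1 and 8")
    return errors
```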
publication fields:
- default_action
- allow_api_origin_active_publication
- quarantined_rule_names
- rules
Each rule can set:
- name
- scope_kind
- scope_id
- kind
- sensitivity
- min_confidence
- require_evidence
- require_source_run
- require_source_session
- action
- expires_after_ms
Supported actions:
- manual_review
- reject
- publish_provisional
- publish_active
Publication constraints:
- publication.default_action cannot be publish_active
- a publish_active rule must declare an explicit kind
- publish_active is not supported for procedure learnings
- automatic active publication escalates to manual_review when the candidate conflicts with an active same-scope, same-kind learning with the same obvious subject
- duplicate same-scope, same-kind prompt-visible learnings are reused instead of republished under a new learning id
- quarantined_rule_names must be non-empty and unique
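The static publication rules can be expressed as a pre-flight check. This is a sketch under stated assumptions: the function name and error strings are hypothetical, and the quarantined_rule_names rule is read here as "each listed name is a non-empty string and no name repeats", which is one interpretation of the text.

```python
def validate_publication(publication: dict) -> list:
    """Return a list of problems found in a publication policy object."""
    errors = []
    if publication.get("default_action") == "publish_active":
        errors.append("default_action cannot be publish_active")
    for rule in publication.get("rules", []):
        if rule.get("action") != "publish_active":
            continue
        kind = rule.get("kind")
        if kind is None:
            errors.append(f"rule {rule.get('name')!r}: publish_active needs an explicit kind")
        elif kind == "procedure":
            errors.append(f"rule {rule.get('name')!r}: publish_active is not supported for procedure")
    names = publication.get("quarantined_rule_names", [])
    # Interpretation: every listed name is non-empty and appears only once.
    if any(not name for name in names) or len(set(names)) != len(names):
        errors.append("quarantined_rule_names entries must be non-empty and unique")
    return errors
```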
- require_evidence, require_source_run, and require_source_session only count as trusted rule inputs for daemon-owned candidates
- this trusted-input rule only affects rule matching; API-created candidates can still auto-publish when another rule matches, but automatic active publication is still subject to allow_api_origin_active_publication and daemon-owned verification
active rule:
- API-origin candidates are downgraded from automatic publish_active to publish_provisional unless allow_api_origin_active_publication=true
- publish_active still requires daemon-owned verification before prompt visibility
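The API-origin downgrade can be written as a single decision function. In this sketch the function name and the origin labels ("api", "daemon") are hypothetical, and the later daemon-owned verification step is not modelled.

```python
def effective_action(
    rule_action: str,
    origin: str,
    allow_api_origin_active_publication: bool = False,
) -> str:
    """Downgrade automatic publish_active for API-origin candidates."""
    if (
        rule_action == "publish_active"
        and origin == "api"
        and not allow_api_origin_active_publication
    ):
        return "publish_provisional"
    return rule_action
```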
judge fields:
- enabled
- model
- timeout_ms
manual_review in enabled mode when execution fails.
Example:
Run memory policy
GET /v1/runtime/run-memory-policy returns the daemon-owned recovered run-memory policy.
POST /v1/runtime/run-memory-policy replaces the full recovered run-memory policy. The policy is persisted through the runtime-config revision stream, restored on daemon restart, and immediately reapplies retention/overflow pruning to persisted run-memory records.
New clients should send a wrapped payload with an optional compare-and-swap guard:
expected_revision at the top level. Empty payloads, unknown-only payloads, and payloads containing only expected_revision are rejected instead of resetting the policy to defaults.
Invalid policy limits return 400 with runtime/invalid_run_memory_policy.
The runtime-config revision commit is authoritative. If follow-up pruning or index maintenance fails after a successful commit, the endpoint still returns the committed runtime snapshot and records the maintenance failure under /v1/status.run_memory.maintenance.
Current fields:
- enabled
- retention_ms
- max_tracked_per_session
- max_prompt_entries
- redact_pii
- search_visibility
Defaults:
- enabled=true
- retention_ms=2592000000
- max_tracked_per_session=32
- max_prompt_entries=3
- redact_pii=true
- search_visibility=session_only
Validation:
- when enabled=true, retention_ms, max_tracked_per_session, and max_prompt_entries must be greater than zero
- max_prompt_entries must not exceed max_tracked_per_session
Behavior:
- disabled policy prevents new run-memory records from being stored and removes existing records as runs complete
- retention_ms is enforced during boot rebuild, policy changes, storage, memory-context projection, memory-search, and prompt recovery
- max_tracked_per_session controls durable per-session overflow pruning
- max_prompt_entries controls the candidate set considered for prompt injection before final model-budget packing
- search_visibility=session_only limits recovered-run memory-search results to the requested session
- search_visibility=learning_scopes opts recovered-run memory-search into the session’s visible learning scopes
- /v1/status.run_memory.metrics.prompt_limit_omitted_total includes both daemon-side max_prompt_entries omissions and final runtime prompt-budget omissions
- /v1/status.run_memory.metrics.injected_total counts recovered-memory entries kept after final runtime prompt packing, not merely candidate entries attached to run metadata
- /v1/status.run_memory.maintenance exposes the last bounded startup or runtime-policy maintenance report, including prune counts, scan/prune error counts, and bounded diagnostics
- redact_pii=true scrubs common PII and secret/token shapes before run-memory records are persisted
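The limit rules for the run-memory policy can be checked before posting a full policy object. This is an illustrative sketch: the names are hypothetical, the default values are the ones listed above (2592000000 ms is 30 days), and applying the max_prompt_entries ceiling regardless of enabled is an assumption.

```python
DEFAULT_RUN_MEMORY_POLICY = {
    "enabled": True,
    "retention_ms": 2_592_000_000,  # 30 days
    "max_tracked_per_session": 32,
    "max_prompt_entries": 3,
    "redact_pii": True,
    "search_visibility": "session_only",
}


def validate_run_memory_policy(policy: dict) -> list:
    """Return a list of problems for a full policy object; empty means valid."""
    errors = []
    if policy["enabled"]:
        for field in ("retention_ms", "max_tracked_per_session", "max_prompt_entries"):
            if policy[field] <= 0:
                errors.append(f"{field} must be greater than zero when enabled")
    # Assumption: this ceiling is checked whether or not the policy is enabled.
    if policy["max_prompt_entries"] > policy["max_tracked_per_session"]:
        errors.append("max_prompt_entries must not exceed max_tracked_per_session")
    return errors
```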
Change the default route
POST /v1/runtime/model changes the daemon default route. It does not rewrite session route policies, queued runs, active runs, or already suspended runs.
Request body:
- provider: optional route identifier such as openai, anthropic, or openrouter
- model: required backend model string
Route semantics:
- On a named-route daemon, provider is the daemon route id.
- The concrete model stays in model.
The provider field means different things on different surfaces:
- request provider selects a configured route id
- secret-slot provider names the underlying auth/backend family
Secret slots
The runtime secret surface stores daemon-managed auth material. Read endpoints return AuthSlotStatus, not raw secret values.
AuthSlotStatus fields:
- slot_id
- provider
- mode
- summary
- updated_at_ms
- details: backend-specific redacted metadata, such as expiry, source, issuer, resource, scopes, or last refresh outcome
POST /v1/runtime/secrets
This endpoint accepts a full AuthSlotRecord:
- slot_id
- provider
- mode
- state
- updated_at_ms
Generic opaque secret example
Useful for connector secret references and MCP token slots such as mcp.linear.LINEAR_API_KEY.
KHEISH_AUTH_STORE_MASTER_KEY or KHEISH_AUTH_STORE_MASTER_KEY_FILE. MCP and connector token slots should use provider: "generic" with mode: "opaque_secret" unless you are writing a provider route key such as OpenAI or Anthropic.
For MCP, a stored secret is useful only when a loaded built-in catalog entry or explicit MCP config references that mcp.* slot. Built-in catalog slots can be inspected with mcp auth slots <entry-id>, and explicit MCP config can reference slots through bearer_token_secret_ref, http_header_secret_refs, or env_secret_refs.
MCP inventory is loaded at daemon startup. After writing or rotating a secret used by an MCP server, restart the daemon so that server reconnects with the new value.
OpenAI API key example
- Anthropic: {"kind":"api_key","api_key":"..."}
- Google: {"kind":"api_key","api_key":"..."}
- OpenRouter: {"kind":"api_key","api_key":"..."}
- xAI: {"kind":"api_key","api_key":"..."}
state is backend-specific and more verbose than simple API-key records.
OAuth account endpoints
These endpoints expose redacted account status and write-only MCP OAuth import. They never return tokens.
- GET /v1/runtime/auth/accounts: lists only slots where mode is oauth_account.
- GET /v1/runtime/auth/accounts/{slot_id}: returns one OAuth account status and rejects non-OAuth slots.
- POST /v1/runtime/auth/accounts/{slot_id}/refresh: forces one backend refresh and returns redacted status. The endpoint rejects non-OAuth slots.
- POST /v1/runtime/auth/accounts/{slot_id}/revoke and DELETE /v1/runtime/auth/accounts/{slot_id}: delete the local OAuth account after normal dependency checks.
- POST /v1/runtime/auth/accounts/mcp-oauth: stores one completed MCP OAuth authorization-code login. This is the endpoint used by kheish-daemon mcp oauth login.
- POST /v1/runtime/auth/accounts: compatibility alias for the same MCP OAuth import body. Prefer /v1/runtime/auth/accounts/mcp-oauth in new clients.
MCP OAuth import validation:
- slot ids must use the mcp. namespace
- resource, issuer, authorization endpoint, and token endpoint must use https, except loopback http test URLs
- empty core fields are rejected
- refresh responses cannot add scopes that were not already approved
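These import rules can be approximated with a few standard-library checks. This is a sketch, not the daemon's validator: the function names are hypothetical, and the set of hosts treated as loopback (localhost, 127.0.0.1, ::1) is an assumption.

```python
from urllib.parse import urlsplit

# Assumption: these hosts count as loopback for http test URLs.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def slot_id_ok(slot_id: str) -> bool:
    """Slot ids must live in the mcp. namespace and have a non-empty remainder."""
    return slot_id.startswith("mcp.") and len(slot_id) > len("mcp.")


def url_allowed(url: str) -> bool:
    """Require https, except plain http to a loopback host."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return bool(parts.hostname)
    if parts.scheme == "http":
        return parts.hostname in LOOPBACK_HOSTS
    return False


def refresh_scopes_ok(approved: list, refreshed: list) -> bool:
    """A refresh response may keep or drop scopes but must not add new ones."""
    return set(refreshed) <= set(approved)
```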
- DELETE /v1/runtime/secrets/{secret_ref} returns {"accepted": true} when the slot was removed.
- Deletion returns 409 Conflict when a runtime connector or loaded MCP server still references that slot.
Brokered runtime auth
Kheish exposes operator-facing inspection for the broker that resolves auth-backed route and connector access at execution time.
Subject endpoints
- GET /v1/runtime/auth/subjects/{subject_id}
- POST /v1/runtime/auth/subjects/{subject_id}/revoke
AuthSubjectStatus fields:
- subject_id
- current_epoch
- revoked
- active_connector_lease_ids
- active_route_lease_ids
- active_mcp_lease_ids
Subject ids take these forms:
- session:{session_id}
- agent:{agent_id}
- connector:{connector_name}
- daemon
Lease endpoints
- GET /v1/runtime/auth/leases/{lease_id}
- POST /v1/runtime/auth/leases/{lease_id}/revoke
- POST /v1/runtime/auth/slots/{slot_id}/revoke
CredentialLeaseStatus fields:
- lease
- revoked
- active
The lease payload includes:
- id
- grant_id
- subject_id
- subject_epoch
- audience
- issued_at_ms
- expires_at_ms
The lease payload does not expose token_digest; it remains only in broker state for validation.
Current lease audiences are:
- {"type":"route","route_id":"openai","slot_id":"openai.prod"}
- {"type":"connector","connector":"slack-prod","env_keys":["BOT_TOKEN"]}
- {"type":"mcp_server","server":"acme","slot_id":"mcp.oauth.acme","scopes_hash":"..."}
Permission mode
POST /v1/runtime/permission-mode replaces the daemon-wide permission mode.
Request body:
Supported modes:
- default
- acceptEdits
- bypassPermissions
- plan
- dontAsk
System prompt settings
POST /v1/runtime/system-prompt replaces the current SystemPromptSettings.
Fields:
- override_prompt
- custom_prompt
- append_prompt
- language
- output_style
Hooks
GET /v1/runtime/hooks returns the current raw HookSettings. When bearer auth is enabled, this endpoint requires an admin token. Use GET /v1/runtime or GET /v1/status for read-only runtime summaries with hook executor bodies redacted.
POST /v1/runtime/hooks replaces the full hook map and appends a durable runtime-config revision. The endpoint accepts either a legacy bare HookSettings object or a wrapped request:
expected_revision is optional. skip_hooks defaults to false and is intended only for operator recovery from a bad config_change hook revision.
A payload containing only expected_revision or skip_hooks is rejected; use the wrapped settings object for an explicit hook reset.
Hook settings are validated before persistence. Invalid hook names, empty executor inputs, zero or excessive timeouts, excessive retries, unsafe HTTP targets, and invalid agent turn limits return application/problem+json with domain: "runtime" and code: "invalid_hook_settings". Rejected hook updates do not change the active runtime revision.
Minimal example:
GET /v1/runtime/hooks/dead-letter returns the latest redacted hook dead-letter records for operator inspection. When bearer auth is enabled, this endpoint requires an admin token. The underlying store is pruned by count and bytes. POST /v1/runtime/hooks/dead-letter/{id}/resolve accepts a JSON body such as {"reason":"investigated"}, appends a redacted operator-resolution ledger entry, and returns the resolved record view. The same subsystem is summarized under GET /v1/status as status.hooks, including configured count, historical and unresolved dead-letter counts, last unresolved hook, retry/failure counters, and dead-letter persistence failures.
Tool runtime limits
GET /v1/runtime/tool-limits returns the current ToolRuntimeLimits.
POST /v1/runtime/tool-limits replaces the full tool-limit object and appends a durable runtime-config revision. The mutation affects future tool batches; an already running batch keeps the snapshot it started with.
Example:
Invalid limits return 400 with runtime/invalid_tool_runtime_limits.
Debug capture
POST /v1/runtime/debug-level replaces the daemon-wide debug capture level.
Example:
Supported levels:
- off
- on
- redacted
- full
full is daemon-global and should only be enabled on isolated instances.
At redacted, audio transcription attachment blocks inside model/provider debug payloads are
replaced with digest/size summaries instead of raw transcript text.
Debug artifacts are persisted under the daemon state root and are protected by the debug store:
- retained artifact summaries expose plaintext byte counts and SHA-256 checksums
- artifacts are truncated when they exceed KHEISH_DEBUG_MAX_ARTIFACT_BYTES
- run artifact bodies are bounded by KHEISH_DEBUG_MAX_RUN_BYTES and KHEISH_DEBUG_MAX_ARTIFACTS_PER_RUN
- KHEISH_DEBUG_MAX_STORE_BYTES can cap the whole debug store by pruning the oldest terminal or orphaned bundles
- KHEISH_DEBUG_TTL_MS applies automatic retention to stale terminal and orphaned debug bundles; known non-terminal runs are protected so resumable runs keep their evidence
- KHEISH_DEBUG_GC_INTERVAL_MS controls the periodic retention worker interval
- artifact bodies and manifests are synced through atomic temp-file replacement before they become visible to the debug API
Set KHEISH_DEBUG_CAPTURE_KEY or KHEISH_DEBUG_CAPTURE_KEY_FILE to a 32-byte key to encrypt
debug artifact bodies at rest. The debug API and CLI decrypt artifacts when the same key is
available. The two key environment variables are mutually exclusive. Encrypted artifacts include a
non-secret key id so operators can identify key mismatches. If a configured key is invalid, enabling
any non-off debug level is rejected with 400 application/problem+json and code
debug_capture_key_invalid so the daemon does not silently lose evidence.
Set KHEISH_DEBUG_REDACT_TOKENS or KHEISH_DEBUG_REDACT_TOKENS_FILE for comma/newline-separated
literal tokens that should be scrubbed from debug artifacts in addition to the built-in credential
redaction rules. If the token file cannot be read, redacted/full capture enablement is rejected;
if the file disappears while capture is already enabled, affected text payloads fail closed to a
redaction marker instead of being persisted raw.
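Parsing a comma/newline-separated token list and scrubbing literal matches can be sketched as follows. This is an illustration only: the function names and the "[REDACTED]" marker are hypothetical, trimming whitespace around tokens is an assumption, and the built-in credential redaction rules are not modelled.

```python
import re


def parse_redact_tokens(raw: str) -> list:
    """Split on commas and newlines, dropping empty entries."""
    return [token.strip() for token in re.split(r"[,\n]", raw) if token.strip()]


def scrub(text: str, tokens: list, marker: str = "[REDACTED]") -> str:
    """Replace every literal token occurrence with the marker."""
    # Longest first, so a token that contains another token is replaced whole.
    for token in sorted(tokens, key=len, reverse=True):
        text = text.replace(token, marker)
    return text
```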
Some media/debug providers can emit repeated artifacts without a turn/attempt number. In that case
the first artifact keeps the base id and later artifacts receive stable timestamp suffixes; discover
the full list from GET /v1/runs/{run_id}/debug before fetching artifact bodies.
Runtime connectors
Detailed connector payloads and ingress behavior are documented in Connectors API. The important runtime behavior is:
- GET /v1/runtime/connectors returns the full connector inventory
- GET /v1/runtime/connectors/{kind}/{name} returns one projected connector view
- PUT /v1/runtime/connectors/{kind}/{name} creates or updates one daemon-managed connector
- DELETE /v1/runtime/connectors/{kind}/{name} removes one daemon-managed connector after dependency checks
Supported kinds:
- external
- telegram
- slack
- http
Update semantics:
- The connector PUT surface behaves like a field-aware upsert.
- Fields present in the JSON payload are applied.
- Fields omitted from the payload keep their current stored value.
PUT here is not a blind full-record replacement.
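The field-aware upsert can be pictured as a merge of the payload over the stored record. This sketch is a simplification: the function name and the field names in the usage note are hypothetical, the merge is shallow, and the text does not say how nested objects or explicit null values are treated.

```python
from typing import Optional


def upsert_connector(stored: Optional[dict], payload: dict) -> dict:
    """Apply fields present in the payload; keep stored values for omitted fields."""
    merged = dict(stored or {})  # a missing record means the PUT creates one
    merged.update(payload)       # shallow merge: an assumption of this sketch
    return merged
```

For example, sending only {"description": "new"} against a stored record with enabled and description fields changes the description and leaves enabled untouched.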
External connector runtime metrics are exposed at GET /v1/runtime/connectors/external/metrics.
Daemon-wide event stream
GET /v1/events/stream exposes the daemon-wide SSE stream used by control-plane observers.
Event frames carry an SSE id; typed heartbeat keepalives
are id-less and do not advance the reconnect cursor. Reconnect with the standard Last-Event-ID
header or ?cursor=<event_id> to replay buffered events with larger ids.
When both are supplied, the daemon parses both as event ids and uses the larger cursor so an older query string cannot
rewind a browser-managed reconnect. Event ids are seeded from a state-root epoch on startup,
so a reconnect cursor from before a daemon restart cannot mask fresh post-restart events.
The in-memory replay window is bounded; if a cursor is older than retained history or a slow
consumer falls behind the bounded live channel, the stream emits a typed stream_gap event with
skipped, reason, scope, skipped_is_estimate, and optional resume_after_id fields.
resume_after_id is present only when the daemon can provide a safe replay cursor; the gap frame
uses the same value as its SSE id only when doing so would advance the current stream cursor.
After a daemon restart, a cursor from a previous event-id epoch also receives stream_gap because
replay history is process-local; in that case skipped is a conservative id-range count and
clients should reconcile through the list/get endpoints before resuming from resume_after_id.
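The rule for combining the Last-Event-ID header with the cursor query parameter amounts to taking the larger id. In this sketch the function name is hypothetical, ids are parsed as decimal integers, and a malformed value simply raises ValueError, since the text does not describe the daemon's error handling for that case.

```python
from typing import Optional


def resume_cursor(last_event_id: Optional[str], cursor_param: Optional[str]) -> Optional[int]:
    """Pick the replay cursor: the larger of the two supplied event ids."""
    supplied = [int(value) for value in (last_event_id, cursor_param) if value is not None]
    return max(supplied) if supplied else None
```

Taking the maximum means a stale ?cursor= left in a bookmarked URL cannot pull a reconnecting browser back behind the id it has already seen.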
Keepalive frames are typed heartbeat events with JSON payload { "type": "heartbeat" }. The
daemon retains bounded scoped-loss metadata for filtered streams; if a filtered client falls
behind beyond that metadata window, it receives a conservative stream_gap instead of a silent
miss. /v1/status.events exposes the current replay window, cursor-gap count, live stream lag
count, event ids serialized as decimal strings, and tail_event_id_cursor as the safe current-tail
cursor for clients that want to connect from the status snapshot point. Tune the replay/live buffer
with --event-history-capacity or KHEISH_EVENT_HISTORY_CAPACITY; the daemon clamps unsafe zero or
excessive capacities to the supported 1..=262144 event range. The CLI streaming commands cap
individual SSE frames at 1 MiB to avoid unbounded buffering; use list/get endpoints when
reconciling very large outputs.
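A client-side reader can follow these rules by advancing its cursor only on frames that carry an id. The sketch below parses already-received SSE text in a simplified way; the function names are hypothetical, and it assumes the event type travels in the SSE event field, which the text does not spell out.

```python
import json
from typing import Optional


def parse_frames(raw: str) -> list:
    """Split SSE text into frames with event, id, and joined data fields."""
    frames = []
    for block in raw.replace("\r\n", "\n").split("\n\n"):
        frame = {"event": "message", "id": None, "data": []}
        seen_field = False
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # blank line or SSE comment
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "data":
                frame["data"].append(value)
                seen_field = True
            elif field in ("event", "id"):
                frame[field] = value
                seen_field = True
        if seen_field:
            frame["data"] = "\n".join(frame["data"])
            frames.append(frame)
    return frames


def advance_cursor(cursor: Optional[int], frame: dict) -> Optional[int]:
    """Id-less frames, such as heartbeats, leave the reconnect cursor unchanged."""
    if frame["id"] is None:
        return cursor
    event_id = int(frame["id"])
    return event_id if cursor is None else max(cursor, event_id)


def is_heartbeat(frame: dict) -> bool:
    """Recognize the typed keepalive payload { "type": "heartbeat" }."""
    try:
        return json.loads(frame["data"]).get("type") == "heartbeat"
    except (ValueError, AttributeError):
        return False
```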
The global stream accepts optional session_id and run_id query filters. Session and run
stream endpoints are filtered views over the same daemon event bus:
- trace
- session_state_changed
- session_snapshot
- output
- run_updated
- session_goal_updated
- interrupted
- runtime_updated
- heartbeat
- stream_gap
GET /v1/runtime.