Sessions and runs API
Sessions are the durable execution containers. Runs are individual execution attempts queued or executed inside those sessions. Channels are separate top-level public conversation resources. When a channel decides that one session should speak publicly, the daemon creates a normal run of kind ChannelDelivery inside that session. Read Channels API for the public conversation surface.
Projects are also separate top-level daemon resources. Starting a project task still creates a normal run inside the assigned session rather than a separate project-local executor. Read Projects API for that coordination layer.
Endpoint inventory
Sessions:
- POST /v1/sessions
- GET /v1/sessions
- GET /v1/sessions/{session_id}
- GET /v1/sessions/{session_id}/memory-context
- GET /v1/sessions/{session_id}/memory-search
- GET /v1/sessions/{session_id}/skills
- POST /v1/sessions/{session_id}/persona
- PUT /v1/sessions/{session_id}/persona
- DELETE /v1/sessions/{session_id}/persona
- POST /v1/sessions/{session_id}/capability-scope
- PUT /v1/sessions/{session_id}/capability-scope
- DELETE /v1/sessions/{session_id}/capability-scope
- POST /v1/sessions/{session_id}/credential-scope
- PUT /v1/sessions/{session_id}/credential-scope
- DELETE /v1/sessions/{session_id}/credential-scope
- GET /v1/sessions/{session_id}/reply-targets
- POST /v1/sessions/{session_id}/reply-targets
- PUT /v1/sessions/{session_id}/reply-targets
- DELETE /v1/sessions/{session_id}/reply-targets
- POST /v1/sessions/{session_id}/route-policy
- PUT /v1/sessions/{session_id}/route-policy
- DELETE /v1/sessions/{session_id}/route-policy
- GET /v1/sessions/{session_id}/events
- GET /v1/sessions/{session_id}/stream
- POST /v1/sessions/{session_id}/input
- POST /v1/sessions/{session_id}/runs
- POST /v1/sessions/{session_id}/interrupt
- POST /v1/sessions/{session_id}/end

Runs:
- GET /v1/runs
- GET /v1/runs/{run_id}
- GET /v1/runs/{run_id}/external-actions
- GET /v1/runs/{run_id}/events
- GET /v1/runs/{run_id}/stream
- POST /v1/runs/{run_id}/cancel
- GET /v1/runs/{run_id}/debug
- GET /v1/runs/{run_id}/debug/artifacts/{artifact_id}
- POST /v1/runs/prune

Deliveries:
- GET /v1/deliveries
- GET /v1/deliveries/dead-letter
- GET /v1/deliveries/{delivery_id}
- POST /v1/deliveries/{delivery_id}/replay
Create or reuse a session
POST /v1/sessions accepts:
- session_id: optional caller-selected identifier
- thread_id: optional provider-side thread id
- persona_id: optional persona bound at creation time
- capability_scope: optional session capability override persisted on the session
- credential_scope: optional session credential override persisted on the session

- The handler currently returns 201 Created.
- If the requested session_id already exists, Kheish reuses that session.
- Caller-selected session_id values must not be empty, "." or ".."; dot-only path segments are reserved because URI clients normalize them before they reach the daemon.
- A reuse request is rejected if the supplied persona_id conflicts with the already bound persona.
- A reuse request is also rejected if the supplied capability_scope conflicts with the already persisted session scope.
- A reuse request is also rejected if the supplied credential_scope conflicts with the already persisted session scope.
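The session_id rule above can be pre-checked on the client before a request is sent. The following is a minimal sketch, not the daemon's own validation; the function names are hypothetical, and the request body only uses the field names listed above.

```python
def is_valid_session_id(session_id: str) -> bool:
    """Client-side pre-check: reject empty ids and the dot-only segments "." and ".."."""
    return session_id not in ("", ".", "..")


def build_create_session_body(session_id=None, persona_id=None):
    """Assemble a POST /v1/sessions body from the optional fields (hypothetical helper)."""
    body = {}
    if session_id is not None:
        if not is_valid_session_id(session_id):
            raise ValueError("session_id must not be empty, '.' or '..'")
        body["session_id"] = session_id
    if persona_id is not None:
        body["persona_id"] = persona_id
    return body
```

Checking locally avoids a round trip for ids that a URI client would normalize away before the daemon ever sees them.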
GET /v1/sessions supports these query parameters:
- persona_id: filter by currently bound persona snapshot
- page=true: return a cursor-paginated envelope instead of the legacy array
- cursor: continue from a previous paginated response
- limit: bound the response size; without page=true this keeps the legacy array shape but still applies the limit and validates limit=0
Session-scoped defaults
The daemon lets you persist session-local defaults that future runs inherit unless they are overridden per request.
Route policy
POST and PUT on /v1/sessions/{session_id}/route-policy both persist the same SessionRoutePolicy shape:
- provider: preferred daemon route id
- generation.model
- generation.fallback_model
- generation.tool_choice
- generation.allow_parallel_tool_calls
- generation.max_output_tokens
- generation.temperature
- generation.response_format

- explicit run override on SubmitInputRequest
- persisted session route_policy
- daemon default_route
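A small sketch of how a client might reason about which route a run ends up on. It assumes the three sources above are listed from highest to lowest precedence, which this page does not state explicitly; resolve_route and its arguments are hypothetical names.

```python
from typing import Optional


def resolve_route(run_provider: Optional[str],
                  session_route_policy: Optional[dict],
                  default_route: str) -> str:
    """Pick the route id, assuming run override > session route_policy > daemon default_route."""
    if run_provider:
        return run_provider
    if session_route_policy and session_route_policy.get("provider"):
        return session_route_policy["provider"]
    return default_route
```

For the route and model a run actually used, the run record itself remains the authority; this sketch only predicts the outcome under the stated assumption.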
- On a named-route daemon, the request provider field carries the route id, not necessarily the underlying provider family.

- request provider selects a configured route id
- backend responses and secret records may still use provider to name the underlying provider family
Session capability scope
POST and PUT on /capability-scope both accept:
- skill_allow
- skill_deny
- mcp_server_allow
- mcp_server_deny
- mcp_tool_allow
- mcp_tool_deny

- persona capability baseline, when the session has a bound persona
- restricted by the persisted session capability scope
Session credential scope
POST and PUT on /credential-scope both accept:
- route_allow
- route_deny
- connector_allow
- connector_deny
- connector_credential_allow
- connector_credential_deny
- mcp_server_allow
- mcp_server_deny
CapabilityScope.
- CapabilityScope decides what the session can see or call
- CredentialScope decides which auth-backed routes, connector credentials, and credentialed MCP surfaces can actually resolve

- entries are trimmed, sorted, and deduplicated before persistence
- * collapses an allow-list or deny-list to that wildcard alone
- if you scope connectors with connector_allow or connector_deny and leave connector_credential_allow empty, concrete connector credential env keys default to denied

- persona, capability-scope, and credential-scope changes are only allowed while the session is idle for topology mutation
- non-idle credential-scope mutation is returned as 409 Conflict
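The list normalization described above (trim, sort, deduplicate, wildcard collapse) can be sketched as follows. This is an illustration rather than the daemon's implementation; normalize_scope_entries is a hypothetical name, and dropping entries that are empty after trimming is an assumption.

```python
def normalize_scope_entries(entries):
    """Trim, deduplicate, and sort scope entries; collapse to ["*"] when a wildcard is present."""
    # Assumption: entries that are empty after trimming are discarded.
    cleaned = sorted({entry.strip() for entry in entries if entry.strip()})
    if "*" in cleaned:
        return ["*"]
    return cleaned
```

Normalizing on the client before comparing against a persisted scope avoids spurious differences caused only by ordering or whitespace.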
Inspect effective memory and visible skills
Kheish exposes derived session-inspection endpoints that show what the daemon currently considers eligible for the next run before final prompt packing.
Session memory context
GET /v1/sessions/{session_id}/memory-context returns a SessionMemoryContextView.
Optional query parameter:
query: preview learned-context and recovered-memory ordering for a specific pending input without starting a run
- session_id
- effective_capability_scope
- learning_scopes
- learned_context
- recovered_memory
- visible_skills

- which semantic learnings are currently prompt-eligible
- which recovered run memories are currently eligible for recovery
- which skills are currently visible to the session
learned_context or recovered_memory later when it packs the final prompt for one specific input and model budget. A real run with pending input can rank both learned-context entries and recovered run-memory entries against that input before prompt projection.
Current projection rules worth remembering:
- procedure and run_summary learnings stay out of learned_context
- learnings marked sensitivity=sensitive stay out of learned_context and memory-search
- when a query is available and at least one prompt-eligible learning scores above zero, zero-score learnings are omitted from learned_context; if nothing scores, the daemon keeps the recency/scope fallback
- learnings with publish_tier=provisional stay out of learned_context
- learnings with verification_status=failed stay out of learned_context
- automatically published learnings stay out of learned_context until verification_status=verified
- learnings with policy_decision=escalated stay out of learned_context
- promoted procedural skills only appear in visible_skills after their promoted-skill rollout state becomes active
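The learned_context rules above can be read as a filter followed by an optional score cut. The sketch below illustrates that reading only. The record fields id, kind, and auto_published are hypothetical stand-ins, the scores mapping is supplied by the caller, and the recency/scope ordering of the fallback is not modeled.

```python
def is_prompt_eligible(learning: dict) -> bool:
    """Apply the documented exclusions for learned_context to one learning record."""
    if learning.get("kind") in ("procedure", "run_summary"):
        return False
    if learning.get("sensitivity") == "sensitive":
        return False
    if learning.get("publish_tier") == "provisional":
        return False
    if learning.get("verification_status") == "failed":
        return False
    # "auto_published" is a hypothetical flag for automatically published learnings.
    if learning.get("auto_published") and learning.get("verification_status") != "verified":
        return False
    if learning.get("policy_decision") == "escalated":
        return False
    return True


def project_learned_context(learnings, scores=None):
    """Keep eligible learnings; with a query, drop zero-score ones if anything scored."""
    eligible = [item for item in learnings if is_prompt_eligible(item)]
    if scores is not None:
        scored = [item for item in eligible if scores.get(item["id"], 0) > 0]
        if scored:
            return scored
    return eligible  # fallback: nothing scored, or no query was given
```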
Session memory search
GET /v1/sessions/{session_id}/memory-search returns a bounded SessionMemorySearchView.
Supported query parameters:
- query
- limit

- when query is omitted, the daemon returns a recent browse view
- when query is present, the daemon lexically ranks visible learnings, recovered runs, and visible skills; learning scores include deterministic query-term matching against Unicode content/kind/scope fields, and recovered-run scores include deterministic query-term matching against sanitized run-memory summaries/previews
- recovered-run search is session-only by default; set runtime run_memory_policy.search_visibility=learning_scopes to opt in to searching recovered runs visible through the session’s learning scopes
- returned result kinds are learning, recovered_run, and skill
- learnings marked sensitivity=sensitive are excluded from session memory-search
- skills only appear in the search results when a query is present
- omitted limit defaults to 12
- the daemon clamps limit to a maximum of 50
- expired or unreadable recovered-run records are pruned instead of being returned
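The limit handling above amounts to a default plus an upper clamp. A minimal sketch, with hypothetical names, that a client could use to predict the result size; how the daemon treats zero or negative values is not described here and is not modeled.

```python
DEFAULT_SEARCH_LIMIT = 12  # used when limit is omitted
MAX_SEARCH_LIMIT = 50      # upper clamp applied by the daemon


def effective_search_limit(limit=None):
    """Return the limit memory-search would apply for a positive or omitted limit."""
    if limit is None:
        return DEFAULT_SEARCH_LIMIT
    return min(limit, MAX_SEARCH_LIMIT)
```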
- memory-context shows the current eligible automatic projection
- memory-search shows the broader daemon-owned memory browse/search surface visible to that session, but recovered-run results keep a session-local boundary unless run_memory_policy.search_visibility=learning_scopes

- source_id
- title
- excerpt
- score
- timestamp_ms
- scope
- prompt_eligible
- matched_fields
Session-visible skills
GET /v1/sessions/{session_id}/skills returns the skills currently visible to that session.
Supported query parameter:
query
- effective session capability scope
- promoted-skill source-scope visibility
Session reply targets
Structured session reply-target requests are accepted on:
- POST /v1/sessions/{session_id}/reply-targets
- PUT /v1/sessions/{session_id}/reply-targets

Structured variants:
- raw
- external
- telegram
- slack
- http
GET /v1/sessions/{session_id}/reply-targets returns a SessionReplyTargetsView envelope:
The reply_targets array is normalized ReplyHandle data, not the original structured request shape.
Persona binding
Session persona mutation accepts:
- POST /v1/sessions/{session_id}/persona
- PUT /v1/sessions/{session_id}/persona
- DELETE /v1/sessions/{session_id}/persona
Submit work
Kheish intentionally exposes two session submission modes:
- POST /v1/sessions/{session_id}/input executes inline and returns an updated SessionView
- POST /v1/sessions/{session_id}/runs queues detached work and returns 202 Accepted with a RunView

Use /input when the client wants the session view after execution. Use /runs when the client wants an explicit run handle for later polling or streaming.
/input is accepted only while the session has no active or queued run. If work is already in progress, the daemon returns 409 application/problem+json with domain sessions and code session_busy. Use /runs for detached submissions that may queue behind existing work.
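A client that prefers inline execution can fall back to the detached endpoint when the session is busy. The sketch below shows only the decision, with no network calls. It assumes the problem+json body exposes domain and code as top-level members, and the function names are hypothetical.

```python
def is_session_busy(status: int, problem: dict) -> bool:
    """True when a response matches the documented session_busy conflict."""
    return (
        status == 409
        and problem.get("domain") == "sessions"
        and problem.get("code") == "session_busy"
    )


def fallback_path(session_id: str, status: int, problem: dict):
    """Return the /runs path to retry against, or None when no fallback applies."""
    if is_session_busy(status, problem):
        return f"/v1/sessions/{session_id}/runs"
    return None
```

The retry on /runs returns a run handle rather than a session view, so the caller has to poll or stream the run to see the result.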
SubmitInputRequest
Request fields:
- provider: optional run-scoped route override
- source_plugin
- source_kind
- actor_id
- content
- input_items
- attachments
- generation
- completion_requirements
- metadata
- binding_keys
- reply_targets
- reply_plugin
- reply_address

- input_items cannot be combined with compatibility content or attachments
- content defaults to an empty string, so ordered input_items requests can omit it
- content, attachments, or input_items must provide actual input
- binding_keys are durable session-affinity keys remembered by the daemon
- one binding key cannot later be rebound to a different session
- reply_targets accepts either normalized ReplyHandle objects or the same structured variants used by session reply-target defaults
- reply_plugin and reply_address are compatibility fields; prefer reply_targets
file_name and non-empty content_base64.
Generic input/output schema
input_items is the preferred provider-neutral input surface. Supported item variants:
- {"type": "text", "text": "..."}
- {"type": "asset_reference", "asset_id": "asset-1"}
- {"type": "board_reference", "board_id": "board-1", "revision_id": "rev-1"}
- {"type": "inline_asset", "file_name": "note.txt", "media_type": "text/plain", "content_base64": "..."}
DaemonOutputRecord entries. Important fields:
- session_id, run_id, plugin, and address
- content, the plain text fallback
- parts, the structured output parts
- artifacts, the daemon-owned artifact references emitted with the output
- source_kind, one of assistant_text, emit_output, or daemon_emit_output
parts use this shape:
The emit_output tool is for explicit structured output and accepts content, parts, artifact_ids, and include_artifacts_inline. Tool input asset parts are normalized into persisted attachment parts. Event timestamps live on RunEventEntry.timestamp_ms, not on DaemonOutputRecord.
Content plus attachments example
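A minimal sketch of assembling a compatibility-style request body in Python. It assumes each attachment is an object carrying file_name and a non-empty content_base64; the helper name and sample values are hypothetical.

```python
import base64
import json


def build_content_request(content: str, file_name: str, raw_bytes: bytes) -> dict:
    """Build a SubmitInputRequest body with text content and one base64 attachment."""
    if not raw_bytes:
        raise ValueError("attachment content must be non-empty")
    return {
        "content": content,
        "attachments": [
            {
                "file_name": file_name,
                "content_base64": base64.b64encode(raw_bytes).decode("ascii"),
            }
        ],
    }


body = build_content_request("Summarize the attached note.", "note.txt", b"hello")
payload = json.dumps(body)  # JSON text to send as the request body
```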
Ordered multimodal input example
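A minimal sketch of an ordered input_items body, using the item variants listed earlier, plus a client-side check for the documented rule that input_items cannot be mixed with content or attachments. The helper names and ids are hypothetical.

```python
def build_input_items_request(text: str, asset_id: str) -> dict:
    """Build a body whose items are consumed in order: a text item, then an asset reference."""
    return {
        "input_items": [
            {"type": "text", "text": text},
            {"type": "asset_reference", "asset_id": asset_id},
        ]
    }


def check_submit_body(body: dict) -> None:
    """Raise ValueError when the body breaks the documented input rules."""
    has_items = bool(body.get("input_items"))
    has_compat = bool(body.get("content")) or bool(body.get("attachments"))
    if has_items and has_compat:
        raise ValueError("input_items cannot be combined with content or attachments")
    if not has_items and not has_compat:
        raise ValueError("content, attachments, or input_items must provide actual input")
```

Because content defaults to an empty string, the input_items body omits it entirely.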
Completion requirements
Current daemon-enforced completion requirements are explicit and opt-in. Supported shape today:
Inspect session state
GET /v1/sessions/{session_id} returns a SessionView containing:
- session_id
- agent_id
- snapshot
- route_policy
- capability_scope
- effective_capability_scope
- credential_scope
- effective_credential_scope
- persona
- reply_targets
- outputs
GET /v1/sessions/{session_id}/events returns SessionEventLogView:
- session
- daemon_outputs
- run_events
GET /v1/sessions/{session_id}/stream exposes the session SSE stream:
It supports ?cursor=<event_id> and browser Last-Event-ID replay with decimal-string event ids,
emits id-less typed heartbeat keepalives, and emits stream_gap with skipped, reason, scope,
skipped_is_estimate, and optional resume_after_id when the requested cursor is older than the
bounded replay window, predates the current daemon event-id epoch after restart, or the client
falls behind the live channel. Filtered-stream scoped loss metadata is bounded; if the client
falls behind beyond that metadata window, the stream emits a conservative gap. CLI stream
commands abort individual SSE frames above 1 MiB, so use list/get endpoints for bulk output
reconciliation.
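The sketch below shows one way a client could track its resume cursor and record gaps while reading the stream. It parses standard SSE framing from an iterable of text lines, so it runs without a network connection. The event names other than stream_gap are hypothetical, and the stream_gap payload is assumed to be a JSON object in the data field.

```python
import json


def parse_sse_frames(lines):
    """Yield (event_id, event_type, data) for each blank-line-terminated SSE frame."""
    event_id, event_type, data = None, None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if event_type is not None or data:
                yield event_id, event_type or "message", "\n".join(data)
            event_id, event_type, data = None, None, []
            continue
        if line.startswith(":"):
            continue  # SSE comment line
        field, _, value = line.partition(":")
        value = value[1:] if value.startswith(" ") else value
        if field == "id":
            event_id = value
        elif field == "event":
            event_type = value
        elif field == "data":
            data.append(value)


def track_stream(lines):
    """Return (last_event_id, gaps): the cursor to resume from and any stream_gap payloads."""
    last_id, gaps = None, []
    for event_id, event_type, data in parse_sse_frames(lines):
        if event_type == "stream_gap":
            gaps.append(json.loads(data))
        if event_id is not None:
            last_id = event_id  # id-less frames such as heartbeats never move the cursor
    return last_id, gaps
```

When gaps is non-empty, the list and get endpoints are the place to reconcile whatever the stream skipped.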
Session interrupt and end
POST /v1/sessions/{session_id}/interrupt returns:
- interrupted
- snapshot
POST /v1/sessions/{session_id}/end accepts:
Run inspection
GET /v1/runs supports these query parameters:
- session_id
- limit, clamped to at most 100
- priority_active, which sorts active or pending work first when true

Run kinds:
- input
- scheduled_input
- observation_materialization
- scheduled_observation_materialization
- mailbox_delivery
- channel_delivery
- parent_clarification
- approval_resume
- user_question_resume

Run statuses:
- queued
- running
- waiting_for_approval
- waiting_for_user_question
- completed
- failed
- interrupted
- cancelled
Allowed lifecycle transitions are queued -> running,
running -> waiting_for_approval|waiting_for_user_question|completed|failed|interrupted|cancelled,
and waiting runs may resume to running or settle terminally. running -> queued is reserved for
daemon restart recovery of replayable daemon-owned work; ordinary API mutations that attempt to
rewind lifecycle state return 409 application/problem+json with domain runs and code
run_state_conflict.
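The lifecycle rules above can be captured as a small transition table. This sketch models only the transitions named in the text and is not the daemon's state machine; can_transition and the restart_recovery flag are hypothetical names.

```python
TERMINAL = {"completed", "failed", "interrupted", "cancelled"}
WAITING = {"waiting_for_approval", "waiting_for_user_question"}

ALLOWED = {
    "queued": {"running"},
    "running": WAITING | TERMINAL,
    "waiting_for_approval": {"running"} | TERMINAL,
    "waiting_for_user_question": {"running"} | TERMINAL,
}


def can_transition(current: str, target: str, restart_recovery: bool = False) -> bool:
    """Check a status change against the documented lifecycle."""
    if restart_recovery and current == "running" and target == "queued":
        return True  # reserved for daemon restart recovery of replayable work
    return target in ALLOWED.get(current, set())
```

Terminal statuses have no entry in the table, so any attempt to move out of them is rejected, which matches the run_state_conflict behavior for rewinds.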
If restart repair finds more than one wait-capable run in a session, the first durable waiting run
keeps the active slot. Extra waiting runs are requeued only when their payload is daemon-owned and
safe to replay, such as scheduled, mailbox, observation, or parent-clarification work. Extra normal
input runs are interrupted instead of replaying stale user/provider state.
RunView includes:
- run_id
- session_id
- agent_id
- kind
- status
- submitted_at_ms
- updated_at_ms
- started_at_ms
- finished_at_ms
- queued_position
- request
- input_attachments
- input_metadata
- pending_approval_ids
- pending_approvals
- pending_question_ids
- pending_questions
- outputs
- deliveries
- error
RunView.request is a compact RunRequestSummary:
- source_plugin
- source_kind
- actor_id
- text_preview
- provider, the resolved route selector used for the run
- model, the resolved model used for the run
- approval_count
- question_count

Use GET /v1/runs/{run_id} as the source of truth for the route and model actually used by one execution.
Output delivery inspection
Connector-backed output delivery is stored in the daemon delivery queue. Operator views are redacted: they expose the plugin, a target digest, attempts, safe error codes, timestamps, counts, and associated run_id, but not raw reply addresses or payload metadata.
GET /v1/deliveries supports:
- session_id
- run_id
- plugin
- status: pending, retrying, delivered, or dead_lettered
- page, limit, cursor
GET /v1/deliveries/dead-letter is the same list filtered to dead_lettered.
POST /v1/deliveries/{delivery_id}/replay replays one dead-lettered delivery by creating a new pending delivery with a new delivery_id and replayed_from_delivery_id pointing at the original dead letter. The original DLQ record remains as audit history. Replays are idempotent by default; a later non-forced replay returns the existing pending, delivered, or dead-lettered replay. Use force=true only when an intentional second replay is needed.
status.delivery.dead_lettered is historical. status.delivery.unresolved_dead_lettered counts DLQ entries that do not yet have a completed replay and is the count used for the delivery health warning.
RunView.deliveries embeds the same redacted delivery views for the run, so runs get <run_id> shows whether output is still pending, retrying, delivered, or dead-lettered.
Persisted run event variants:
- accepted
- queued
- started
- waiting_for_approval
- approval_resolved
- waiting_for_user_question
- user_question_resolved
- parent_clarification_resolved
- output
- completed
- failed
- interrupted
- cancelled

- accepted, queued, started, completed, interrupted, and cancelled include the current run projection.
- waiting_for_approval includes pending approval ids, and new events also include the full pending approval request payloads under requests for audit. Older persisted events may contain ids only.
- approval_resolved includes the exact resolution payloads that resumed the run, including behavior, optional updated_input, justification, and denial reason.
- waiting_for_user_question includes pending question ids and the run projection. New events also include the full pending structured question payloads under requests for audit. Older persisted events may contain ids only.
- user_question_resolved includes the exact structured answer payload that resumed a normal waiting run.
- parent_clarification_resolved includes the requester child agent/session ids, the request id, decline flag, and exact structured resolution delivered to the child.
- output includes the appended DaemonOutputRecord and the run projection.
- failed includes the error string and the run projection.
Run external actions
GET /v1/runs/{run_id}/external-actions returns the signed external-action audit records attached to that run.
Important fields include:
- action_id
- timestamp_ms
- session_id
- agent_id
- run_id
- tool_call_id
- principal_id
- parent_principal_id
- grant_id
- phase
- kind
- target
- request_digest
- response_digest
- outcome
- prev_hash
- record_hash
- signature_alg
- key_id
- signature
Run debug surfaces
GET /v1/runs/{run_id}/debug returns one RunDebugView:
- run_id
- level
- artifacts

- artifact_id
- format
- updated_at_ms
- bytes: on-disk bytes
- plaintext_bytes: retained UTF-8 payload bytes after any truncation
- sha256: checksum of the retained plaintext bytes
- truncated
- original_bytes: present when the artifact was truncated by the size budget
- encrypted: true when the artifact is encrypted at rest
GET /v1/runs/{run_id}/debug/artifacts/{artifact_id} returns raw UTF-8 text, not a JSON envelope.
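Because the artifact metadata carries plaintext_bytes and sha256 for the retained payload, a client can verify a downloaded artifact body. A minimal sketch, assuming sha256 is a lowercase hex digest of the UTF-8 bytes returned by the artifact endpoint; verify_artifact is a hypothetical helper.

```python
import hashlib


def verify_artifact(meta: dict, body: bytes) -> bool:
    """Compare a downloaded artifact body against its plaintext_bytes and sha256 metadata."""
    if len(body) != meta["plaintext_bytes"]:
        return False
    # Assumption: the sha256 field is a lowercase hex digest.
    return hashlib.sha256(body).hexdigest() == meta["sha256"]
```

For a truncated artifact the check covers the retained bytes only, since original_bytes describes data that is no longer stored.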
Useful artifact ids usually include request, provider event stream, and normalized response records such as:
- turn-0001-attempt-0001-model-request
- turn-0001-attempt-0001-provider-request
- turn-0001-attempt-0001-provider-events
- turn-0001-attempt-0001-model-response
Run debug retention
POST /v1/runs/prune explicitly prunes heavy debug evidence for terminal runs older than older_than_ms.
The daemon also runs the configured debug TTL against stale terminal and orphaned debug bundles on
boot and periodically during serve. Known non-terminal runs are protected so resumable runs keep
their evidence.
This endpoint is intentionally non-destructive for the control-plane audit:
- run records remain available through GET /v1/runs/{run_id}
- run events remain available through GET /v1/runs/{run_id}/events
- only debug bundles returned by GET /v1/runs/{run_id}/debug are cleared
- session journals, deliveries, external-action audit records, and run-memory records are outside this debug-evidence prune

runs prune only removes debug evidence; it is not run-history GC.
Request fields:
- older_than_ms: required positive age threshold
- session_id: optional session scope
- limit: optional positive maximum number of candidate debug bundles to prune or report
- dry_run: when true, reports candidates without deleting debug files

- dry_run
- now_ms
- cutoff_ms
- limit
- matched_run_count: eligible terminal runs with debug bundles before limit is applied
- candidate_run_ids
- candidate_debug_bytes
- pruned_debug_run_ids
- pruned_debug_bytes

Invalid older_than_ms or limit values return 400 application/problem+json with domain runs and code run_retention_invalid_request.
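A client can validate prune parameters before sending them and default to a dry run. This is a sketch with a hypothetical helper name; defaulting dry_run to true is a choice made here for safety, not documented daemon behavior.

```python
def build_prune_request(older_than_ms: int, session_id=None, limit=None, dry_run=True) -> dict:
    """Build a POST /v1/runs/prune body, rejecting values the daemon would answer with 400."""
    if not isinstance(older_than_ms, int) or older_than_ms <= 0:
        raise ValueError("older_than_ms must be a positive integer")
    if limit is not None and (not isinstance(limit, int) or limit <= 0):
        raise ValueError("limit must be a positive integer when provided")
    body = {"older_than_ms": older_than_ms, "dry_run": dry_run}
    if session_id is not None:
        body["session_id"] = session_id
    if limit is not None:
        body["limit"] = limit
    return body
```

Running once with dry_run set and reviewing candidate_run_ids before repeating the call without it keeps the deletion step deliberate.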
Default automatic debug retention is 7 days. Operators can override the debug store at daemon start:
- KHEISH_DEBUG_TTL_MS: automatic debug retention TTL; 0 disables automatic TTL cleanup
- KHEISH_DEBUG_GC_INTERVAL_MS: periodic retention interval
- KHEISH_DEBUG_MAX_ARTIFACT_BYTES: retained plaintext bytes per artifact
- KHEISH_DEBUG_MAX_RUN_BYTES: retained artifact-body bytes per run
- KHEISH_DEBUG_MAX_ARTIFACTS_PER_RUN: retained artifacts per run
- KHEISH_DEBUG_MAX_STORE_BYTES: optional global debug-store cap; periodic maintenance prunes the oldest terminal and orphaned bundles first, while protecting known non-terminal runs

If KHEISH_DEBUG_CAPTURE_KEY or KHEISH_DEBUG_CAPTURE_KEY_FILE is configured but invalid, the
daemon rejects non-off debug capture before runs start so evidence is not silently dropped.
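For tooling that needs to reason about when debug evidence will expire, the TTL setting can be interpreted as below. This sketch only mirrors the documented 7-day default and the meaning of 0; it is not the daemon's parser, and debug_ttl_ms is a hypothetical name.

```python
DEFAULT_DEBUG_TTL_MS = 7 * 24 * 60 * 60 * 1000  # 7 days in milliseconds


def debug_ttl_ms(env: dict):
    """Return the automatic debug TTL in ms, or None when cleanup is disabled with 0."""
    raw = env.get("KHEISH_DEBUG_TTL_MS")
    if raw is None:
        return DEFAULT_DEBUG_TTL_MS
    ttl = int(raw)
    return None if ttl == 0 else ttl
```

Passing os.environ as env reads the live setting; passing a plain dict keeps the function easy to test.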
CLI:
Run events and cancellation
GET /v1/runs/{run_id}/events returns the persisted RunEventEntry[] list, not the SSE envelope format.
GET /v1/runs/{run_id}/stream exposes the run SSE stream:
It uses the same cursor, Last-Event-ID, typed heartbeat, and stream_gap semantics as the daemon-wide stream.
POST /v1/runs/{run_id}/cancel returns the updated RunView. Cancelling an already-cancelled run
is naturally idempotent: the daemon returns the same terminal projection and does not append a
second cancelled lifecycle event.