Observations and derivations API

Kheish exposes a dedicated observation plane for durable captured media and a derivation plane for deterministic secondary artifacts such as canonical text or visual previews.

Endpoint inventory

Observation source management:

GET /v1/observation-sources
POST /v1/observation-sources
GET /v1/observation-sources/{source_id}

Observation upload and listing:

POST /v1/observation-sources/{source_id}/observations
GET /v1/observations
GET /v1/observations/{observation_id}

Observation materialization:

POST /v1/observation-materializations

Derivations:

GET /v1/derivations
POST /v1/derivations
GET /v1/derivations/{derivation_id}

Observation sources

POST /v1/observation-sources accepts:

source_id
display_name
kind
upload_token
sensitivity
retention_seconds
max_active_observations
max_active_bytes
allow_materialization
allow_output_delivery

Defaults:

sensitivity: sensitive
retention_seconds: 604800 (7 days)
max_active_observations: 512
max_active_bytes: 536870912 (512 MiB)
allow_materialization: true
allow_output_delivery: false

source_id is optional:

when you provide it, the daemon uses it as the stable source identifier
when you omit it, the daemon generates one server-side

Example:

{
  "source_id": "screen-main",
  "display_name": "Main screen snapshots",
  "kind": "screen_snapshot",
  "upload_token": "screen-upload-token",
  "sensitivity": "sensitive",
  "retention_seconds": 86400,
  "max_active_observations": 100,
  "max_active_bytes": 104857600,
  "allow_materialization": true,
  "allow_output_delivery": false
}
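
To preview which values apply when a create request omits optional fields, the documented defaults can be merged client-side. This is a minimal sketch: the names SOURCE_DEFAULTS and effective_source_settings are illustrative and not part of the Kheish API, and the daemon remains the authority on the values it actually applies.

```python
from typing import Any

# Defaults as listed in this section. The dict name is illustrative only.
SOURCE_DEFAULTS: dict[str, Any] = {
    "sensitivity": "sensitive",
    "retention_seconds": 7 * 24 * 60 * 60,   # 7 days expressed in seconds
    "max_active_observations": 512,
    "max_active_bytes": 512 * 1024 * 1024,   # 512 MiB expressed in bytes
    "allow_materialization": True,
    "allow_output_delivery": False,
}


def effective_source_settings(payload: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the payload with documented defaults filled in.

    source_id is deliberately not defaulted: when it is omitted,
    the daemon generates one server-side.
    """
    return {**SOURCE_DEFAULTS, **payload}
```

Explicit values in the payload always win over the defaults, so passing "retention_seconds": 86400 keeps one day of retention.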

Important note:

the returned ObservationSourceView does not expose the upload token
callers must keep the token client-side after source creation or rotation

Supported source kinds and accepted media types:

screen_snapshot
- image/png
- image/jpeg
webcam_snapshot
- image/png
- image/jpeg
microphone_segment
- audio/wav
- audio/webm
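
A client can check a media type against the source kind before uploading. The sketch below encodes the table above as a lookup; ACCEPTED_MEDIA_TYPES and is_accepted are hypothetical helper names, and the daemon still performs its own validation.

```python
# Accepted media types per source kind, copied from the list above.
ACCEPTED_MEDIA_TYPES: dict[str, frozenset[str]] = {
    "screen_snapshot": frozenset({"image/png", "image/jpeg"}),
    "webcam_snapshot": frozenset({"image/png", "image/jpeg"}),
    "microphone_segment": frozenset({"audio/wav", "audio/webm"}),
}


def is_accepted(kind: str, media_type: str) -> bool:
    """Return True when the media type is listed for the given source kind."""
    return media_type in ACCEPTED_MEDIA_TYPES.get(kind, frozenset())
```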

GET /v1/observation-sources supports:

query: substring filter over source id or display name

Upload an observation

The upload route intentionally sits outside normal admin authentication and is protected by the source-scoped upload token instead:

POST /v1/observation-sources/{source_id}/observations

Authentication:

send Authorization: Bearer <upload_token>
the token is validated against the source-scoped upload secret

Example:

curl -X POST http://127.0.0.1:4000/v1/observation-sources/screen-main/observations \
  -H 'Authorization: Bearer screen-upload-token' \
  -H 'Content-Type: application/json' \
  -d '{
    "upload": {
      "file_name": "frame-001.png",
      "media_type": "image/png",
      "content_base64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAAB..."
    },
    "idempotency_key": "screen-main:frame-001",
    "captured_at_ms": 1760000000000,
    "stream_id": "call-7",
    "seq_no": 1,
    "canonical_text": "OCR text captured from the screen",
    "metadata": {
      "window": "editor"
    }
  }'

Fields:

upload.file_name
upload.media_type
upload.content_base64
idempotency_key
captured_at_ms
stream_id
seq_no
canonical_text
metadata
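
The curl example above can be mirrored in Python by assembling the URL, headers, and JSON body before handing them to any HTTP client. This is a sketch under assumptions: build_upload_request is a hypothetical helper, and treating every field outside upload as optional is an assumption, since this section does not state which fields are required.

```python
import base64
import json
from typing import Any, Optional


def build_upload_request(
    base_url: str,
    source_id: str,
    upload_token: str,
    file_name: str,
    media_type: str,
    content: bytes,
    idempotency_key: Optional[str] = None,
    **extra: Any,
) -> tuple[str, dict[str, str], bytes]:
    """Return (url, headers, body) for an observation upload."""
    body: dict[str, Any] = {
        "upload": {
            "file_name": file_name,
            "media_type": media_type,
            # Standard base64, matching the content_base64 field in the example.
            "content_base64": base64.b64encode(content).decode("ascii"),
        }
    }
    if idempotency_key is not None:
        body["idempotency_key"] = idempotency_key
    # Remaining documented fields (captured_at_ms, stream_id, seq_no,
    # canonical_text, metadata) are passed through unchanged.
    body.update(extra)
    url = f"{base_url}/v1/observation-sources/{source_id}/observations"
    headers = {
        "Authorization": f"Bearer {upload_token}",
        "Content-Type": "application/json",
    }
    return url, headers, json.dumps(body).encode("utf-8")
```

The returned tuple can be passed to urllib.request.Request(url, data=body, headers=headers, method="POST") or to any other HTTP client.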

Idempotency behavior:

the stable key is (source_id, idempotency_key)
if the same request fingerprint is replayed, the daemon returns the existing observation
if the same key is reused with different payload content, the request is rejected
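
The three rules above can be modeled with a small in-memory store. This is only an illustration of the described behavior: the fingerprint scheme (SHA-256 over sorted-key JSON), the class names, and the generated identifiers are assumptions, not the daemon's implementation.

```python
import hashlib
import json
from typing import Any


class IdempotencyConflict(Exception):
    """Raised when a key is reused with different payload content."""


def request_fingerprint(payload: dict[str, Any]) -> str:
    # Assumption: a fingerprint over canonical JSON. The daemon's real
    # fingerprint algorithm is not documented in this section.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ObservationStore:
    """In-memory model of idempotent uploads keyed by (source_id, idempotency_key)."""

    def __init__(self) -> None:
        self._by_key: dict[tuple[str, str], tuple[str, str]] = {}
        self._next_id = 1

    def upload(self, source_id: str, idempotency_key: str, payload: dict[str, Any]) -> str:
        key = (source_id, idempotency_key)
        fingerprint = request_fingerprint(payload)
        existing = self._by_key.get(key)
        if existing is not None:
            observation_id, stored_fingerprint = existing
            if stored_fingerprint == fingerprint:
                return observation_id  # replay: return the existing observation
            raise IdempotencyConflict(f"key {key!r} reused with different content")
        observation_id = f"observation-{self._next_id}"
        self._next_id += 1
        self._by_key[key] = (observation_id, fingerprint)
        return observation_id
```

Because the key is scoped by source, the same idempotency_key can be used independently under two different sources.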

Observation listing

GET /v1/observations supports:

source_id
stream_id
after_ms
before_ms
include_purged

Filter behavior:

stream_id requires source_id
combining source_id and stream_id narrows the result set to one source stream
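
A client-side helper can enforce the stream_id rule before a request is sent. The sketch below builds the listing path from the documented filters; observations_path is a hypothetical name, and sending booleans as "true" or "false" is an assumption about the query encoding.

```python
from typing import Optional
from urllib.parse import urlencode


def observations_path(
    source_id: Optional[str] = None,
    stream_id: Optional[str] = None,
    after_ms: Optional[int] = None,
    before_ms: Optional[int] = None,
    include_purged: Optional[bool] = None,
) -> str:
    """Build the GET /v1/observations path using only the filters that were set."""
    if stream_id is not None and source_id is None:
        raise ValueError("stream_id requires source_id")
    params: dict[str, str] = {}
    if source_id is not None:
        params["source_id"] = source_id
    if stream_id is not None:
        params["stream_id"] = stream_id
    if after_ms is not None:
        params["after_ms"] = str(after_ms)
    if before_ms is not None:
        params["before_ms"] = str(before_ms)
    if include_purged is not None:
        # Assumption: booleans are encoded as lowercase strings.
        params["include_purged"] = "true" if include_purged else "false"
    query = urlencode(params)
    return f"/v1/observations?{query}" if query else "/v1/observations"
```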

ObservationView includes:

observation_id
source_id
kind
sensitivity
retention_state
asset_id
canonical_text_asset_id
media_type
sha256
byte_length
captured_at_ms
received_at_ms
stream_id
seq_no
idempotency_key
request_fingerprint
metadata

Materialize observations into a run

POST /v1/observation-materializations submits a normal daemon run after augmenting an input request with one observation selection.

Request fields:

target_session_id
selection
request
include_raw_assets
raw_asset_policy
fail_when_empty

Important note:

request is a full SubmitInputRequest
when that nested request uses only input_items, include "content": "" to match the current request shape
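
The empty-content rule is easy to forget when a nested request is built from input_items alone. A minimal sketch of a normalizer follows; with_content_placeholder is a hypothetical helper, and the shape of individual input_items entries is not described in this section, so the helper treats them as opaque.

```python
from typing import Any


def with_content_placeholder(request: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the nested request with "content": "" added when
    the request carries input_items but no content field."""
    normalized = dict(request)
    if "input_items" in normalized and "content" not in normalized:
        normalized["content"] = ""
    return normalized
```

Requests that already set content are returned unchanged.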

Defaults and validation:

include_raw_assets defaults to true
fail_when_empty defaults to true
latest_from_source and latest_from_stream default max_observations to 3
observation_ids must contain at least one identifier
max_observations must be greater than zero

Selection variants

By ids:

{
  "type": "observation_ids",
  "observation_ids": ["observation-1", "observation-2"]
}

Latest from a source:

{
  "type": "latest_from_source",
  "source_id": "screen-main",
  "max_observations": 3,
  "lookback_seconds": 600
}

Latest from a stream:

{
  "type": "latest_from_stream",
  "source_id": "screen-main",
  "stream_id": "call-7",
  "max_observations": 3,
  "lookback_seconds": 600
}
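
The selection defaults and validation rules can be checked client-side before submitting. The sketch below is an assumption-laden mirror of the documented rules: normalize_selection is a hypothetical helper, and applying the max_observations check only to the two latest_from variants is an interpretation, since observation_ids selections carry no such field in the examples.

```python
from typing import Any

DEFAULT_MAX_OBSERVATIONS = 3  # documented default for the latest_from_* variants


def normalize_selection(selection: dict[str, Any]) -> dict[str, Any]:
    """Apply documented defaults and raise ValueError on invalid selections."""
    normalized = dict(selection)
    kind = normalized.get("type")
    if kind == "observation_ids":
        if not normalized.get("observation_ids"):
            raise ValueError("observation_ids must contain at least one identifier")
        return normalized
    if kind in ("latest_from_source", "latest_from_stream"):
        normalized.setdefault("max_observations", DEFAULT_MAX_OBSERVATIONS)
        if normalized["max_observations"] <= 0:
            raise ValueError("max_observations must be greater than zero")
        return normalized
    raise ValueError(f"unknown selection type: {kind!r}")
```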

Example request:

{
  "target_session_id": "incident-review",
  "selection": {
    "type": "latest_from_stream",
    "source_id": "screen-main",
    "stream_id": "call-7",
    "max_observations": 2,
    "lookback_seconds": 300
  },
  "request": {
    "content": "Analyze the latest observations and summarize what changed.",
    "generation": {
      "model": "gpt-5.4",
      "tool_choice": "auto",
      "allow_parallel_tool_calls": true,
      "response_format": {
        "type": "text"
      }
    }
  },
  "raw_asset_policy": "auto",
  "fail_when_empty": true
}

Raw asset behavior:

include_raw_assets is a legacy boolean fallback
raw_asset_policy is the preferred explicit control:
- auto
- never
- always
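
Since raw_asset_policy is the preferred control, a request builder can emit it explicitly and leave the legacy boolean out. This is a sketch: build_materialization is a hypothetical helper, and defaulting the helper to "auto" follows the example request rather than a documented daemon default.

```python
from typing import Any

RAW_ASSET_POLICIES = ("auto", "never", "always")


def build_materialization(
    target_session_id: str,
    selection: dict[str, Any],
    request: dict[str, Any],
    raw_asset_policy: str = "auto",  # assumption: mirrors the example request
    fail_when_empty: bool = True,    # documented default
) -> dict[str, Any]:
    """Assemble a POST /v1/observation-materializations body."""
    if raw_asset_policy not in RAW_ASSET_POLICIES:
        raise ValueError(f"raw_asset_policy must be one of {RAW_ASSET_POLICIES}")
    # include_raw_assets is intentionally omitted in favor of the explicit policy.
    return {
        "target_session_id": target_session_id,
        "selection": selection,
        "request": request,
        "raw_asset_policy": raw_asset_policy,
        "fail_when_empty": fail_when_empty,
    }
```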

Derivations

Derivations are deterministic daemon-owned transforms over assets, observations, or persisted session input.

GET /v1/derivations supports:

query: substring match over derivation id, profile, subject, or result asset id

Create a derivation

POST /v1/derivations accepts:

profile
subject

Supported profiles:

canonical_text
visual_preview

Audio notes:

canonical_text on microphone observations reuses uploader-supplied canonical text when it already exists
otherwise, when the daemon was started with a transcription backend, canonical_text performs daemon-owned speech-to-text and stores the result as one daemon-owned text/plain asset
built-in transcription backends currently include OpenAI and OpenRouter
canonical_text derivation depends on the configured transcription backend, not only on the selected run route
observation-source uploads remain limited to audio/wav and audio/webm, but imported daemon assets and normal session attachments can also derive text from supported formats such as audio/mpeg, audio/mp4, and audio/m4a
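
The precedence described in the audio notes, uploader-supplied text first and daemon transcription second, can be sketched as a small decision function. Everything here is illustrative: resolve_canonical_text and the transcribe callback are hypothetical, and returning None when no backend is configured is a placeholder, because this section does not say what the daemon does in that case.

```python
from typing import Callable, Optional


def resolve_canonical_text(
    uploader_text: Optional[str],
    audio: bytes,
    transcribe: Optional[Callable[[bytes], str]],
) -> Optional[str]:
    """Pick the canonical text for a microphone observation."""
    if uploader_text is not None:
        # Uploader-supplied canonical text is reused as-is.
        return uploader_text
    if transcribe is not None:
        # Otherwise fall back to the configured transcription backend.
        return transcribe(audio)
    # Placeholder outcome: behavior without a backend is not documented here.
    return None
```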

Supported subjects:

asset:

{
  "type": "asset",
  "asset_id": "asset-1"
}

observation:

{
  "type": "observation",
  "observation_id": "observation-1"
}

session input:

{
  "type": "session_input",
  "session_id": "demo",
  "offset": 42
}

Example:

{
  "profile": "visual_preview",
  "subject": {
    "type": "asset",
    "asset_id": "asset-dxf-1"
  }
}
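
The three subject shapes and two profiles lend themselves to small constructor helpers that keep request bodies consistent. The function names below are hypothetical conveniences, not part of the API; the field names and values come from the examples above.

```python
from typing import Any

PROFILES = ("canonical_text", "visual_preview")


def asset_subject(asset_id: str) -> dict[str, Any]:
    return {"type": "asset", "asset_id": asset_id}


def observation_subject(observation_id: str) -> dict[str, Any]:
    return {"type": "observation", "observation_id": observation_id}


def session_input_subject(session_id: str, offset: int) -> dict[str, Any]:
    return {"type": "session_input", "session_id": session_id, "offset": offset}


def derivation_request(profile: str, subject: dict[str, Any]) -> dict[str, Any]:
    """Assemble a POST /v1/derivations body, rejecting unlisted profiles."""
    if profile not in PROFILES:
        raise ValueError(f"profile must be one of {PROFILES}")
    return {"profile": profile, "subject": subject}
```

For instance, derivation_request("visual_preview", asset_subject("asset-dxf-1")) reproduces the example body shown above.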

The returned DerivationView contains:

derivation_id
profile
subject
result_asset_id
reused_subject_asset
created_at_ms
