
Observations and derivations API

Kheish exposes a dedicated observation plane for durable captured media and a derivation plane for deterministic secondary artifacts such as canonical text or visual previews.

Endpoint inventory

Observation source management:
  • GET /v1/observation-sources
  • POST /v1/observation-sources
  • GET /v1/observation-sources/{source_id}
Observation upload and listing:
  • POST /v1/observation-sources/{source_id}/observations
  • GET /v1/observations
  • GET /v1/observations/{observation_id}
Observation materialization:
  • POST /v1/observation-materializations
Derivations:
  • GET /v1/derivations
  • POST /v1/derivations
  • GET /v1/derivations/{derivation_id}

Observation sources

POST /v1/observation-sources accepts:
  • source_id
  • display_name
  • kind
  • upload_token
  • sensitivity
  • retention_seconds
  • max_active_observations
  • max_active_bytes
  • allow_materialization
  • allow_output_delivery
Defaults:
  • sensitivity: sensitive
  • retention_seconds: 604800 (7 days)
  • max_active_observations: 512
  • max_active_bytes: 536870912 (512 MiB)
  • allow_materialization: true
  • allow_output_delivery: false
source_id is optional:
  • when you provide it, the daemon uses it as the stable source identifier
  • when you omit it, the daemon generates one server-side
Example:
{
  "source_id": "screen-main",
  "display_name": "Main screen snapshots",
  "kind": "screen_snapshot",
  "upload_token": "screen-upload-token",
  "sensitivity": "sensitive",
  "retention_seconds": 86400,
  "max_active_observations": 100,
  "max_active_bytes": 104857600,
  "allow_materialization": true,
  "allow_output_delivery": false
}
Important note:
  • the returned ObservationSourceView does not expose the upload token
  • callers must keep the token client-side after source creation or rotation
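The defaults listed above can be mirrored client-side to predict how a source will look once the daemon fills in omitted fields. This is a minimal sketch, not the daemon's implementation: the helper names build_source_request and effective_settings are hypothetical, and treating display_name, kind, and upload_token as required arguments is an assumption.

```python
from typing import Any, Dict, Optional

# Defaults as documented above. The daemon is expected to apply them
# when a field is omitted from the request body.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "sensitivity": "sensitive",
    "retention_seconds": 7 * 24 * 60 * 60,   # 7 days
    "max_active_observations": 512,
    "max_active_bytes": 512 * 1024 * 1024,   # 512 MiB
    "allow_materialization": True,
    "allow_output_delivery": False,
}


def build_source_request(display_name: str, kind: str, upload_token: str,
                         source_id: Optional[str] = None,
                         **overrides: Any) -> Dict[str, Any]:
    """Build a POST /v1/observation-sources body (hypothetical helper)."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body: Dict[str, Any] = {
        "display_name": display_name,
        "kind": kind,
        "upload_token": upload_token,
    }
    # source_id is optional: the daemon generates one when it is omitted.
    if source_id is not None:
        body["source_id"] = source_id
    body.update(overrides)
    return body


def effective_settings(body: Dict[str, Any]) -> Dict[str, Any]:
    """Return the settings expected after documented defaults are applied."""
    return {key: body.get(key, default)
            for key, default in DOCUMENTED_DEFAULTS.items()}
```

Because the returned ObservationSourceView never includes the upload token, a caller using a helper like this should store upload_token itself before sending the request.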
Supported source kinds and accepted media types:
  • screen_snapshot
    • image/png
    • image/jpeg
  • webcam_snapshot
    • image/png
    • image/jpeg
  • microphone_segment
    • audio/wav
    • audio/webm
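A client can check the kind and media type pairing before uploading rather than waiting for the daemon to reject the request. The mapping below copies the list above; the function name check_media_type is a hypothetical helper, and the daemon remains the authority on what is accepted.

```python
# Source kinds and their accepted media types, as listed above.
ACCEPTED_MEDIA_TYPES = {
    "screen_snapshot": {"image/png", "image/jpeg"},
    "webcam_snapshot": {"image/png", "image/jpeg"},
    "microphone_segment": {"audio/wav", "audio/webm"},
}


def check_media_type(kind: str, media_type: str) -> None:
    """Raise ValueError when media_type is not listed for the source kind."""
    accepted = ACCEPTED_MEDIA_TYPES.get(kind)
    if accepted is None:
        raise ValueError(f"unknown source kind: {kind!r}")
    if media_type not in accepted:
        raise ValueError(
            f"{media_type!r} is not accepted for {kind!r}; "
            f"expected one of {sorted(accepted)}"
        )
```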
GET /v1/observation-sources supports:
  • query: substring filter over source id or display name

Upload an observation

The upload route intentionally sits outside normal admin authentication and is authorized per source instead:
  • POST /v1/observation-sources/{source_id}/observations
Authentication:
  • send Authorization: Bearer <upload_token>
  • the token is validated against the source-scoped upload secret
Example:
curl -X POST http://127.0.0.1:4000/v1/observation-sources/screen-main/observations \
  -H 'Authorization: Bearer screen-upload-token' \
  -H 'Content-Type: application/json' \
  -d '{
    "upload": {
      "file_name": "frame-001.png",
      "media_type": "image/png",
      "content_base64": "iVBORw0KGgoAAAANSUhEUgAAAAEAAAAB..."
    },
    "idempotency_key": "screen-main:frame-001",
    "captured_at_ms": 1760000000000,
    "stream_id": "call-7",
    "seq_no": 1,
    "canonical_text": "OCR text captured from the screen",
    "metadata": {
      "window": "editor"
    }
  }'
Fields:
  • upload.file_name
  • upload.media_type
  • upload.content_base64
  • idempotency_key
  • captured_at_ms
  • stream_id
  • seq_no
  • canonical_text
  • metadata
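The same upload can be prepared from Python. The sketch below only builds the request object and does not send it; build_upload_request is a hypothetical helper, and omitting every field other than upload and idempotency_key when it is None is an assumption, since the page does not state which fields are required.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, Optional
from urllib.parse import quote


def build_upload_request(base_url: str, source_id: str, upload_token: str,
                         file_name: str, media_type: str, content: bytes,
                         idempotency_key: str,
                         captured_at_ms: Optional[int] = None,
                         stream_id: Optional[str] = None,
                         seq_no: Optional[int] = None,
                         canonical_text: Optional[str] = None,
                         metadata: Optional[Dict[str, Any]] = None,
                         ) -> urllib.request.Request:
    """Build (but do not send) an observation upload request."""
    body: Dict[str, Any] = {
        "upload": {
            "file_name": file_name,
            "media_type": media_type,
            # Raw bytes travel as base64 text inside the JSON body.
            "content_base64": base64.b64encode(content).decode("ascii"),
        },
        "idempotency_key": idempotency_key,
    }
    optional = {
        "captured_at_ms": captured_at_ms,
        "stream_id": stream_id,
        "seq_no": seq_no,
        "canonical_text": canonical_text,
        "metadata": metadata,
    }
    # Assumption: fields left as None are simply omitted from the body.
    body.update({k: v for k, v in optional.items() if v is not None})
    url = f"{base_url}/v1/observation-sources/{quote(source_id, safe='')}/observations"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # The source-scoped upload token, not an admin credential.
            "Authorization": f"Bearer {upload_token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to urllib.request.urlopen would perform the upload against a running daemon.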
Idempotency behavior:
  • the stable key is (source_id, idempotency_key)
  • if the same request fingerprint is replayed, the daemon returns the existing observation
  • if the same key is reused with different payload content, the request is rejected
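The three idempotency rules can be modeled with a small in-memory store. This is an illustrative sketch only: the class and exception names are hypothetical, and the fingerprint shown here (SHA-256 over canonical JSON) is an assumption, because the page does not say how the daemon computes request_fingerprint.

```python
import hashlib
import json
from typing import Any, Dict, Tuple


class IdempotencyConflict(Exception):
    """Same (source_id, idempotency_key) reused with different content."""


class ObservationStore:
    def __init__(self) -> None:
        # (source_id, idempotency_key) -> (fingerprint, observation_id)
        self._by_key: Dict[Tuple[str, str], Tuple[str, str]] = {}

    @staticmethod
    def fingerprint(payload: Dict[str, Any]) -> str:
        # Assumption: the real fingerprint algorithm is not documented.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def submit(self, source_id: str, idempotency_key: str,
               payload: Dict[str, Any]) -> str:
        key = (source_id, idempotency_key)
        fp = self.fingerprint(payload)
        existing = self._by_key.get(key)
        if existing is None:
            # First use of this key: record a new observation.
            observation_id = f"observation-{len(self._by_key) + 1}"
            self._by_key[key] = (fp, observation_id)
            return observation_id
        stored_fp, observation_id = existing
        if stored_fp != fp:
            # Same key, different content: reject.
            raise IdempotencyConflict(f"key {key!r} reused with different payload")
        # Exact replay: return the existing observation.
        return observation_id
```

Because the key is scoped by source_id, two sources can use the same idempotency_key without colliding.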

Observation listing

GET /v1/observations supports:
  • source_id
  • stream_id
  • after_ms
  • before_ms
  • include_purged
Filter behavior:
  • stream_id requires source_id
  • combining source_id and stream_id narrows the result set to one source stream
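The filter rule above can be enforced before a request is sent. In this sketch build_observation_query is a hypothetical helper, and encoding include_purged as the strings true and false is an assumption about the query format.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def build_observation_query(source_id: Optional[str] = None,
                            stream_id: Optional[str] = None,
                            after_ms: Optional[int] = None,
                            before_ms: Optional[int] = None,
                            include_purged: Optional[bool] = None) -> str:
    """Return the path and query string for GET /v1/observations."""
    # Documented rule: stream_id is only valid together with source_id.
    if stream_id is not None and source_id is None:
        raise ValueError("stream_id requires source_id")
    params: Dict[str, object] = {}
    if source_id is not None:
        params["source_id"] = source_id
    if stream_id is not None:
        params["stream_id"] = stream_id
    if after_ms is not None:
        params["after_ms"] = after_ms
    if before_ms is not None:
        params["before_ms"] = before_ms
    if include_purged is not None:
        # Assumption: booleans are sent as lowercase strings.
        params["include_purged"] = "true" if include_purged else "false"
    return "/v1/observations" + ("?" + urlencode(params) if params else "")
```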
ObservationView includes:
  • observation_id
  • source_id
  • kind
  • sensitivity
  • retention_state
  • asset_id
  • canonical_text_asset_id
  • media_type
  • sha256
  • byte_length
  • captured_at_ms
  • received_at_ms
  • stream_id
  • seq_no
  • idempotency_key
  • request_fingerprint
  • metadata

Materialize observations into a run

POST /v1/observation-materializations submits a normal daemon run after augmenting an input request with one observation selection.
Request fields:
  • target_session_id
  • selection
  • request
  • include_raw_assets
  • raw_asset_policy
  • fail_when_empty
Important note:
  • request is a full SubmitInputRequest
  • when that nested request uses only input_items, include "content": "" to match the current request shape
Defaults and validation:
  • include_raw_assets defaults to true
  • fail_when_empty defaults to true
  • latest_from_source and latest_from_stream default max_observations to 3
  • observation_ids must contain at least one identifier
  • max_observations must be greater than zero

Selection variants

By ids:
{
  "type": "observation_ids",
  "observation_ids": ["observation-1", "observation-2"]
}
Latest from a source:
{
  "type": "latest_from_source",
  "source_id": "screen-main",
  "max_observations": 3,
  "lookback_seconds": 600
}
Latest from a stream:
{
  "type": "latest_from_stream",
  "source_id": "screen-main",
  "stream_id": "call-7",
  "max_observations": 3,
  "lookback_seconds": 600
}
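The three variants and the documented validation rules can be checked client-side before submitting. The sketch below is a hypothetical helper, normalize_selection, that applies the documented max_observations default of 3; requiring source_id and stream_id to be present is inferred from the examples above rather than stated outright.

```python
from typing import Any, Dict

DEFAULT_MAX_OBSERVATIONS = 3  # documented default for the latest_* variants


def normalize_selection(selection: Dict[str, Any]) -> Dict[str, Any]:
    """Validate a selection and fill in the documented default."""
    kind = selection.get("type")
    normalized = dict(selection)  # never mutate the caller's dict

    if kind == "observation_ids":
        if not selection.get("observation_ids"):
            raise ValueError("observation_ids must contain at least one identifier")
        return normalized

    if kind in ("latest_from_source", "latest_from_stream"):
        if "source_id" not in selection:
            raise ValueError(f"{kind} requires source_id")
        if kind == "latest_from_stream" and "stream_id" not in selection:
            raise ValueError("latest_from_stream requires stream_id")
        normalized.setdefault("max_observations", DEFAULT_MAX_OBSERVATIONS)
        if normalized["max_observations"] <= 0:
            raise ValueError("max_observations must be greater than zero")
        return normalized

    raise ValueError(f"unknown selection type: {kind!r}")
```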
Example request:
{
  "target_session_id": "incident-review",
  "selection": {
    "type": "latest_from_stream",
    "source_id": "screen-main",
    "stream_id": "call-7",
    "max_observations": 2,
    "lookback_seconds": 300
  },
  "request": {
    "content": "Analyze the latest observations and summarize what changed.",
    "generation": {
      "model": "gpt-5.4",
      "tool_choice": "auto",
      "allow_parallel_tool_calls": true,
      "response_format": {
        "type": "text"
      }
    }
  },
  "raw_asset_policy": "auto",
  "fail_when_empty": true
}
Raw asset behavior:
  • include_raw_assets is a legacy boolean fallback
  • raw_asset_policy is the preferred explicit control:
    • auto
    • never
    • always
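A request body that follows these notes can be assembled as sketched here. build_materialization_request is a hypothetical helper: it sends raw_asset_policy and leaves out the legacy include_raw_assets flag, and it adds "content": "" when the nested request carries only input_items. How the daemon reconciles the two raw asset controls when both are present is not described on this page, so the sketch avoids sending both.

```python
from typing import Any, Dict

RAW_ASSET_POLICIES = ("auto", "never", "always")


def build_materialization_request(target_session_id: str,
                                  selection: Dict[str, Any],
                                  request: Dict[str, Any],
                                  raw_asset_policy: str,
                                  fail_when_empty: bool = True,
                                  ) -> Dict[str, Any]:
    """Build a POST /v1/observation-materializations body."""
    if raw_asset_policy not in RAW_ASSET_POLICIES:
        raise ValueError(f"raw_asset_policy must be one of {RAW_ASSET_POLICIES}")
    nested = dict(request)  # copy so the caller's dict is untouched
    # Documented quirk: a request that uses only input_items still
    # needs an empty content string to match the current request shape.
    if "input_items" in nested and "content" not in nested:
        nested["content"] = ""
    return {
        "target_session_id": target_session_id,
        "selection": selection,
        "request": nested,
        "raw_asset_policy": raw_asset_policy,
        "fail_when_empty": fail_when_empty,  # documented default: true
    }
```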

Derivations

Derivations are deterministic daemon-owned transforms over assets, observations, or persisted session input.
GET /v1/derivations supports:
  • query: substring match over derivation id, profile, subject, or result asset id

Create a derivation

POST /v1/derivations accepts:
  • profile
  • subject
Supported profiles:
  • canonical_text
  • visual_preview
Audio notes:
  • canonical_text on microphone observations reuses uploader-supplied canonical text when it already exists
  • otherwise, when the daemon was started with a transcription backend, canonical_text performs daemon-owned speech-to-text and stores the result as one daemon-owned text/plain asset
  • built-in transcription backends currently include OpenAI and OpenRouter
  • canonical_text derivation depends on the configured transcription backend, not only on the selected run route
  • observation-source uploads remain limited to audio/wav and audio/webm, but imported daemon assets and normal session attachments can also derive text from supported formats such as audio/mpeg, audio/mp4, and audio/m4a
Supported subjects:
  • asset:
{
  "type": "asset",
  "asset_id": "asset-1"
}
  • observation:
{
  "type": "observation",
  "observation_id": "observation-1"
}
  • session input:
{
  "type": "session_input",
  "session_id": "demo",
  "offset": 42
}
Example:
{
  "profile": "visual_preview",
  "subject": {
    "type": "asset",
    "asset_id": "asset-dxf-1"
  }
}
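The subject shapes and profiles above lend themselves to small builder functions that keep request bodies consistent. All function names in this sketch are hypothetical; only the JSON field names and profile values come from this page.

```python
from typing import Any, Dict

SUPPORTED_PROFILES = ("canonical_text", "visual_preview")


def asset_subject(asset_id: str) -> Dict[str, Any]:
    return {"type": "asset", "asset_id": asset_id}


def observation_subject(observation_id: str) -> Dict[str, Any]:
    return {"type": "observation", "observation_id": observation_id}


def session_input_subject(session_id: str, offset: int) -> Dict[str, Any]:
    return {"type": "session_input", "session_id": session_id, "offset": offset}


def build_derivation_request(profile: str, subject: Dict[str, Any]) -> Dict[str, Any]:
    """Build a POST /v1/derivations body after checking the profile."""
    if profile not in SUPPORTED_PROFILES:
        raise ValueError(f"profile must be one of {SUPPORTED_PROFILES}")
    return {"profile": profile, "subject": subject}
```

For instance, build_derivation_request("visual_preview", asset_subject("asset-dxf-1")) produces the same body as the example above.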
The returned DerivationView contains:
  • derivation_id
  • profile
  • subject
  • result_asset_id
  • reused_subject_asset
  • created_at_ms