Output routing

Kheish separates internal output persistence from external delivery.

Local output

Every daemon-produced answer is first recorded locally in the daemon’s own output stream and session records. This ensures the output exists even if an external delivery fails.

Rich retained outputs

When a run creates daemon-owned media, the visible reply still needs one final persisted output step. Current pattern:
  1. call one generator such as generate_audio, generate_image, or edit_image
  2. keep the returned daemon asset ids
  3. call emit_output to publish the visible answer and attach those assets
Generated assets become visible inline or deliverable only when the final emit_output call includes asset parts or sets include_artifacts_inline = true.
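The three steps above can be sketched as follows. This is a minimal illustration, not the daemon's tool API: the generator and emit_output are passed in as plain callables, and the payload field names (text, parts, type, asset_id) are assumptions made for the example. Only the tool names and the include_artifacts_inline flag come from this page.

```python
from typing import Callable, Dict, List


def reply_with_media(
    generate: Callable[[str], List[str]],
    emit_output: Callable[[Dict], None],
    prompt: str,
    text: str,
) -> Dict:
    """Run one generator, keep its asset ids, then emit one final output."""
    # Step 1 and 2: call a generator (a stand-in for generate_image and
    # friends) and keep the daemon asset ids it returns.
    asset_ids = generate(prompt)

    # Step 3: publish the visible answer and attach the assets as parts.
    # The payload shape is hypothetical. Per the text, setting
    # include_artifacts_inline = true is the alternative to asset parts.
    payload = {
        "text": text,
        "parts": [{"type": "asset", "asset_id": a} for a in asset_ids],
    }
    emit_output(payload)
    return payload
```

Without the final emit_output call, the generated asset ids would exist but nothing would attach them to the visible reply.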

Reply targets

Output routing is driven by reply targets. A run can inherit or override reply targets from:
  • the explicit request
  • the session defaults
  • connector-derived routing
When output is produced, Kheish resolves reply targets in this order:
  1. an explicit output override
  2. the run snapshot
  3. the session defaults
This is why session reply-target edits are prospective only. They affect future work and future fallback delivery, but they do not rewrite reply targets already captured by an existing run or delivery item.
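The resolution order can be pictured as a first-match lookup. The sketch below is illustrative only: the function name, the list-of-strings representation of reply targets, and the use of None for "not set" are assumptions, not Kheish's internal types.

```python
from typing import List, Optional, Sequence


def resolve_reply_targets(
    output_override: Optional[Sequence[str]],
    run_snapshot: Optional[Sequence[str]],
    session_defaults: Optional[Sequence[str]],
) -> List[str]:
    """Return the first source that is set, in the documented order."""
    for candidate in (output_override, run_snapshot, session_defaults):
        # Assumption: None means "not set"; an empty list is treated as
        # an explicit choice to deliver nowhere.
        if candidate is not None:
            return list(candidate)
    return []


# A run that captured a snapshot keeps it, even if the session defaults
# are edited afterwards.
snapshot = ["slack:ops"]
edited_defaults = ["telegram:oncall"]
targets = resolve_reply_targets(None, snapshot, edited_defaults)
```

Because the run snapshot is consulted before the session defaults, an edit to the defaults only matters for runs that have no snapshot of their own.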

Generic output plugins

External delivery is handled through a generic output plugin model. Output plugins receive normalized response envelopes and can deliver them to one or more external systems.

Queued delivery

Connector-backed output delivery is not purely synchronous. Kheish uses a persisted delivery queue with retries, so transient failures do not silently discard outbound responses. The delivery store keeps pending records, completed delivery audit records, and a dead-letter log. Operator API and CLI views of these records are redacted. Deliveries can be inspected and managed with:
  • deliveries list
  • deliveries dead-letter
  • deliveries get <delivery_id>
  • deliveries replay <delivery_id>
  • deliveries resolve <delivery_id>
  • deliveries replay-bulk
  • deliveries reset-backpressure
runs get <run_id> also includes the redacted delivery state associated with that run.

Dead-letter replay creates a new pending delivery and records replayed_from_delivery_id; the original dead-letter entry remains immutable audit history. resolve marks a dead-letter as operator-handled without redelivery. replay-bulk supports dry-run, filters, limits, and resolved-DLQ handling. reset-backpressure removes persisted per-target backpressure by redacted target digest or plugin, then wakes the worker.

On restart, uncommitted .settled-* tombstones are restored to pending unless a completed or dead-letter audit entry was already committed. This avoids losing a delivery in the crash window between pending removal and terminal ledger append.

Status keeps both historical and actionable DLQ counts:
  • dead_lettered: total persisted DLQ history
  • unresolved_dead_lettered: DLQ entries without a completed replay
Health warnings use unresolved_dead_lettered, so a successfully replayed delivery remains in audit history without keeping the daemon unhealthy forever.

Retry timing is controlled by:
  • KHEISH_DELIVERY_INITIAL_RETRY_MS
  • KHEISH_DELIVERY_MAX_RETRY_MS
  • KHEISH_DELIVERY_MAX_RETRY_AFTER_MS
  • KHEISH_DELIVERY_MAX_ATTEMPTS
When a downstream plugin returns a Retry-After delay, the delivery queue honors it but caps it with KHEISH_DELIVERY_MAX_RETRY_AFTER_MS so one hostile or malformed target cannot sleep the queue indefinitely. HTTP and External connector outputs treat 429 as retryable even when the target omits Retry-After; the queue falls back to its normal retry policy.

Delivery is at-least-once. A crash after the downstream target commits but before the local completed ledger is written can replay the same delivery. HTTP and External connectors send Idempotency-Key: kheish:<delivery_id>; other plugins document their own duplicate-suppression limits.

Queue metrics are available at:
  • GET /v1/runtime/deliveries/metrics
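How the four retry settings might interact can be sketched as below. This is an assumption-heavy illustration: the page does not state the backoff curve, so doubling per attempt is a guess, and the parameter values in the example are invented rather than Kheish defaults. The parameters mirror the environment variables named above.

```python
from typing import Optional


def next_delay_ms(
    attempt: int,
    retry_after_ms: Optional[int] = None,
    *,
    initial_ms: int,          # KHEISH_DELIVERY_INITIAL_RETRY_MS
    max_ms: int,              # KHEISH_DELIVERY_MAX_RETRY_MS
    max_retry_after_ms: int,  # KHEISH_DELIVERY_MAX_RETRY_AFTER_MS
) -> int:
    """Delay before the next attempt; attempt is 1-based."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    if retry_after_ms is not None:
        # Honor the downstream hint, but cap it so a hostile or malformed
        # target cannot park the delivery indefinitely.
        return min(max(retry_after_ms, 0), max_retry_after_ms)
    # No hint (for example a 429 without Retry-After): fall back to the
    # normal policy. Exponential doubling is an assumption of this sketch.
    return min(initial_ms * 2 ** (attempt - 1), max_ms)


def should_retry(attempt: int, max_attempts: int) -> bool:
    """max_attempts stands in for KHEISH_DELIVERY_MAX_ATTEMPTS."""
    return attempt < max_attempts


# Hypothetical settings, chosen only for the example.
cfg = dict(initial_ms=500, max_ms=60_000, max_retry_after_ms=300_000)
capped = next_delay_ms(2, retry_after_ms=10**9, **cfg)
```

In this sketch a delivery that reaches the attempt limit stops retrying; in the documented model that is where it would move to the dead-letter log.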
Terminal ledgers are append-only audit files today. Large deployments should monitor status/metrics latency and plan retention or compaction around completed.jsonl, dead-letter.jsonl, and resolved-dead-letter.jsonl.
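Because delivery is at-least-once, a downstream HTTP receiver may see the same delivery twice. One way a receiver could use the Idempotency-Key header to suppress duplicates is sketched below. The class and its in-memory set are hypothetical and describe the receiving side, not Kheish itself; a real receiver would persist seen keys so that they survive its own restarts.

```python
from typing import Dict, List, Set


class DedupingReceiver:
    """Processes each Idempotency-Key at most once (in-memory sketch)."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.processed: List[str] = []

    def handle(self, headers: Dict[str, str], body: str) -> str:
        key = headers.get("Idempotency-Key")
        if key is not None and key in self._seen:
            # Same delivery replayed after a crash window: acknowledge it
            # without repeating the side effect.
            return "duplicate"
        self.processed.append(body)
        if key is not None:
            self._seen.add(key)
        return "processed"


receiver = DedupingReceiver()
headers = {"Idempotency-Key": "kheish:delivery-123"}  # example id, invented
first = receiver.handle(headers, "hello")
second = receiver.handle(headers, "hello")
```

Requests without the header are always processed here, which matches the note that other plugins document their own duplicate-suppression limits.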

Current output surfaces

Today, the daemon can keep output local and route it through queued plugins for:
  • HTTP
  • Slack
  • Telegram
  • External connectors
This is why output routing is an operational subsystem, not a presentation detail. Read Connectors and reply targets for the durable model behind connector-owned routes, session defaults, and run snapshots.