
Providers and routing

Kheish distinguishes between a configured route and the underlying provider driver. A configured route has:
  • a stable route_id such as openai, anthropic, openrouter, or research
  • one underlying provider driver such as openai, anthropic, openrouter, google, or xai
  • one current model
  • coarse capabilities exposed to the daemon, such as multimodal input, native web search, image generation, and image editing
This distinction matters because one daemon can expose multiple named routes that share the same driver. For example, a route named research can use the openrouter driver without being the same route as openrouter.
Kheish currently supports these first-class drivers:
  • anthropic
  • google
  • openai
  • openrouter
  • xai
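As a minimal sketch of the route/driver distinction, the following models a route as a small record. The Route class and its field names are hypothetical illustrations, not the daemon's actual types; only the route ids, driver names, and model identifiers come from this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    # Hypothetical record; the daemon's real types are not shown on this page.
    route_id: str   # stable name used for selection, e.g. "research"
    driver: str     # underlying provider driver, e.g. "openrouter"
    model: str      # the route's current model
    capabilities: frozenset = frozenset()  # coarse capability flags


# Two distinct routes that share one driver.
openrouter = Route("openrouter", "openrouter", "openai/gpt-5.4-mini")
research = Route("research", "openrouter", "anthropic/claude-sonnet-4")

inventory = {route.route_id: route for route in (openrouter, research)}
```

Both entries use the openrouter driver, yet they stay separately addressable because the inventory is keyed by route_id rather than by driver.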

Daemon route inventory

One daemon can expose multiple routes at once. The runtime keeps a daemon-owned route inventory that includes:
  • a default route used when no more specific policy applies
  • one or more named routes such as openai, anthropic, openrouter, or research
  • the current model and underlying driver for each route
  • coarse route capabilities such as multimodal input, native web search, image generation support, and image editing support
This is why one daemon can run:
  • one session on Anthropic
  • another session on OpenAI
  • another session on a custom research route backed by OpenRouter
  • one child sidechain on a cheaper route
  • one image-editing run on one named route while text orchestration remains on another
without changing the daemon process itself.

Routes file

The recommended multi-route startup path is serve --routes-file .... The file format is TOML:
version = 1
default_route = "openrouter"

[routes.openrouter]
driver = "openrouter"
default_model = "openai/gpt-5.4-mini"
model_support = "any"
auth_ref = "openrouter.primary"

[routes.openai]
driver = "openai"
default_model = "gpt-5.4"
auth_ref = "openai.prod"
Populate those refs through the daemon-managed secret store before startup:
export KHEISH_AUTH_STORE_MASTER_KEY="$(./target/debug/kheish-daemon secrets generate)"

./target/debug/kheish-daemon secrets set openrouter.primary \
  --offline \
  --state-root .kheish-daemon-data \
  --provider openrouter \
  --from-env OPENROUTER_API_KEY

./target/debug/kheish-daemon secrets set openai.prod \
  --offline \
  --state-root .kheish-daemon-data \
  --provider openai \
  --from-env OPENAI_API_KEY
Generate the master key (KHEISH_AUTH_STORE_MASTER_KEY) once per persistent state_root and keep reusing it. Replacing it later makes existing encrypted secret slots unreadable.
model_support = "any" is the normal choice for OpenRouter routes because OpenRouter model identifiers are typically vendor-prefixed, such as openai/gpt-5.4-mini or anthropic/claude-sonnet-4.
The repository also ships routes.default.toml at the repository root as a daemon-managed baseline for the built-in anthropic and openai routes.
Rules enforced by the daemon:
  • version must currently be 1
  • each route lives under [routes.<route_id>]
  • route_id must be non-empty, must not contain /, and must not contain whitespace
  • each route must define driver and default_model
  • if multiple routes are configured, default_route must be set in the file
  • serve --default-route ... can override a valid default_route from the file, but it does not currently make a missing multi-route default_route valid
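The file-level rules above can be approximated with a small checker. This is a sketch only: validate_routes_config is a hypothetical helper, it works on an already-parsed dictionary (for example the result of parsing the TOML file), and the error messages are illustrative rather than the daemon's real output.

```python
def validate_routes_config(config: dict) -> list:
    """Return a list of rule violations; an empty list means the config passes."""
    errors = []

    # Rule: version must currently be 1.
    if config.get("version") != 1:
        errors.append("version must currently be 1")

    # Rule: each route lives under [routes.<route_id>].
    routes = config.get("routes") or {}
    for route_id, route in routes.items():
        # Rule: route_id is non-empty and contains neither "/" nor whitespace.
        if not route_id or "/" in route_id or any(ch.isspace() for ch in route_id):
            errors.append(f"invalid route_id: {route_id!r}")
        # Rule: each route defines driver and default_model.
        for field in ("driver", "default_model"):
            if not route.get(field):
                errors.append(f"route {route_id!r} is missing {field}")

    # Rule: with multiple routes, default_route must be set in the file.
    if len(routes) > 1 and not config.get("default_route"):
        errors.append("default_route must be set when multiple routes are configured")

    return errors
```

The sketch does not model the serve --default-route override, because that flag cannot make a missing multi-route default_route valid.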
The supported per-route fields are:
  • core route fields: driver, default_model
  • recommended secret reference: auth_ref
  • direct auth inputs: api_key, api_key_env
  • model compatibility policy: model_support = "family" or model_support = "any"
  • transport: base_url
  • OpenAI-only metadata: organization, organization_env, project, project_env
  • OpenAI-only auth fallback: openai_auth_source, openai_auth_file
  • Anthropic-only auth fallback and request tuning: anthropic_auth_source, anthropic_credentials_file, anthropic_version, anthropic_beta_headers
  • capability overrides: multimodal_input, native_web_search, image_generation, image_edit
Capabilities are route-level data. Two routes that use the same driver can still expose different capability flags.
Route-auth behavior:
  • auth_ref, when present, must point at an existing daemon-managed secret slot
  • the daemon validates all configured auth_ref values at startup
  • runtime get returns auth_ref on default_route and each entry in routes when one is configured, but never the underlying secret bytes
  • one secret slot can be shared by multiple routes without duplicating credentials in the route file
  • OpenRouter routes are API-key-backed routes; they do not use OpenAI account auth imports such as Codex
Route selection and route authorization are related but different:
  • session or run routing decides which route_id should be used
  • the broker still resolves the actual credential material at request time
  • the execution’s effective CredentialScope can therefore deny one route even when the route exists in the daemon inventory and was selected successfully
Use CredentialScope.route_allow and CredentialScope.route_deny when you need to delegate route access explicitly to one session or one child sidechain.
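The difference between selecting a route and being authorized to use it can be illustrated as follows. Only the names CredentialScope, route_allow, and route_deny come from this page. The class shape, the precedence of deny over allow, and the treatment of an empty allow list as "no restriction" are assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class CredentialScope:
    # Hypothetical shape; only the two field names appear in the docs.
    route_allow: set = field(default_factory=set)
    route_deny: set = field(default_factory=set)


def route_permitted(scope: CredentialScope, route_id: str) -> bool:
    # Assumption: an explicit deny always wins.
    if route_id in scope.route_deny:
        return False
    # Assumption: an empty allow list places no restriction on routes.
    return not scope.route_allow or route_id in scope.route_allow


def authorize(inventory: set, scope: CredentialScope, route_id: str) -> str:
    # Selection succeeds only for routes in the daemon inventory ...
    if route_id not in inventory:
        return "unknown route"
    # ... but a selected route can still be refused by the effective scope.
    if not route_permitted(scope, route_id):
        return "denied by credential scope"
    return "ok"
```

A route that exists and was selected successfully can therefore still end in "denied by credential scope".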

Selector grammar

The daemon CLI normalizes route-aware model selectors on these commands:
  • runtime set-model
  • sessions input
  • sessions set-route
  • agents spawn-sidechain
  • schedules create
The selector grammar is:
  • gpt-5.4
  • <route_id>/<model>
The second form is route-aware. The first slash is the separator, so the model part may itself contain additional slashes. This makes selectors such as openrouter/openai/gpt-5.4-mini valid when the configured route id is openrouter.
Important normalization rules:
  • the route prefix is recognized only when the segment before the first / matches a known daemon route id
  • otherwise the full value is treated as a raw model string
  • <route_id>/ without a model suffix is rejected
  • --fallback-model follows the same grammar as --model
  • if --provider and a selector prefix point at different routes, the CLI fails instead of guessing
  • when the selector contains a route prefix, that prefix is removed before the backend request is sent
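The normalization rules above can be sketched as a single function. normalize_selector is a hypothetical helper written for illustration and is not the CLI's real implementation; the provider argument stands in for the --provider flag.

```python
def normalize_selector(selector, known_routes, provider=None):
    """Split a selector into (route_id, model).

    route_id falls back to `provider` (possibly None) when the selector
    carries no recognized route prefix.
    """
    # Only the first "/" separates route from model; the model may contain more.
    prefix, sep, rest = selector.partition("/")

    if sep and prefix in known_routes:
        if not rest:
            # "<route_id>/" without a model suffix is rejected.
            raise ValueError(f"selector {selector!r} has a route prefix but no model")
        if provider is not None and provider != prefix:
            # Conflicting --provider and prefix: fail instead of guessing.
            raise ValueError(
                f"provider {provider!r} conflicts with route prefix {prefix!r}"
            )
        # The prefix is stripped before the backend request is sent.
        return prefix, rest

    # Unknown prefix (or no slash): the whole value is a raw model string.
    return provider, selector
```

With known routes openrouter and openai, the selector openrouter/openai/gpt-5.4-mini splits into route openrouter and model openai/gpt-5.4-mini, while a selector whose first segment is not a known route id is passed through unchanged as a raw model string.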

Route precedence

Kheish resolves the effective route in this order:
  1. explicit run override
  2. persisted session route policy
  3. daemon default route
Once selected, the route is pinned for that run.
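The precedence order amounts to "first configured value wins". A minimal sketch, using a hypothetical resolve_route helper and treating an unset level as None:

```python
def resolve_route(run_override, session_route, daemon_default):
    """Return the effective route id, checking the highest-precedence source first."""
    # 1. explicit run override, 2. persisted session route policy, 3. daemon default.
    for candidate in (run_override, session_route, daemon_default):
        if candidate:
            return candidate
    raise ValueError("no route is configured at any level")
```

The caller would compute this once at run start and reuse the result, which mirrors the pinning behavior described above.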

Session route policy

Sessions can now persist a route policy instead of relying only on per-run overrides. Use this when one session should keep targeting a specific route across later runs, resumes, schedules, or mailbox execution.
Child sidechain sessions do not automatically inherit the parent session’s stored route policy. They only persist a route policy when the spawn request carries explicit route fields or an explicit route_policy. This behavior is intentionally different from session personas: a child sidechain session does inherit the parent session’s bound persona snapshot at spawn time.
The control plane exposes the route policy as a real session resource:
  • POST /v1/sessions/{session_id}/route-policy
  • PUT /v1/sessions/{session_id}/route-policy
  • DELETE /v1/sessions/{session_id}/route-policy
The CLI also exposes sessions set-route.

Run-scoped generation controls

Generation settings can be supplied with individual requests. Common controls include:
  • route id carried in the provider request field
  • model
  • fallback model
  • temperature
  • max output tokens
  • tool choice
  • response format
On a named-route daemon, the provider request field carries the selected route id after normalization. This keeps route selection explicit without forcing every session to change global defaults.

Capability-sensitive routing

Not every multimodal input requires the same model capability.
  • image inputs require a vision-capable route
  • supported document inputs can still execute on non-vision routes because the daemon renders bounded document text for the model
In the current route-capability surface, multimodal_input mainly covers image inputs and document-derived text or previews. Raw audio generation and speech-to-text are separate daemon features.
This distinction matters operationally: a route that is correct for text and PDF summaries may still be wrong for PNG or JPEG inspection.
Route capabilities are also surfaced through GET /v1/runtime, so operators and SDKs can inspect the daemon inventory before submitting work.
Driver defaults today are:
  • Anthropic: multimodal input and native web search, but no daemon image backend
  • Google: multimodal input plus daemon image generation and image editing, but no native web_search
  • OpenAI: multimodal input, native web search, daemon image generation, and daemon image editing
  • OpenRouter: multimodal input plus daemon image generation and image editing, but no native web_search
  • xAI: multimodal input, native web search, and daemon image generation, but no daemon image editing
These are defaults, not a hard-coded public matrix. A routes file can override the exposed capabilities for one named route without changing another route that uses the same driver.
Audio generation is tracked separately from these coarse route flags. generate_audio is surfaced by control-tool availability, not by a dedicated GET /v1/runtime capability field today.
web_search remains one logical tool, but the daemon now prefers a provider-native backend when the effective route supports it. Current behavior:
  • Anthropic routes can use native provider web search
  • Google routes currently do not expose native provider web search through web_search
  • OpenAI routes can use native provider web search
  • OpenRouter routes currently do not expose native provider web search through web_search
  • xAI routes can use native provider web search
  • unsupported routes or unsupported request shapes fall back to the local DuckDuckGo HTML implementation
web_fetch remains daemon-local.
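The driver defaults, per-route overrides, and web_search backend choice described in this section can be sketched together. The table reflects the defaults listed above; the function names and the returned backend labels are hypothetical and chosen only for illustration.

```python
# Driver defaults as listed on this page, keyed by capability override field name.
DRIVER_DEFAULTS = {
    "anthropic":  {"multimodal_input": True, "native_web_search": True,
                   "image_generation": False, "image_edit": False},
    "google":     {"multimodal_input": True, "native_web_search": False,
                   "image_generation": True, "image_edit": True},
    "openai":     {"multimodal_input": True, "native_web_search": True,
                   "image_generation": True, "image_edit": True},
    "openrouter": {"multimodal_input": True, "native_web_search": False,
                   "image_generation": True, "image_edit": True},
    "xai":        {"multimodal_input": True, "native_web_search": True,
                   "image_generation": True, "image_edit": False},
}


def route_capabilities(driver, overrides=None):
    """Start from the driver defaults, then apply this route's own overrides."""
    caps = dict(DRIVER_DEFAULTS[driver])  # copy, so other routes are unaffected
    caps.update(overrides or {})
    return caps


def web_search_backend(caps, request_shape_supported=True):
    """Prefer the provider-native backend; otherwise use the local fallback."""
    if caps["native_web_search"] and request_shape_supported:
        return "provider-native"
    return "local-duckduckgo-html"
```

Because route_capabilities copies the defaults before applying overrides, disabling image_edit on one openrouter-backed route leaves every other route on that driver unchanged.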

Image routing

The image tools are now selected by image route id, not only by provider family. Current behavior:
  • generate_image resolves the backend in this order: explicit tool route override, current run route when that route has an image backend, then the image service default route
  • edit_image resolves the backend in this order: explicit tool route override, current run route when that route supports editing, then the default image route when it supports editing, then the first configured edit-capable backend
  • the tool request field is still called provider for compatibility, but on a named-route daemon it means one configured image route id
  • the tool response field provider still reports the underlying backend provider such as openai or google
  • route-level image_generation and image_edit are independent capabilities; a route can expose generation without exposing editing
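The two resolution orders can be written out as plain functions. This is a sketch under assumptions: the helper names are hypothetical, capable routes are passed in as ordered lists, and an explicit tool route override is returned without further validation, which the real daemon may well perform.

```python
def resolve_generate_image(tool_route, run_route, default_image_route, generation_routes):
    """Pick the route for generate_image."""
    if tool_route:                        # 1. explicit tool route override
        return tool_route
    if run_route in generation_routes:    # 2. current run route, if it has an image backend
        return run_route
    return default_image_route            # 3. image service default route


def resolve_edit_image(tool_route, run_route, default_image_route, edit_routes):
    """Pick the route for edit_image; edit_routes is in configuration order."""
    if tool_route:                          # 1. explicit tool route override
        return tool_route
    if run_route in edit_routes:            # 2. current run route, if it supports editing
        return run_route
    if default_image_route in edit_routes:  # 3. default image route, if it supports editing
        return default_image_route
    # 4. first configured edit-capable backend, if any exists
    return edit_routes[0] if edit_routes else None
```

The extra fallback step in resolve_edit_image exists because generation and editing are independent capabilities, so the default image route may be able to generate without being able to edit.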

Audio routing

generate_audio follows the same route-first pattern as the image tools. Current behavior:
  • generate_audio resolves the backend in this order: explicit tool route override, current run route when that route has an audio backend, then the audio service default route
  • the tool request field is still called provider for compatibility, but on a named-route daemon it means one configured audio route id
  • the tool response field provider reports the underlying backend provider
  • the built-in audio-generation backend today is OpenRouter
  • audio generation currently appears through tool availability rather than one dedicated route capability flag in runtime get

Fallback behavior

Kheish can be configured with a primary route and an optional fallback model on that same resolved route. The routing layer does not pretend that all routes are interchangeable. --fallback-model uses the same selector grammar as --model, but after the primary route is resolved the fallback is validated against that route and stored as a normalized model string.
Changing the daemon default route does not rewrite:
  • persisted session route policies
  • queued runs
  • active runs
  • restored suspended runs

Context-sensitive prompt recovery

Recovered run memory is packed against the prompt budget before provider submission. In practice, Kheish does not append recovered memory blindly: it estimates the current prompt size, reserves output tokens, applies a safety buffer, and then injects only the recovered-memory entries that still fit. Entries that do not fit are omitted rather than forcing a context overflow. Read Recovered run memory for the daemon-side storage and retention model.
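The budget arithmetic behind this packing step can be sketched as follows. Everything here is an assumption made for illustration: the four-characters-per-token estimate, the in-order greedy selection, and the function and parameter names are not taken from the daemon.

```python
def estimate_tokens(text):
    # Crude assumption: roughly four characters per token.
    return max(1, len(text) // 4)


def pack_recovered_memory(entries, context_window, prompt_tokens,
                          reserved_output, safety_buffer):
    """Return the recovered-memory entries that fit in the remaining budget."""
    # Remaining room after the current prompt, reserved output, and safety margin.
    budget = context_window - prompt_tokens - reserved_output - safety_buffer

    kept = []
    for entry in entries:
        cost = estimate_tokens(entry)
        if cost <= budget:
            kept.append(entry)
            budget -= cost
        # Entries that do not fit are skipped instead of overflowing the context.
    return kept
```

For example, with a 1000-token window, a 900-token prompt, 50 reserved output tokens, and a 30-token buffer, only 20 tokens remain, so a 100-token entry is skipped while smaller entries around it are still injected.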

Operational guidance

Use explicit route and model settings in live validation and production-sensitive flows. This is especially useful when:
  • comparing route behavior on the same task
  • validating provider-native web behavior against the local fallback
  • running sidechains with lower-cost or faster models
  • ensuring that one run does not inherit the wrong session or daemon default route unexpectedly