Providers and routing
Kheish distinguishes between a configured route and the underlying provider driver. A configured route has:
- a stable route_id such as openai, anthropic, openrouter, or research
- one underlying provider driver such as openai, anthropic, openrouter, google, or xai
- one current model
- coarse capabilities exposed to the daemon, such as multimodal input, native web search, image generation, and image editing
For example, a route named research can use the openrouter driver without being the same route as openrouter.
Kheish currently supports these first-class drivers:
- anthropic
- google
- openai
- openrouter
- xai
Daemon route inventory
One daemon can expose multiple routes at once. The runtime keeps a daemon-owned route inventory that includes:
- a default route used when no more specific policy applies
- one or more named routes such as openai, anthropic, openrouter, or research
- the current model and underlying driver for each route
- coarse route capabilities such as multimodal input, native web search, image generation support, and image editing support
A single daemon can therefore run, at the same time:
- one session on Anthropic
- another session on OpenAI
- another session on a custom research route backed by OpenRouter
- one child sidechain on a cheaper route
- one image-editing run on one named route while text orchestration remains on another
Routes file
The recommended multi-route startup path is serve --routes-file ....
The file format is TOML.
Choose one state_root and keep reusing it. Replacing it later makes existing encrypted secret slots unreadable.
model_support = "any" is the normal choice for OpenRouter routes because OpenRouter model identifiers are typically vendor-prefixed, such as openai/gpt-5.4-mini or anthropic/claude-sonnet-4.
The repository also ships routes.default.toml at the repository root as a daemon-managed baseline for the built-in anthropic and openai routes.
Rules enforced by the daemon:
- version must currently be 1
- each route lives under [routes.<route_id>]
- route_id must be non-empty, must not contain /, and must not contain whitespace
- each route must define driver and default_model
- if multiple routes are configured, default_route must be set in the file
- serve --default-route ... can override a valid default_route from the file, but it does not currently make a missing multi-route default_route valid
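The rules above can be sketched as a small validator. This is a minimal illustration, not the daemon's implementation: it works on an already-parsed dictionary, the function name validate_routes_config is hypothetical, and the sample file layout in the comment is an assumption built only from the key names mentioned in this section.

```python
# Assumed file shape, shown as TOML for orientation (values are illustrative):
#
#   version = 1
#   default_route = "openai"
#
#   [routes.openai]
#   driver = "openai"
#   default_model = "gpt-5.4"
#
#   [routes.research]
#   driver = "openrouter"
#   default_model = "openai/gpt-5.4-mini"
#   model_support = "any"


def validate_routes_config(config):
    """Return a list of rule violations; an empty list means the config passes.

    Hypothetical helper: it mirrors the documented rules only and does not
    check auth_ref slots or driver names.
    """
    errors = []

    if config.get("version") != 1:
        errors.append("version must currently be 1")

    routes = config.get("routes", {})
    for route_id, route in routes.items():
        # route_id must be non-empty and contain neither "/" nor whitespace.
        if not route_id or "/" in route_id or any(ch.isspace() for ch in route_id):
            errors.append(f"invalid route_id: {route_id!r}")
        # Every route must define both core fields.
        for required in ("driver", "default_model"):
            if not route.get(required):
                errors.append(f"route {route_id!r} is missing {required}")

    # With more than one route, the file itself must name a default_route.
    if len(routes) > 1 and not config.get("default_route"):
        errors.append("default_route must be set when multiple routes are configured")

    return errors


if __name__ == "__main__":
    sample = {
        "version": 1,
        "default_route": "openai",
        "routes": {
            "openai": {"driver": "openai", "default_model": "gpt-5.4"},
            "research": {
                "driver": "openrouter",
                "default_model": "openai/gpt-5.4-mini",
                "model_support": "any",
            },
        },
    }
    print(validate_routes_config(sample))
```

Returning a list of violations instead of raising on the first one makes it easier to report every problem in a routes file at once; whether the daemon reports errors this way is not stated here.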
Supported route fields:
- core route fields: driver, default_model
- recommended secret reference: auth_ref
- direct auth inputs: api_key, api_key_env
- model compatibility policy: model_support = "family" or model_support = "any"
- transport: base_url
- OpenAI-only metadata: organization, organization_env, project, project_env
- OpenAI-only auth fallback: openai_auth_source, openai_auth_file
- Anthropic-only auth fallback and request tuning: anthropic_auth_source, anthropic_credentials_file, anthropic_version, anthropic_beta_headers
- capability overrides: multimodal_input, native_web_search, image_generation, image_edit
Authentication rules:
- auth_ref, when present, must point at an existing daemon-managed secret slot
- the daemon validates all configured auth_ref values at startup
- runtime get returns auth_ref on default_route and each entry in routes when one is configured, but never the underlying secret bytes
- one secret slot can be shared by multiple routes without duplicating credentials in the route file
- OpenRouter routes are API-key-backed routes; they do not use OpenAI account auth imports such as Codex
Route selection and credential resolution are separate steps:
- session or run routing decides which route_id should be used
- the broker still resolves the actual credential material at request time
- the execution’s effective CredentialScope can therefore deny one route even when the route exists in the daemon inventory and was selected successfully
Use CredentialScope.route_allow and CredentialScope.route_deny when you need to delegate route access explicitly to one session or one child sidechain.
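As a rough illustration of how a scope can deny a route that was selected successfully, the sketch below models route_allow and route_deny as sets. Only the two field names come from this page. The set-based shape, the function route_permitted, and the evaluation rules (deny wins, an empty allow set imposes no restriction) are assumptions made for the example, not documented Kheish semantics.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CredentialScope:
    # Field names follow the text; representing them as sets is an assumption.
    route_allow: FrozenSet[str] = frozenset()
    route_deny: FrozenSet[str] = frozenset()


def route_permitted(scope: CredentialScope, route_id: str) -> bool:
    """Hypothetical check run after routing has already picked route_id."""
    # Assumed rule 1: an explicit deny always wins.
    if route_id in scope.route_deny:
        return False
    # Assumed rule 2: a non-empty allow set restricts access to its members.
    if scope.route_allow and route_id not in scope.route_allow:
        return False
    return True


if __name__ == "__main__":
    # A child sidechain delegated access to the research route only.
    child_scope = CredentialScope(route_allow=frozenset({"research"}))
    for route_id in ("research", "openai"):
        print(route_id, route_permitted(child_scope, route_id))
```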
Selector grammar
The daemon CLI normalizes route-aware model selectors on these commands:
- runtime set-model
- sessions input
- sessions set-route
- agents spawn-sidechain
- schedules create
Accepted selector forms:
- gpt-5.4
- <route_id>/<model>
For example, openrouter/openai/gpt-5.4-mini is valid when the configured route id is openrouter.
Important normalization rules:
- the route prefix is recognized only when the segment before the first / matches a known daemon route id
- otherwise the full value is treated as a raw model string
- <route_id>/ without a model suffix is rejected
- --fallback-model follows the same grammar as --model
- if --provider and a selector prefix point at different routes, the CLI fails instead of guessing
- when the selector contains a route prefix, that prefix is removed before the backend request is sent
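The normalization rules above can be approximated in a few lines. This is a sketch under stated assumptions: normalize_selector is a hypothetical name, known_routes stands in for the daemon route inventory, and the error type is chosen for the example only.

```python
from typing import Optional, Set, Tuple


def normalize_selector(
    selector: str,
    known_routes: Set[str],
    provider: Optional[str] = None,
) -> Tuple[Optional[str], str]:
    """Split a selector into (route_id, model) following the documented rules.

    Hypothetical helper: route_id is None when neither the selector nor
    the provider argument names a route.
    """
    prefix, slash, rest = selector.partition("/")

    # A prefix counts only if the segment before the first "/" is a known route id.
    if slash and prefix in known_routes:
        if not rest:
            raise ValueError(f"{selector!r}: route prefix without a model suffix")
        if provider is not None and provider != prefix:
            raise ValueError(
                f"--provider {provider!r} conflicts with selector route {prefix!r}"
            )
        # The prefix is stripped, so only the model string reaches the backend.
        return prefix, rest

    # Anything else is a raw model string, slashes included.
    return provider, selector


if __name__ == "__main__":
    routes = {"openai", "openrouter", "research"}
    print(normalize_selector("gpt-5.4", routes))
    print(normalize_selector("openrouter/openai/gpt-5.4-mini", routes))
    print(normalize_selector("vendor/model-x", routes))
```

Only the first slash matters, which is why a vendor-prefixed OpenRouter model keeps its own slash after the route prefix is removed.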
Route precedence
Kheish resolves the effective route in this order:
- explicit run override
- persisted session route policy
- daemon default route
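The precedence order reads naturally as a first-match lookup. The function below is an illustrative sketch; resolve_route and its parameter names are hypothetical and do not correspond to a documented Kheish API.

```python
from typing import Optional


def resolve_route(
    run_override: Optional[str],
    session_policy: Optional[str],
    daemon_default: str,
) -> str:
    """Pick the effective route id: run override, then session policy, then default."""
    for candidate in (run_override, session_policy):
        if candidate:
            return candidate
    return daemon_default


if __name__ == "__main__":
    # The session is pinned to research, but this one run asks for openai.
    print(resolve_route("openai", "research", "anthropic"))
    # No run override: the persisted session policy applies.
    print(resolve_route(None, "research", "anthropic"))
    # Neither is set: the daemon default applies.
    print(resolve_route(None, None, "anthropic"))
```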
Session route policy
Sessions can now persist a route policy instead of relying only on per-run overrides. Use this when one session should keep targeting a specific route across later runs, resumes, schedules, or mailbox execution.
Child sidechain sessions do not automatically inherit the parent session’s stored route policy. They only persist a route policy when the spawn request carries explicit route fields or an explicit route_policy.
This behavior is intentionally different from session personas. A child sidechain session does inherit the parent session’s bound persona snapshot at spawn time.
The control plane exposes this as a real session resource:
- POST /v1/sessions/{session_id}/route-policy
- PUT /v1/sessions/{session_id}/route-policy
- DELETE /v1/sessions/{session_id}/route-policy
The CLI equivalent is sessions set-route.
Run-scoped generation controls
Generation settings can be supplied with individual requests. Common controls include:
- route id carried in the provider request field
- model
- fallback model
- temperature
- max output tokens
- tool choice
- response format
The provider request field carries the selected route id after normalization. This keeps route selection explicit without forcing every session to change global defaults.
Capability-sensitive routing
Not every multimodal input requires the same model capability.
- image inputs require a vision-capable route
- supported document inputs can still execute on non-vision routes because the daemon renders bounded document text for the model
multimodal_input mainly covers image inputs and document-derived text or previews. Raw audio generation and speech-to-text are separate daemon features.
This distinction matters operationally. A route that is correct for text and PDF summaries may still be wrong for PNG or JPEG inspection.
Route capabilities are also surfaced through GET /v1/runtime, so operators and SDKs can inspect the daemon inventory before submitting work.
Driver defaults today are:
- Anthropic: multimodal input and native web search, but no daemon image backend
- Google: multimodal input plus daemon image generation and image editing, but no native web_search
- OpenAI: multimodal input, native web search, daemon image generation, and daemon image editing
- OpenRouter: multimodal input plus daemon image generation and image editing, but no native web_search
- xAI: multimodal input, native web search, and daemon image generation, but no daemon image editing
generate_audio is surfaced by control-tool availability, not by a dedicated GET /v1/runtime capability field today.
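For quick reference, the driver defaults listed above can be written as a lookup table combined with the per-route capability overrides from the routes file. The flag names match the capability override fields. The table name, the helper effective_capabilities, and the rule that an override simply replaces the driver default are assumptions for this sketch.

```python
from typing import Dict, Optional

# Driver defaults as described in this section; keys reuse the routes-file flag names.
DRIVER_DEFAULTS: Dict[str, Dict[str, bool]] = {
    "anthropic": {"multimodal_input": True, "native_web_search": True,
                  "image_generation": False, "image_edit": False},
    "google": {"multimodal_input": True, "native_web_search": False,
               "image_generation": True, "image_edit": True},
    "openai": {"multimodal_input": True, "native_web_search": True,
               "image_generation": True, "image_edit": True},
    "openrouter": {"multimodal_input": True, "native_web_search": False,
                   "image_generation": True, "image_edit": True},
    "xai": {"multimodal_input": True, "native_web_search": True,
            "image_generation": True, "image_edit": False},
}


def effective_capabilities(
    driver: str, overrides: Optional[Dict[str, bool]] = None
) -> Dict[str, bool]:
    """Merge route-level overrides over the driver defaults (assumed merge rule)."""
    caps = dict(DRIVER_DEFAULTS[driver])  # copy, so the table itself is never mutated
    for name, value in (overrides or {}).items():
        if name not in caps:
            raise KeyError(f"unknown capability flag: {name}")
        caps[name] = value
    return caps


if __name__ == "__main__":
    # A text-only research route on the openrouter driver.
    print(effective_capabilities("openrouter", {"multimodal_input": False}))
```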
Provider-native web search
web_search remains one logical tool, but the daemon now prefers a provider-native backend when the effective route supports it.
Current behavior:
- Anthropic routes can use native provider web search
- Google routes currently do not expose native provider web search through web_search
- OpenAI routes can use native provider web search
- OpenRouter routes currently do not expose native provider web search through web_search
- xAI routes can use native provider web search
- unsupported routes or unsupported request shapes fall back to the local DuckDuckGo HTML implementation
web_fetch remains daemon-local.
Image routing
The image tools are now selected by image route id, not only by provider family. Current behavior:
- generate_image resolves the backend in this order: explicit tool route override, current run route when that route has an image backend, then the image service default route
- edit_image resolves the backend in this order: explicit tool route override, current run route when that route supports editing, then the default image route when it supports editing, then the first configured edit-capable backend
- the tool request field is still called provider for compatibility, but on a named-route daemon it means one configured image route id
- the tool response field provider still reports the underlying backend provider such as openai or google
- route-level image_generation and image_edit are independent capabilities; a route can expose generation without exposing editing
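The two resolution orders differ only in their final fallbacks, which the sketch below makes explicit. ImageRoute and both function names are hypothetical. The sketch also assumes an explicit override is returned as-is; how the daemon validates an override that lacks the required capability is not described here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ImageRoute:
    route_id: str
    image_generation: bool
    image_edit: bool


def resolve_generate_backend(
    routes: List[ImageRoute],
    explicit: Optional[str],
    run_route: Optional[str],
    default_route: str,
) -> str:
    """Order: explicit override, run route if it can generate, image default route."""
    by_id = {r.route_id: r for r in routes}
    if explicit is not None:
        return explicit  # assumption: override validation happens elsewhere
    run = by_id.get(run_route)
    if run is not None and run.image_generation:
        return run.route_id
    return default_route


def resolve_edit_backend(
    routes: List[ImageRoute],
    explicit: Optional[str],
    run_route: Optional[str],
    default_route: Optional[str],
) -> Optional[str]:
    """Order: explicit override, run route, default route, first edit-capable route."""
    by_id = {r.route_id: r for r in routes}
    if explicit is not None:
        return explicit  # assumption: override validation happens elsewhere
    for candidate in (run_route, default_route):
        route = by_id.get(candidate)
        if route is not None and route.image_edit:
            return route.route_id
    # Last resort: scan routes in configured order for any edit-capable one.
    for route in routes:
        if route.image_edit:
            return route.route_id
    return None


if __name__ == "__main__":
    configured = [
        ImageRoute("anthropic", image_generation=False, image_edit=False),
        ImageRoute("xai", image_generation=True, image_edit=False),
        ImageRoute("openai", image_generation=True, image_edit=True),
    ]
    # A run on xai generates on xai but must edit elsewhere.
    print(resolve_generate_backend(configured, None, "xai", "openai"))
    print(resolve_edit_backend(configured, None, "xai", "xai"))
```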
Audio routing
generate_audio follows the same route-first pattern as the image tools.
Current behavior:
- generate_audio resolves the backend in this order: explicit tool route override, current run route when that route has an audio backend, then the audio service default route
- the tool request field is still called provider for compatibility, but on a named-route daemon it means one configured audio route id
- the tool response field provider reports the underlying backend provider
- the built-in audio-generation backend today is OpenRouter
- audio generation currently appears through tool availability rather than one dedicated route capability flag in runtime get
Fallback behavior
Kheish can be configured with a primary route and an optional fallback model on that same resolved route. The routing layer does not pretend that all routes are interchangeable.
--fallback-model uses the same selector grammar as --model, but after the primary route is resolved, the fallback is validated against that route and stored as a normalized model string.
Changing the daemon default route does not rewrite:
- persisted session route policies
- queued runs
- active runs
- restored suspended runs
Context-sensitive prompt recovery
Recovered run memory is packed against prompt budget before provider submission. In practice this means Kheish does not append recovered memory blindly. It estimates the current prompt size, reserves output tokens, applies a safety buffer, and then injects only the recovered-memory entries that still fit. If they do not fit, it omits them rather than forcing a context overflow. Read Recovered run memory for the daemon-side storage and retention model.
Operational guidance
Use explicit route and model settings in live validation and production-sensitive flows. This is especially useful when:
- comparing route behavior on the same task
- validating provider-native web behavior against the local fallback
- running sidechains with lower-cost or faster models
- ensuring that one run does not inherit the wrong session or daemon default route unexpectedly
