State and recovery
Durability is a core design property of Kheish.Session journal
Session state is persisted as append-only JSONL records. The journal stores:- conversation events
- checkpoints
- metadata records
- permission audit records
- output records
- the optional bound persona snapshot for that session, including the bound capability baseline and resolved persona default inline skills
- the optional persisted session capability scope override
- the optional persisted session credential scope override
- the optional persisted session reply-target defaults
- channel records
- public message logs
- aggregated reactions
- public turn leases
- queued channel stimuli
- canonical thread-work state, including work bindings and progress snapshots
Checkpoints and compaction
Long-running sessions can accumulate too much context to replay verbatim. Kheish uses checkpoints and compaction to:- summarize older context
- preserve durable runtime metadata
- restore active control sections such as tasks, hooks, skills, and MCP instructions after compaction
Restore model
On restart, the daemon rebuilds runtime state from persisted records. This includes:- the root session and agent mapping
- stored checkpoints and metadata
- persisted session persona bindings
- persisted session capability scope overrides
- persisted session credential scope overrides
- persisted session reply-target defaults
- persisted channels, public messages, reactions, turn leases, stimuli, and canonical thread-work state
- persisted projects, project members, linked channels, and project tasks
- persisted Playbook catalogs and Flow correlation records
- pending approvals and structured questions
- active inline skills and control state
- run scheduler and delivery queue state
- daemon-managed runtime connectors
ChannelDelivery runs. The daemon can:
- re-queue claimed stimuli that still need worker attention
- reuse an already materialized public post instead of duplicating it
- rebuild the canonical thread-work projection from durable public messages
- repair or drop stale progress snapshots and work bindings when they no longer match the correct root thread
State root discipline
Operationally, the state root matters as much as the daemon binary. Many apparent bugs are actually state-root mixups, schema drift between binaries, or stale sessions created by an older daemon build. Treat the active state root as part of the deployment identity. The state root also contains daemon-managed connectors, the encrypted secret slots they may reference, broker revocation state for issued auth grants and leases, signed external-action audit ledgers, Playbook catalogs, and Flow correlation records. When you migrate or back up a daemon, keep the session journal, project store, channel store, playbook/flow store, connector store, auth store, and external-action audit state together.Evidence Note
- Code verified:
crates/kheish-daemon/src/playbooks.rs,crates/kheish-daemon/src/services/playbook.rs,crates/kheish-daemon/src/state/playbook_workflow.rs,crates/kheish-daemon/src/state/persistence.rs. - CLI/API verified: Playbook/Flow persistence and recovery are exposed through
playbooks get/listandflows get/list. - Daemon live tested for this note: no; deterministic restart/API tests cover Flow record recovery.
- Provider-specific tested for this note: no; state restoration is provider-neutral.
