Recovered run memory
Recovered run memory is Kheish’s daemon-owned episodic memory layer. It is built from terminal runs and exists for two related jobs:

- recover a small amount of recent run history into the next prompt without replaying whole transcripts
- expose compact run-derived memory to operator inspection and session-scoped memory search
What it is
Recovered run memory is currently:

- daemon-owned rather than model-provider-owned
- derived from persisted run state
- compact and episodic rather than semantic or procedural
- prompt-bounded and best-effort
- restart-safe
- only the current session receives a recovered-memory prompt section
- only a small bounded subset of recent entries is injected
- the daemon stores run-memory records separately from the session journal
- the daemon indexes them by session and by visible learning scope
- session memory search can browse or query recovered runs visible through the session’s current learning scopes
What is captured
Kheish only builds recovered run memory from terminal runs: completed, failed, interrupted, and cancelled. Each record captures:

- the owning session_id
- the originating run_id
- the capture timestamp
- the terminal run status
- one compact summary
- optional request_preview
- optional outcome_preview
- optional daemon-owned artifact_ids
- optional compact failure_markers
- visible scope_keys, retained for later search and retrieval
- one semantic-capture replay receipt
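As a rough illustration, a record with the fields above could be modeled like the following sketch. The field names follow the documented capture list; everything else (types, the `captured_at` name, timestamp units, the default receipt state) is an assumption, not Kheish's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative sketch only: names mirror the documented capture list;
# concrete types and serialization are assumptions.
@dataclass
class RunMemoryRecord:
    session_id: str                 # owning session
    run_id: str                     # originating run
    captured_at: float              # capture timestamp (epoch seconds, assumed)
    status: str                     # terminal status: completed/failed/interrupted/cancelled
    summary: str                    # one compact summary
    request_preview: Optional[str] = None
    outcome_preview: Optional[str] = None
    artifact_ids: list[str] = field(default_factory=list)    # daemon-owned artifact ids
    failure_markers: list[str] = field(default_factory=list)
    scope_keys: list[str] = field(default_factory=list)      # visible scopes for later search
    capture_receipt: str = "pending"                         # semantic-capture replay receipt
```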
The compact summary is taken from the best available source:

- the run request preview
- the latest recorded assistant text
- an error preview for failed runs
- a terminal status note when no better output exists
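The selection above can be sketched as a first-non-empty pick. Note the hedge: the document only fixes the last resort (the terminal status note "when no better output exists"); the precedence among the other sources below is an assumption for illustration.

```python
def pick_summary(assistant_text, request_preview, error_preview, status):
    # Assumed precedence; the doc only guarantees the status note is the
    # fallback when no better output exists. error_preview is expected to be
    # None for runs that did not fail.
    for candidate in (assistant_text, error_preview, request_preview):
        if candidate:
            return candidate
    return f"run ended with status: {status}"
```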
Storage model
Recovered run memory lives outside the append-only session journal. The current storage layout has two parts:

- a filesystem-backed run-memories/ store under the daemon state root
- a daemon topology index that tracks run-memory pointers by session and by scope
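A minimal sketch of this two-part layout, assuming one JSON file per record under run-memories/ and a plain dict index; the real daemon's file naming and index structure are not specified by this document.

```python
import json
import os

# Hypothetical layout: run-memories/<run_id>.json under the state root,
# plus an index of pointers keyed by session and by scope key.
def store_record(state_root, record):
    path = os.path.join(state_root, "run-memories", f"{record['run_id']}.json")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(record, f)
    return path

def index_record(index, record, path):
    # Track the same pointer under the owning session and each visible scope.
    index.setdefault("by_session", {}).setdefault(record["session_id"], []).append(path)
    for scope in record.get("scope_keys", []):
        index.setdefault("by_scope", {}).setdefault(scope, []).append(path)
```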
Prompt projection
On a new input, the daemon loads the current session’s tracked run-memory pointers and builds one bounded recovered_memory bundle.
Current limits:
- up to 32 tracked run-memory pointers per session
- up to 3 recovered-memory entries injected into one prompt
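The bounded projection can be sketched as below. The two limits come from the document; whether the daemon orders strictly by capture time, newest first, is an assumption.

```python
MAX_TRACKED_POINTERS = 32   # tracked run-memory pointers per session
MAX_INJECTED_ENTRIES = 3    # recovered-memory entries per prompt

def recovered_memory_bundle(pointers):
    # Assumed ordering: newest capture first.
    tracked = sorted(pointers, key=lambda p: p["captured_at"], reverse=True)
    tracked = tracked[:MAX_TRACKED_POINTERS]
    return tracked[:MAX_INJECTED_ENTRIES]
```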
The bundle reaches the prompt as a derived recovered_memory section. The core engine strips that derived payload before persisting the canonical input event so the session journal does not duplicate recovered memory back into the transcript.
Operators can inspect the effective projection through:
GET /v1/sessions/{session_id}/memory-context
Session memory search
Recovered run memory also participates in the session memory-search surface:

GET /v1/sessions/{session_id}/memory-search
- prompt injection uses only the current session’s bounded recovered-memory bundle
- memory search can browse or query recovered runs visible through the session’s current learning scopes
- when query is omitted, the daemon returns a recent browse view
- when query is present, the daemon ranks visible learnings, recovered runs, and visible skills lexically
- recovered-run results come from the daemon’s tracked run-memory index, not from transcript replay
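The browse/query split can be sketched as follows. "Lexical" ranking here is a simple token-overlap score over summaries, which is an assumption; the daemon's actual scoring function is not specified.

```python
def memory_search(entries, query=None, limit=10):
    if query is None:
        # Browse view: most recent entries first.
        return sorted(entries, key=lambda e: e["captured_at"], reverse=True)[:limit]
    # Query view: rank lexically (here: token overlap with the summary).
    terms = set(query.lower().split())
    def score(e):
        return len(terms & set(e["summary"].lower().split()))
    ranked = [e for e in entries if score(e) > 0]
    return sorted(ranked, key=score, reverse=True)[:limit]
```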
Budgeting and overflow avoidance
Recovered run memory is packed before prompt submission instead of being appended blindly. The runtime estimates the current prompt cost from:

- the system sections
- visible messages
- active tool state
- restored compacted history
- the incoming input payload
- the daemon compaction policy budget
- model-aware context-window reservations when Kheish knows the selected model family
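Given that estimate, packing can be sketched as a greedy fill that stops before the budget is exceeded. The 4-characters-per-token estimate below is an illustrative assumption, not Kheish's tokenizer.

```python
def estimate_tokens(text):
    # Crude heuristic (assumption): roughly 4 characters per token.
    return len(text) // 4

def pack_recovered_memory(base_cost, entries, budget):
    # base_cost: estimated tokens for system sections, visible messages,
    # tool state, restored history, and the incoming input.
    packed, cost = [], base_cost
    for entry in entries:
        entry_cost = estimate_tokens(entry["summary"])
        if cost + entry_cost > budget:
            break  # never append blindly past the budget
        packed.append(entry)
        cost += entry_cost
    return packed
```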
Retention and pruning
Recovered run memory currently uses daemon-side retention rules:

- records older than 30 days are pruned
- only the newest 32 tracked records per session are kept
- pruned records are deleted from run-memories/
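The two documented retention rules compose naturally into one pruning pass, sketched here with epoch-second timestamps (an assumed representation):

```python
MAX_AGE_DAYS = 30
MAX_PER_SESSION = 32

def prune(records, now):
    # Rule 1: drop records older than 30 days.
    cutoff = now - MAX_AGE_DAYS * 86400
    fresh = [r for r in records if r["captured_at"] >= cutoff]
    # Rule 2: keep only the newest 32 per session.
    fresh.sort(key=lambda r: r["captured_at"], reverse=True)
    return fresh[:MAX_PER_SESSION]  # the caller deletes the rest from run-memories/
```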
Semantic-capture replay receipts
Run-memory records now also carry a durable semantic-capture receipt with three states: pending, completed, and skipped. The receipt lets the daemon:

- mark a completed run as needing semantic extraction
- survive crashes and restarts without duplicating extraction
- replay only unfinished capture work on boot
A receipt stays pending until extraction has either completed or abstained. On boot, the daemon replays only those pending records.
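The boot-time replay loop can be sketched as below: only pending receipts are re-attempted, and both terminal states (completed and skipped) are never revisited, which is what keeps extraction idempotent across crashes. The `extract` callback and its boolean return are illustrative assumptions.

```python
def replay_on_boot(records, extract):
    # Replay only unfinished capture work; completed/skipped are terminal.
    for record in records:
        if record["receipt"] != "pending":
            continue
        result = extract(record)  # may complete (truthy) or abstain (falsy)
        record["receipt"] = "completed" if result else "skipped"
```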
Current scope
Recovered run memory is still intentionally narrow:

- compact episodic memory only
- no vector store
- no embeddings
- no free-form semantic promotion inside this layer
- no procedural promotion inside this layer
Validation
This implementation is covered by:

- unit tests for indexing, retention, pruning, prompt budgeting, and file-store behavior
- real-daemon tests for recovery, restart, corrupted files, and pruning
- live-provider tests for both OpenAI and Anthropic that verify recovered memory appears in provider request debug artifacts
