Recovered run memory

Recovered run memory is Kheish’s daemon-owned episodic memory layer. It is built from terminal runs and exists for two related jobs:
  • recover a small amount of recent run history into the next prompt without replaying whole transcripts
  • expose compact run-derived memory to operator inspection and session-scoped memory search
It is not the same thing as durable semantic learning, and it is not a free-form user note system.

What it is

Recovered run memory is currently:
  • daemon-owned rather than model-provider-owned
  • derived from persisted run state
  • compact and episodic rather than semantic or procedural
  • prompt-bounded and best-effort
  • restart-safe
The prompt projection remains intentionally narrow:
  • only the current session receives a recovered-memory prompt section
  • only a small bounded subset of recent entries is injected
The storage and search surfaces are broader:
  • the daemon stores run-memory records separately from the session journal
  • the daemon indexes them by session and by visible learning scope
  • session memory search can browse or query recovered runs visible through the session’s current learning scopes

What is captured

Kheish only builds recovered run memory from terminal runs:
  • completed
  • failed
  • interrupted
  • cancelled
Each durable run-memory record stores:
  • the owning session_id
  • the originating run_id
  • the capture timestamp
  • the terminal run status
  • one compact summary
  • optional request_preview
  • optional outcome_preview
  • optional daemon-owned artifact_ids
  • optional compact failure_markers
  • visible scope_keys retained for later search and retrieval
  • one semantic-capture replay receipt
The compact summary is derived from durable daemon state rather than from caller-local prompt text. It may include:
  • the run request preview
  • the latest recorded assistant text
  • an error preview for failed runs
  • a terminal status note when no better output exists
The summary is capped daemon-side; a full transcript excerpt is never stored.
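The record shape and the summary-selection order can be sketched as follows. This is an illustrative Python model, not the daemon's actual types: the field names mirror the list above, but `SUMMARY_CAP` and the exact precedence among the summary sources are assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional

SUMMARY_CAP = 400  # hypothetical cap; the real daemon-side limit is not specified here


@dataclass
class RunMemoryRecord:
    """One durable run-memory record, mirroring the fields listed above."""
    session_id: str
    run_id: str
    captured_at: float                 # capture timestamp (epoch seconds)
    status: str                        # completed | failed | interrupted | cancelled
    summary: str                       # one compact, capped summary
    request_preview: Optional[str] = None
    outcome_preview: Optional[str] = None
    artifact_ids: list[str] = field(default_factory=list)
    failure_markers: list[str] = field(default_factory=list)
    scope_keys: list[str] = field(default_factory=list)  # visible learning scopes
    capture_receipt: str = "pending"   # pending | completed | skipped


def derive_summary(request_preview, assistant_text, error_preview, status):
    """Pick the best available summary source, then cap it daemon-side."""
    if status == "failed" and error_preview:
        text = error_preview                       # error preview for failed runs
    else:
        # latest assistant text, else the request preview, else a status note
        text = assistant_text or request_preview or f"run ended with status {status}"
    return text[:SUMMARY_CAP]
```

The key property is the last fallback: even a run with no usable output still yields a non-empty summary, so every terminal run produces a record.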

Storage model

Recovered run memory lives outside the append-only session journal. The current storage layout has two parts:
  • a filesystem-backed run-memories/ store under the daemon state root
  • a daemon topology index that tracks run-memory pointers by session and by scope
The session journal and checkpoints remain the canonical conversation history. Recovered run memory is a derived read model used for recovery and search.
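The two-part layout can be sketched like this. The split between a file-backed `run-memories/` store and a pointer index comes from the doc; the JSON file naming, index shape, and function names are illustrative assumptions.

```python
import json
from pathlib import Path


def store_run_memory(state_root: Path, record: dict) -> Path:
    """Persist one record under run-memories/ in the daemon state root."""
    store = state_root / "run-memories"
    store.mkdir(parents=True, exist_ok=True)
    path = store / f"{record['run_id']}.json"   # hypothetical file naming
    path.write_text(json.dumps(record))
    return path


def index_pointer(index: dict, record: dict, path: Path) -> None:
    """Track the stored record by session and by each visible scope key."""
    index.setdefault("by_session", {}).setdefault(record["session_id"], []).append(str(path))
    for scope in record.get("scope_keys", []):
        index.setdefault("by_scope", {}).setdefault(scope, []).append(str(path))
```

Because the index holds only pointers, losing or corrupting a run-memory file never damages the canonical journal; the pointer is simply pruned later.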

Prompt projection

On each new input, the daemon loads the current session’s tracked run-memory pointers and builds one bounded recovered_memory bundle. Current limits:
  • up to 32 tracked run-memory pointers per session
  • up to 3 recovered-memory entries injected into one prompt
Entries are ordered newest first. If a run-memory file is missing or unreadable, the daemon skips it, prunes the stale pointer, and continues the run. The runtime then packs the recovered bundle into one system section named recovered_memory. The core engine strips that derived payload before persisting the canonical input event, so the session journal does not duplicate recovered memory back into the transcript.

Operators can inspect the effective projection through:
  • GET /v1/sessions/{session_id}/memory-context
That derived session view shows recovered memory alongside learned context and session-visible skills. Recovered run memory also participates in the session memory-search surface:
  • GET /v1/sessions/{session_id}/memory-search
Important distinction:
  • prompt injection uses only the current session’s bounded recovered-memory bundle
  • memory search can browse or query recovered runs visible through the session’s current learning scopes
That means a session can search more recovered run records than it will automatically inject into the next prompt. Current search behavior:
  • when query is omitted, the daemon returns a recent browse view
  • when query is present, the daemon ranks visible learnings, recovered runs, and visible skills lexically
  • recovered-run results come from the daemon’s tracked run-memory index, not from transcript replay
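The bundle-building step described above (bounded pointers, newest-first ordering, skip-and-prune on unreadable files) can be sketched as follows. The two limits come from the doc; the function shape and the `load` callback are assumptions.

```python
MAX_TRACKED = 32    # tracked run-memory pointers per session
MAX_INJECTED = 3    # recovered-memory entries injected into one prompt


def build_recovered_bundle(pointers, load):
    """Build the bounded recovered_memory bundle for one session.

    `pointers` is ordered oldest to newest; `load(ptr)` returns the record
    dict, or None when the file is missing or unreadable. Stale pointers
    are pruned in place and the run continues.
    """
    tracked = pointers[-MAX_TRACKED:]
    bundle, alive = [], []
    for ptr in reversed(tracked):          # newest first
        record = load(ptr)
        if record is None:
            continue                       # skip and drop the stale pointer
        alive.append(ptr)
        if len(bundle) < MAX_INJECTED:
            bundle.append(record)
    pointers[:] = list(reversed(alive))    # pruned pointer list, oldest first
    return bundle
```

Note that pruning and injection are independent: a pointer past the injection limit is still kept for later search, while a dead pointer is removed even if it would have fit.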

Budgeting and overflow avoidance

Recovered run memory is packed before prompt submission instead of being appended blindly. The runtime estimates the current prompt cost from:
  • the system sections
  • visible messages
  • active tool state
  • restored compacted history
  • the incoming input payload
It then applies a recovered-memory budget derived from:
  • the daemon compaction policy budget
  • model-aware context-window reservations when Kheish knows the selected model family
When recovered memory does not fit, Kheish drops older entries first. If nothing fits at all, it omits recovered memory entirely rather than risk overflowing the prompt.
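The drop-oldest-first packing can be sketched as a single pass over the entries, newest first, under a token budget. The function name, the `estimate_tokens` callback, and treating the budget as a plain integer are all assumptions; the real runtime derives the budget from the compaction policy and model-aware context-window reservations.

```python
def pack_recovered_memory(entries_newest_first, estimate_tokens, budget):
    """Keep the newest entries that fit the recovered-memory budget.

    Stops at the first entry that would overflow, so everything older
    than it is dropped too. An empty result means recovered memory is
    omitted from the prompt entirely.
    """
    kept, used = [], 0
    for entry in entries_newest_first:     # newest entries have priority
        cost = estimate_tokens(entry)
        if used + cost > budget:
            break                          # drop this entry and all older ones
        kept.append(entry)
        used += cost
    return kept
```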

Retention and pruning

Recovered run memory currently uses daemon-side retention rules:
  • records older than 30 days are pruned
  • only the newest 32 tracked records per session are kept
  • pruned records are deleted from run-memories/
The daemon rebuilds the run-memory index on boot from persisted runs plus persisted run-memory files, then writes the pruned result back to the topology index.
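The retention rules reduce to two filters, applied per session. The constants come from the doc; the function shape and record layout are illustrative, and deleting the pruned files from run-memories/ is left to the caller here.

```python
import time

RETENTION_DAYS = 30
MAX_PER_SESSION = 32


def prune_session_records(records, now=None):
    """Apply daemon-side retention to one session's run-memory records.

    Returns (kept, pruned); the caller would delete each pruned record's
    file from run-memories/ and rewrite the topology index.
    """
    now = now if now is not None else time.time()
    cutoff = now - RETENTION_DAYS * 86400
    fresh = [r for r in records if r["captured_at"] >= cutoff]
    fresh.sort(key=lambda r: r["captured_at"])
    kept = fresh[-MAX_PER_SESSION:]        # only the newest 32 survive
    pruned = [r for r in records if r not in kept]
    return kept, pruned
```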

Semantic-capture replay receipts

Run-memory records now also carry a durable semantic-capture receipt in one of three states:
  • pending
  • completed
  • skipped
This receipt does not change the recovered-memory prompt path directly. It exists so daemon-owned semantic capture can:
  • mark a completed run as needing semantic extraction
  • survive crashes and restarts without duplicating extraction
  • replay only unfinished capture work on boot
When semantic capture is enabled, completed runs can leave their run-memory record in pending until extraction has either completed or abstained. On boot, the daemon replays only those pending records.
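The boot-time replay reduces to filtering on the receipt state. This sketch assumes an `extract` callback that returns the terminal receipt value ("completed" or "skipped"); the function name is hypothetical.

```python
def replay_pending_captures(records, extract):
    """On boot, replay semantic capture only for records still pending.

    Records already marked completed or skipped are never re-extracted,
    which keeps the replay idempotent across crashes and restarts.
    """
    for record in records:
        if record["capture_receipt"] != "pending":
            continue
        record["capture_receipt"] = extract(record)  # "completed" or "skipped"
```

Because the receipt is part of the durable record, a crash between extraction and receipt persistence at worst replays one capture, never silently loses one.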

Current scope

Recovered run memory is still intentionally narrow:
  • compact episodic memory only
  • no vector store
  • no embeddings
  • no free-form semantic promotion inside this layer
  • no procedural promotion inside this layer
Durable semantic memory and promoted procedures live in the separate learning plane.

Validation

This implementation is covered by:
  • unit tests for indexing, retention, pruning, prompt budgeting, and file-store behavior
  • real-daemon tests for recovery, restart, corrupted files, and pruning
  • live-provider tests for both OpenAI and Anthropic that verify recovered memory appears in provider request debug artifacts