Assets API
The assets API stores files inside the daemon state root and returns stable identifiers that can be referenced from later session inputs.
HTTP endpoints
- GET /v1/assets
- GET /v1/assets/{asset_id}
- GET /v1/assets/{asset_id}/references
- GET /v1/assets/{asset_id}/raw
- POST /v1/assets
- DELETE /v1/assets/{asset_id}?dry_run=true
- DELETE /v1/assets/{asset_id}
- POST /v1/assets/gc
GET /v1/assets supports an optional query filter over the asset identifier, file name, media type, digest, or derivation id.
Create an asset
POST /v1/assets accepts one inline upload payload:
Asset views
Asset summaries returned by GET /v1/assets include:
- asset_id
- media_type
- file_name
- sha256
- byte_length
- created_at_ms
GET /v1/assets/{asset_id} returns the full asset view, including:
- uri, an opaque daemon-managed raw storage reference
- text_uri, an opaque daemon-managed derived-text reference when the daemon extracted prompt-ready text from the asset
- text_sha256 and text_byte_length, integrity metadata for the derived text payload when present
- preview_image_uri, an opaque daemon-managed visual preview reference when one exists
- preview_image_media_type, the MIME type for that preview
- preview_image_sha256 and preview_image_byte_length, integrity metadata for the derived preview payload when present
- derivation_ids, derivations whose result points at this asset
Reference inspection
GET /v1/assets/{asset_id}/references returns the durable places that currently point at one asset:
Delete and GC
DELETE /v1/assets/{asset_id}?dry_run=true returns the deletion plan without mutating state. The
plan includes blocked, reclaimable_bytes, the removable daemon-owned files, and the same hard
and soft references used by the reference endpoint.
DELETE /v1/assets/{asset_id} executes the same plan only when hard_reference_count is zero. Hard
references return 409 application/problem+json with domain: "assets" and
code: "asset_delete_blocked". Soft references, currently purged observation lineage, do not block
physical deletion.
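The dry-run and delete behavior above could be handled on the client side roughly as follows. This is a minimal sketch, not the daemon's client library: the base URL, the helper names delete_url and classify_delete_response, and the assumption that a successful delete answers with a 2xx status are all illustrative. Only the paths, the dry_run=true query, the 409 status, and the domain/code values come from the text above.

```python
import json
from urllib.parse import quote


def delete_url(base_url: str, asset_id: str, dry_run: bool) -> str:
    """Build the DELETE URL; dry_run=true requests the plan without mutation."""
    url = f"{base_url.rstrip('/')}/v1/assets/{quote(asset_id, safe='')}"
    return f"{url}?dry_run=true" if dry_run else url


def classify_delete_response(status: int, content_type: str, body: bytes) -> str:
    """Map a DELETE response to 'deleted', 'blocked', or 'error'.

    Assumption: any 2xx status means the delete (or dry run) succeeded.
    """
    if 200 <= status < 300:
        return "deleted"
    media_type = content_type.split(";", 1)[0].strip().lower()
    if status == 409 and media_type == "application/problem+json":
        problem = json.loads(body)
        if (problem.get("domain") == "assets"
                and problem.get("code") == "asset_delete_blocked"):
            return "blocked"
    return "error"
```

A caller would typically request the dry-run URL first, inspect blocked and reclaimable_bytes in the plan, and only then issue the real DELETE.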
POST /v1/assets/gc plans or executes catalog-asset garbage collection:
dry_run defaults to true. GC execution deletes only catalog assets with no hard references and
reports blocked assets instead of failing the whole operation. It also reports and can remove
orphan payload files under assets/raw, assets/text, and assets/preview when no loaded asset
metadata references their opaque asset://... URI. Daemon atomic-write temp files and tombstones
are ignored. Reclaimable bytes are computed from current file metadata for raw payloads, owned
derived text/previews, asset metadata, and orphan payload files; tombstones are not counted as
reclaimable.
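The planning rule described above (delete only assets without hard references, report the rest as blocked, and count orphan payload files toward reclaimable bytes) can be illustrated with a small model. This is a sketch under stated assumptions: the CatalogAsset type, its owned_bytes field, and the plan_gc function are hypothetical, and the real plan shape returned by the daemon may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CatalogAsset:
    asset_id: str
    hard_reference_count: int
    soft_reference_count: int
    owned_bytes: int  # hypothetical: raw payload + owned derived files + metadata


def plan_gc(assets: List[CatalogAsset], orphan_file_sizes: List[int]) -> Dict[str, object]:
    """Split assets into removable and blocked, and total the reclaimable bytes."""
    removable = [a for a in assets if a.hard_reference_count == 0]
    blocked = [a for a in assets if a.hard_reference_count > 0]
    reclaimable = sum(a.owned_bytes for a in removable) + sum(orphan_file_sizes)
    return {
        "removable": [a.asset_id for a in removable],
        "blocked": [a.asset_id for a in blocked],
        "reclaimable_bytes": reclaimable,
    }
```

Soft references are carried in the model but deliberately ignored by the split, mirroring the statement that they do not block physical deletion.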
Supported media types
The daemon currently accepts:
- text/plain
- text/csv
- text/markdown
- application/json
- application/pdf
- application/dxf
- image/png
- image/jpeg
- audio/wav
- audio/webm
- audio/mpeg
- audio/mpga
- audio/opus
- audio/aac
- audio/flac
- audio/mp4
- audio/m4a
- audio/pcm
- audio/l16
- audio/l24
Media types may carry parameters, such as audio/webm;codecs=opus; the daemon strips
those parameters before alias normalization and validation. Unsupported or mismatched file types are
rejected before the asset is stored. Audio imports are
sniffed before persistence: WAV must be RIFF/WAVE with a coherent PCM/float/extensible fmt chunk,
non-empty block-aligned data, and either exact RIFF lengths or provider-style streaming length
sentinels, plus an in-limit calculable duration. WebM must expose the WebM DocType, a Segment,
a supported Opus/Vorbis audio track, and non-empty media data. MP3/MPGA must
contain a complete MPEG audio frame directly or after an ID3 tag, and the first frame’s declared
sample rate/channel mode must fit daemon limits. Opus must be an Ogg page with an OpusHead
packet and valid channel count. AAC must be ADTS/ADIF with bounded frame length and, for ADTS,
valid sample-rate/channel metadata. FLAC must include the mandatory STREAMINFO block with valid
sample-rate/channel metadata and an in-limit duration when total samples are declared. MP4/M4A must
contain ftyp, non-empty media data, and an audio track handler. Raw PCM formats do not have
container magic or channel/sample-rate metadata, so the daemon enforces non-empty payloads and
sample-width alignment for audio/pcm, audio/l16, and audio/l24.
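The raw PCM rule at the end of this paragraph is simple enough to sketch. The example below checks only the two properties the text names, a non-empty payload and sample-width alignment. The widths for audio/l16 (2 bytes) and audio/l24 (3 bytes) follow from the format names; the 2-byte width for audio/pcm is an assumption, as is the function name check_raw_pcm.

```python
from typing import Optional

# Bytes per sample. The audio/pcm entry is an assumption (16-bit samples).
SAMPLE_WIDTH_BYTES = {
    "audio/l16": 2,
    "audio/l24": 3,
    "audio/pcm": 2,
}


def check_raw_pcm(media_type: str, payload: bytes) -> Optional[str]:
    """Return a rejection reason, or None when the payload passes both checks."""
    width = SAMPLE_WIDTH_BYTES.get(media_type)
    if width is None:
        raise ValueError(f"not a raw PCM media type: {media_type}")
    if len(payload) == 0:
        return "empty payload"
    if len(payload) % width != 0:
        return f"payload length is not a multiple of {width} bytes"
    return None
```

Container formats such as WAV or FLAC need real header parsing and are intentionally out of scope for this sketch.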
Normalization behavior
Important daemon-side rules:
- imports are deduplicated by normalized media type and content digest
- CSV aliases normalize to text/csv
- DXF aliases normalize to application/dxf
- MP3 aliases normalize to audio/mpeg, MPGA aliases normalize to audio/mpga, AAC/FLAC/OPUS aliases normalize to their audio/* forms, PCM aliases normalize to audio/pcm, and M4A aliases normalize to audio/m4a
- PNG and JPEG uploads are decoded and re-encoded into normalized daemon-owned image payloads
- PDFs and supported text documents keep their raw payloads and, when possible, a derived text representation
- supported audio assets can later gain a derived text_uri when the daemon performs speech-to-text and a transcription backend is configured
- generated audio and transcription uploads revalidate through the same daemon audio parser before persistence/provider upload
- DXF uploads also derive a PNG preview when the daemon can parse the plan
- derived document text is what text-only routes consume later during prompt rendering
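Parameter stripping, alias normalization, and deduplication by normalized media type plus content digest can be sketched together. The alias table below is hypothetical (the daemon's actual alias list is not given here), lowercasing is an assumption, and the sketch hashes whatever bytes it is handed without taking a position on whether the daemon digests the raw or the re-encoded payload.

```python
import hashlib
from typing import Tuple

# Hypothetical alias table for illustration only.
ALIASES = {
    "audio/mp3": "audio/mpeg",
    "application/csv": "text/csv",
    "image/vnd.dxf": "application/dxf",
}


def normalize_media_type(value: str) -> str:
    """Strip parameters first, then apply alias normalization."""
    base = value.split(";", 1)[0].strip().lower()
    return ALIASES.get(base, base)


def dedupe_key(media_type: str, payload: bytes) -> Tuple[str, str]:
    """Key under which identical imports would collapse into one asset."""
    return normalize_media_type(media_type), hashlib.sha256(payload).hexdigest()
```

With this key, an upload declared as audio/mp3 and a second upload of the same bytes declared as audio/mpeg map to the same entry.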
Use GET /v1/assets/{asset_id}/raw when a client needs the exact persisted bytes rather than the metadata view.
Raw reads re-check the stored byte length and SHA-256 before returning bytes. Derived text and preview reads also re-check their stored byte length and SHA-256 before prompt rendering, derivation/observation materialization, or provider attachment dispatch. If an on-disk payload changes while the daemon is running, the endpoint or run path returns 409 application/problem+json with domain: "assets" and code: "asset_integrity_mismatch".
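The re-check described above amounts to comparing the bytes on disk against the stored byte length and SHA-256 before they are used. A minimal sketch follows; the exception class and function name are illustrative and not the daemon's own, and the mapping from the exception to the 409 response is left to the caller.

```python
import hashlib


class AssetIntegrityMismatch(Exception):
    """Illustrative stand-in for the asset_integrity_mismatch problem."""


def verify_payload(payload: bytes, expected_length: int, expected_sha256: str) -> bytes:
    """Return the payload unchanged, or raise when length or digest differ."""
    if len(payload) != expected_length:
        raise AssetIntegrityMismatch(
            f"byte length {len(payload)} != stored {expected_length}"
        )
    actual = hashlib.sha256(payload).hexdigest()
    if actual != expected_sha256.lower():
        raise AssetIntegrityMismatch("SHA-256 does not match stored digest")
    return payload
```

Checking the length first is a cheap early exit; the digest comparison is what actually detects same-length tampering.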
Startup repair
GET /v1/status includes storage.asset_repair, a bounded summary of asset startup repair work
captured when the daemon loaded the asset store. Repairs that restore derived files are surfaced as
health info; skipped metadata records caused by untrusted raw payloads are surfaced as health
warning.
On startup, the daemon:
- quarantines corrupt assets/meta/*.json files instead of aborting the whole store load
- preserves asset ID allocation across valid, skipped, and quarantined metadata files
- skips metadata whose raw payload is missing, no longer matches its stored byte length/SHA-256, has an id that disagrees with the metadata filename, or points at another asset’s raw URI
- backfills legacy derived text/preview checksums only after the derived bytes match the valid raw payload’s current derivation
- restores missing, tampered, malformed, or incoherent owned derived text/preview payloads from the valid raw payload before exposing the asset
- preserves verified attached text references that point at another loaded text/plain raw asset, and replaces other non-owned derived text URIs with the canonical derived text when one can be regenerated
- removes only daemon atomic-write temp files from asset directories, leaving real orphan payloads for future GC tooling
- validates durable tombstones before treating them as committed deletes, quarantines corrupt tombstone JSON, and completes partial delete crash windows by removing the tombstoned asset metadata plus daemon-owned raw/text/preview payloads in a single startup scan
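The skip rule in the list above can be illustrated as a small decision function. This is a sketch only: the metadata keys (asset_id, byte_length, sha256) are borrowed from the asset summary fields and may not match the on-disk record, the assumption that the metadata filename stem equals the asset id is unverified, and the check for a raw URI owned by another asset is omitted because the URI layout is not described here.

```python
import hashlib
from typing import Optional


def skip_reason(meta: dict, filename_stem: str, raw_payload: Optional[bytes]) -> Optional[str]:
    """Return why a metadata record would be skipped, or None to load it.

    raw_payload is None when the raw file is missing on disk.
    """
    if meta.get("asset_id") != filename_stem:
        return "id_filename_mismatch"
    if raw_payload is None:
        return "raw_missing"
    if len(raw_payload) != meta.get("byte_length"):
        return "byte_length_mismatch"
    if hashlib.sha256(raw_payload).hexdigest() != meta.get("sha256"):
        return "sha256_mismatch"
    return None
```

Records that return a reason would be skipped and counted in the repair summary rather than aborting the store load.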
Limits
Current limits enforced by the daemon:
- maximum raw upload size: 12 MiB
- maximum source image edge before decode: 16,384 pixels
- maximum source image area before decode: 50,000,000 pixels
- maximum source image decode allocation: 256 MiB
- maximum normalized image edge after decode: 2,048 pixels
- maximum normalized image size after re-encode: 4 MiB
- maximum PDF text extraction pages: 128
- maximum PDF object count before text extraction: 10,000
- maximum PDF stream count before text extraction: 2,048
- maximum individual PDF stream size before text extraction: 12 MiB
- maximum total PDF stream size before text extraction: 12 MiB
- maximum extracted PDF text size: 4 MiB
- maximum DXF preview primitives: 20,000
- maximum DXF preview line/polyline segments: 20,000
- maximum points retained for one DXF polyline preview: 20,000
- audio sample-rate metadata limit: 8,000-192,000 Hz when declared by the container
- audio channel metadata limit: 1-8 channels when declared by the container
- maximum declared/calculable audio duration: 30 minutes
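A client can pre-check an upload against some of these limits before sending it. The sketch below covers only the raw upload size and the audio metadata limits; the constant and function names are illustrative, None stands for a value the container does not declare, and treating every bound as inclusive is an assumption.

```python
from typing import List, Optional

MAX_RAW_UPLOAD_BYTES = 12 * 1024 * 1024  # 12 MiB
MIN_SAMPLE_RATE_HZ, MAX_SAMPLE_RATE_HZ = 8_000, 192_000
MIN_CHANNELS, MAX_CHANNELS = 1, 8
MAX_DURATION_SECONDS = 30 * 60  # 30 minutes


def audio_limit_violations(
    byte_length: int,
    sample_rate_hz: Optional[int] = None,
    channels: Optional[int] = None,
    duration_seconds: Optional[float] = None,
) -> List[str]:
    """List the limits an audio upload would exceed; empty means it fits."""
    violations = []
    if byte_length > MAX_RAW_UPLOAD_BYTES:
        violations.append("raw upload size")
    if sample_rate_hz is not None and not (
        MIN_SAMPLE_RATE_HZ <= sample_rate_hz <= MAX_SAMPLE_RATE_HZ
    ):
        violations.append("sample rate")
    if channels is not None and not (MIN_CHANNELS <= channels <= MAX_CHANNELS):
        violations.append("channel count")
    if duration_seconds is not None and duration_seconds > MAX_DURATION_SECONDS:
        violations.append("duration")
    return violations
```

The daemon remains the authority on these checks; a pre-check like this only avoids uploads that are certain to be rejected.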
