diff --git a/docs/RECEIPTS.md b/docs/RECEIPTS.md
new file mode 100644
index 000000000..91e031b17
--- /dev/null
+++ b/docs/RECEIPTS.md
@@ -0,0 +1,139 @@
+# Runtime Receipts
+
+This document sketches a future read-only receipt export for completed runtime
+turns. It is a protocol note, not an implemented endpoint.
+
+The goal is to let a local supervisor audit one completed turn without
+screen-scraping the terminal transcript. A receipt should summarize the durable
+runtime records that CodeWhale already owns: thread metadata, turn status, turn
+items, event sequence lineage, usage when available, approval decisions, and
+side-effect boundaries.
+
+## Non-Goals
+
+A receipt is not a safety certification, provider compatibility certification,
+or hosted attestation. It must not call providers, execute tools, write memory,
+write project files, mutate runtime state, or expose API keys.
+
+Receipts should not export raw chain-of-thought or private reasoning by default.
+When reasoning custody is represented, use stable item ids, counts, hashes, or
+explicit `unavailable` fields rather than raw hidden content.
+
+## Candidate Surfaces
+
+Potential local-only surfaces:
+
+```text
+codewhale receipt export --thread --turn --format json
+GET /v1/threads/{thread_id}/turns/{turn_id}/receipt
+```
+
+Both surfaces should share the existing runtime API auth boundary. They should
+only read persisted runtime records and append-only events.
+
+## Current Data Sources
+
+The current runtime store already persists the core inputs a receipt builder
+would need:
+
+- `ThreadRecord`: model, workspace, mode, shell/trust/auto-approve flags,
+  title, task linkage, and latest turn metadata.
+- `TurnRecord`: turn status, input summary, timestamps, duration, usage, error,
+  steer count, and item ids.
+- `TurnItemRecord`: item kind, lifecycle status, summary, optional detail,
+  metadata, artifact refs, and item timestamps.
+- `RuntimeEventRecord`: thread id, turn id, item id, event name, JSON payload,
+  timestamp, and monotonic `seq` values per runtime store.
+
+Not every receipt field can be filled from those records today. If a provider or
+store does not persist a value, the receipt should say `available: false` or
+`unavailable`, not infer it from UI text.
+
+## Draft Schema Shape
+
+```json
+{
+  "schema_id": "codewhale.conformance-receipt/v0",
+  "thread": {
+    "id": "thr_...",
+    "model": "deepseek-v4-pro",
+    "mode": "agent",
+    "auto_approve": false,
+    "trust_mode": false,
+    "allow_shell": false
+  },
+  "turn": {
+    "id": "turn_...",
+    "status": "completed",
+    "started_at": "2026-06-02T01:00:00Z",
+    "ended_at": "2026-06-02T01:00:12Z",
+    "duration_ms": 12000
+  },
+  "reasoning_custody": {
+    "raw_reasoning_exported": false,
+    "available": false,
+    "reason": "reasoning blocks are not persisted as receipt-ready records"
+  },
+  "tool_lineage": {
+    "tool_call_count": 1,
+    "tool_result_count": 1,
+    "unmatched_tool_call_ids": [],
+    "unmatched_tool_result_ids": []
+  },
+  "usage_evidence": {
+    "available": true,
+    "usage": {
+      "prompt_tokens": 123,
+      "completion_tokens": 45
+    },
+    "provider_cache_breakdown_available": false
+  },
+  "source_event_lineage": {
+    "first_seq": 10,
+    "last_seq": 42,
+    "event_count": 33,
+    "missing_event_ranges": []
+  },
+  "side_effect_boundary": {
+    "approval_required_count": 1,
+    "approval_allowed_count": 0,
+    "approval_denied_count": 1,
+    "command_execution_count": 0,
+    "file_change_count": 0,
+    "sandbox_denied_count": 0
+  },
+  "claim_ceiling": [
+    "local_receipt_only",
+    "not_safety_certification",
+    "not_provider_compatibility_certification"
+  ]
+}
+```
+
+## Builder Rules
+
+A receipt builder should be deterministic and conservative:
+
+1. Load the thread and turn by id, then reject mismatched `thread_id` values.
+2. Load only item ids referenced by the turn.
+3. Read event records for the thread and filter by `turn_id`.
+4. Preserve event sequence boundaries with `first_seq`, `last_seq`, and any
+   detected gaps.
+5. Count approval, command, file, sandbox, and tool events from typed records or
+   known event names only.
+6. Mark unavailable evidence explicitly instead of deriving it from free-form
+   summaries.
+7. Emit no raw tool output beyond existing item summaries unless a later schema
+   adds a separate redaction policy.
+
+## Incremental Implementation Path
+
+The safest implementation path is:
+
+1. Land this protocol note and settle field names/non-goals.
+2. Add protocol structs and JSON snapshot fixtures for completed, failed, and
+   approval-denied turns.
+3. Add a pure builder over `ThreadRecord`, `TurnRecord`, `TurnItemRecord`, and
+   `RuntimeEventRecord`.
+4. Expose the local runtime API endpoint.
+5. Add the CLI export command and optional validation mode.
diff --git a/docs/RUNTIME_API.md b/docs/RUNTIME_API.md
index 9ebfada8c..a3582c82c 100644
--- a/docs/RUNTIME_API.md
+++ b/docs/RUNTIME_API.md
@@ -22,6 +22,10 @@ macOS workbench (or any local supervisor)
 The engine runs as a local-only process. All APIs bind to `localhost` by
 default. No hosted relay, no provider-token custody, no secret leakage.
 
+For a proposed read-only audit export over completed turns, see
+[`docs/RECEIPTS.md`](RECEIPTS.md). That document is a protocol note; the receipt
+CLI/API surfaces are not implemented yet.
+
 ## ACP stdio adapter: `codewhale serve --acp`
 
 `codewhale serve --acp` speaks JSON-RPC 2.0 over newline-delimited stdio for
@@ -215,6 +219,9 @@ accept an empty string to clear a previously-set value. Added in v0.8.10 (#562):
 **Events** (SSE replay + live stream)
 - `GET /v1/threads/{id}/events?since_seq=`
 
+**Receipts** (future read-only audit export)
+- Proposed only: `GET /v1/threads/{thread_id}/turns/{turn_id}/receipt`
+
 **Compatibility stream** (one-shot, backwards-compatible)
 - `POST /v1/stream`
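
Review note: builder rules 3 to 5 in `docs/RECEIPTS.md` (filter events by `turn_id`, preserve `first_seq`/`last_seq` and gaps, count only known event names) could be prototyped as a pure function before any protocol structs land. The sketch below is illustrative only. `RuntimeEvent` is a hypothetical, trimmed-down stand-in for `RuntimeEventRecord`, the event names in `KNOWN_COUNTERS` are assumptions rather than names taken from the runtime, and gap detection assumes a turn's events occupy a contiguous `seq` range, which the note does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RuntimeEvent:
    # Hypothetical stand-in for RuntimeEventRecord; only the fields used here.
    seq: int
    thread_id: str
    turn_id: Optional[str]
    event: str


# Assumed event names mapped to receipt counters; real names may differ.
KNOWN_COUNTERS = {
    "approval.required": "approval_required_count",
    "approval.allowed": "approval_allowed_count",
    "approval.denied": "approval_denied_count",
}


def build_receipt_fragment(events, thread_id, turn_id):
    """Summarize seq lineage and approval counts for one turn (read-only)."""
    # Rule 3: keep only events for this thread and turn, ordered by seq.
    turn_events = sorted(
        (e for e in events if e.thread_id == thread_id and e.turn_id == turn_id),
        key=lambda e: e.seq,
    )
    seqs = [e.seq for e in turn_events]

    # Rule 4: report every jump larger than 1 as an inclusive missing range.
    # Assumption: a turn's events are contiguous in seq.
    missing = [
        [prev + 1, cur - 1]
        for prev, cur in zip(seqs, seqs[1:])
        if cur - prev > 1
    ]

    # Rule 5: count known event names only; unknown names are never guessed.
    counts = {field: 0 for field in KNOWN_COUNTERS.values()}
    for e in turn_events:
        field = KNOWN_COUNTERS.get(e.event)
        if field is not None:
            counts[field] += 1

    return {
        "source_event_lineage": {
            "first_seq": seqs[0] if seqs else None,
            "last_seq": seqs[-1] if seqs else None,
            "event_count": len(seqs),
            "missing_event_ranges": missing,
        },
        "side_effect_boundary": counts,
    }


events = [
    RuntimeEvent(9, "thr_a", "turn_0", "turn.completed"),  # other turn, filtered out
    RuntimeEvent(10, "thr_a", "turn_1", "turn.started"),
    RuntimeEvent(11, "thr_a", "turn_1", "approval.required"),
    RuntimeEvent(14, "thr_a", "turn_1", "approval.denied"),
]
fragment = build_receipt_fragment(events, "thr_a", "turn_1")
```

If `seq` values from other turns or threads can interleave within a turn's range, the gap check would need to compare against the full store sequence rather than the filtered list, so `missing_event_ranges` from this sketch should be read as an upper bound on what is actually missing.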