Proposal: read-only conformance receipt export for runtime thread/turn replay #2556

Hi CodeWhale maintainers,

I have been looking at the public CodeWhale runtime surface as a local supervisor integration point. The existing runtime model already has most of the ingredients that downstream tools would need for auditability: durable `ThreadRecord` / `TurnRecord` / `TurnItemRecord`, append-only events, usage aggregation, approval events, sandbox boundaries, and workspace rollback.

Would a small read-only receipt export fit the project roadmap?

The narrow goal would be to let local supervisors verify one completed turn without screen-scraping terminal output:

- reasoning-block custody without exporting raw reasoning by default
- tool-call to tool-result lineage
- cache/usage evidence when available, with explicit `unavailable` when provider-specific hit/miss is not stored
- event sequence lineage across replay/resume
- approval, sandbox, and YOLO/auto-approve state, plus command and file-change side-effect boundaries
- rollback/snapshot availability

One possible local-only surface:

```text
codewhale receipt export --thread <thread_id> --turn <turn_id> --format json
GET /v1/threads/{thread_id}/turns/{turn_id}/receipt
```

This should be read-only and reuse the existing local runtime auth boundary. It should not call providers, execute tools, write memory, write target files, or expose API keys. For privacy, raw reasoning should stay omitted by default; a receipt can use counts, hashes/refs, and item/event ids instead.
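To make the "counts and hashes instead of raw reasoning" idea concrete, here is a minimal Python sketch. The function name `reasoning_custody` and the assumption that reasoning blocks are available as plain strings are mine, not existing CodeWhale APIs; the output field names follow the example receipt in this proposal.

```python
import hashlib


def reasoning_custody(blocks: list[str]) -> dict:
    """Summarize reasoning blocks by count and digest; the raw text is never returned.

    Hypothetical helper: assumes each persisted reasoning block is a UTF-8 string.
    """
    hashes = [
        "sha256:" + hashlib.sha256(block.encode("utf-8")).hexdigest()
        for block in blocks
    ]
    return {
        "raw_reasoning_exported": False,
        "reasoning_blocks_observed": len(blocks) > 0,
        "reasoning_block_hashes": hashes,
    }


custody = reasoning_custody(["plan step one", "plan step two"])
print(len(custody["reasoning_block_hashes"]))
```

One design caveat: a plain digest of a short or guessable block can be confirmed by anyone who guesses the text, so a keyed hash or opaque item ids may be the safer default if that matters for the threat model.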

Example sketch:

```json
{
  "schema_id": "codewhale.conformance-receipt/v0",
  "thread": {
    "id": "thr_...",
    "model": "deepseek-v4-pro",
    "mode": "agent",
    "auto_approve": false
  },
  "turn": {
    "id": "turn_...",
    "status": "completed"
  },
  "reasoning_custody": {
    "raw_reasoning_exported": false,
    "reasoning_blocks_observed": true,
    "reasoning_block_hashes": ["sha256:..."]
  },
  "tool_lineage": {
    "tool_call_count": 1,
    "tool_result_count": 1,
    "unmatched_tool_call_ids": [],
    "unmatched_tool_result_ids": []
  },
  "cache_evidence": {
    "available": true,
    "cached_tokens": 0,
    "reasoning_tokens": 0,
    "prompt_cache_hit_tokens": null,
    "prompt_cache_miss_tokens": null
  },
  "source_event_lineage": {
    "first_seq": 1,
    "last_seq": 42,
    "event_count": 42,
    "missing_event_ranges": []
  },
  "side_effect_boundary": {
    "mode": "agent",
    "auto_approve": false,
    "approval_required_count": 1,
    "approval_allowed_count": 0,
    "approval_denied_count": 1,
    "command_execution_count": 0,
    "file_change_count": 0,
    "sandbox_denied_count": 0
  },
  "claim_ceiling": [
    "local_receipt_only",
    "not_safety_certification",
    "not_provider_compatibility_certification"
  ]
}
```
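A supervisor consuming this sketch could run a few internal consistency checks before trusting it. The Python below is only an illustration: `check_receipt` is a hypothetical name, and the rules it enforces (inclusive `[first, last]` pairs in `missing_event_ranges`, one-to-one call/result pairing, decisions never exceeding approval requests) are assumptions about semantics that the schema above does not yet define.

```python
def check_receipt(receipt: dict) -> list[str]:
    """Return a list of human-readable inconsistencies; empty means none found."""
    problems: list[str] = []

    # Assumption: each missing range is an inclusive [first, last] pair of seq numbers.
    lineage = receipt["source_event_lineage"]
    missing = sum(last - first + 1 for first, last in lineage["missing_event_ranges"])
    expected = lineage["last_seq"] - lineage["first_seq"] + 1 - missing
    if lineage["event_count"] != expected:
        problems.append(f"event_count is {lineage['event_count']}, expected {expected}")

    # Assumption: calls and results pair one-to-one, so matched counts must agree.
    tools = receipt["tool_lineage"]
    matched_calls = tools["tool_call_count"] - len(tools["unmatched_tool_call_ids"])
    matched_results = tools["tool_result_count"] - len(tools["unmatched_tool_result_ids"])
    if matched_calls != matched_results:
        problems.append("matched tool calls and results disagree")

    # Assumption: every allow/deny decision answers a counted approval request.
    boundary = receipt["side_effect_boundary"]
    decided = boundary["approval_allowed_count"] + boundary["approval_denied_count"]
    if decided > boundary["approval_required_count"]:
        problems.append("more approval decisions than approval requests")

    # The thread-level flag and the boundary section should not contradict each other.
    if receipt["thread"]["auto_approve"] != boundary["auto_approve"]:
        problems.append("auto_approve differs between thread and side_effect_boundary")

    return problems
```

With the values from the example sketch, every check passes and the function returns an empty list; changing `event_count` to 40 would produce a single reported problem.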

A low-risk implementation path could be docs-first:

1. Add a short receipt-export section or `docs/RECEIPTS.md` with the intended schema and boundaries.
2. Add protocol structs and snapshot tests.
3. Add a pure receipt builder over persisted thread/turn/item/event records.
4. Expose the local runtime API endpoint.
5. Add the CLI export/validate command.
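As a rough illustration of step 3, a pure builder could be a function from already-persisted events to receipt sections, with no I/O at all. The Python sketch below uses a hypothetical `Event` record with `seq`, `kind`, and `call_id` fields and hypothetical kind names `tool_call` / `tool_result`; the real runtime records and event names may look quite different.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a persisted runtime event record."""
    seq: int
    kind: str                      # assumed values: "tool_call", "tool_result", ...
    call_id: Optional[str] = None  # assumed to link a result back to its call


def build_lineage(events: list[Event]) -> dict:
    """Derive tool and event lineage from records only; no providers, tools, or writes."""
    seqs = sorted(e.seq for e in events)

    # Record each gap between consecutive sequence numbers as an inclusive range.
    missing = [[prev + 1, cur - 1] for prev, cur in zip(seqs, seqs[1:]) if cur - prev > 1]

    call_ids = [e.call_id for e in events if e.kind == "tool_call"]
    result_ids = [e.call_id for e in events if e.kind == "tool_result"]

    return {
        "tool_lineage": {
            "tool_call_count": len(call_ids),
            "tool_result_count": len(result_ids),
            "unmatched_tool_call_ids": sorted(set(call_ids) - set(result_ids)),
            "unmatched_tool_result_ids": sorted(set(result_ids) - set(call_ids)),
        },
        "source_event_lineage": {
            "first_seq": seqs[0] if seqs else None,
            "last_seq": seqs[-1] if seqs else None,
            "event_count": len(seqs),
            "missing_event_ranges": missing,
        },
    }


events = [
    Event(1, "tool_call", "c1"),
    Event(2, "tool_result", "c1"),
    Event(5, "tool_call", "c2"),  # no result recorded, and seq 3-4 are absent
]
lineage = build_lineage(events)
```

Because the builder takes records in and returns plain data, it can be covered by the snapshot tests from step 2 before any endpoint or CLI code exists.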

Open questions:

- Are reasoning blocks persisted as item metadata, or only streamed to the UI?
- Is per-turn DeepSeek cache hit/miss metadata available, or only aggregate cached/reasoning tokens?
- Is side-git snapshot metadata addressable per turn from the runtime store?
- Would this be better as docs/protocol-fixture first, rather than starting with runtime endpoint code?

This is not a compatibility certification, safety claim, or endorsement request. It is a proposal for a small local audit/export surface that seems aligned with the runtime API contract and CodeWhale's local-first security boundary.
