Summary
Add an optional reporting / history capability to diffwarden: persist each review's findings to disk so they can be aggregated and analyzed over time, rather than being printed once and lost.
Today diffwarden resolves a target, runs reviewers, and renders Markdown/JSON to stdout. There is no durable record of what was found, when, against which diff, by which reviewer. A reporting layer would let users answer questions like "what kinds of issues do my reviewers catch most often?", "how often do findings get introduced?", and "is reviewer X pulling its weight?".
Goals
- Persist a structured record per review run (findings, reviewer(s)/model/effort, target, timestamp, repo, commit/branch context).
- Make it easy to aggregate runs later for analysis.
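As a starting point for discussion, here is a minimal sketch of what a per-run record might look like. Every field name, the `Finding` shape, and `SCHEMA_VERSION` are assumptions made for illustration; none of this reflects an existing diffwarden type.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone

SCHEMA_VERSION = 1  # hypothetical; lets readers detect format changes later


@dataclass
class Finding:
    # Hypothetical finding shape; the real fields are still to be decided.
    reviewer: str
    severity: str
    category: str
    message: str


@dataclass
class ReviewRecord:
    target: str
    repo: str
    commit: str
    branch: str
    reviewers: list[str]
    model: str
    effort: str
    findings: list[Finding] = field(default_factory=list)
    # UTC timestamp in ISO 8601 form, filled in when the record is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    schema_version: int = SCHEMA_VERSION

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so findings serialize too.
        return json.dumps(asdict(self), sort_keys=True)
```

Carrying a schema version in every record is one way to keep old history readable if the format changes.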
Design considerations
Opt-in, not on by default
This is fundamentally a logging feature, and some users will not want it: review findings can echo source/diff content, and they may not want that written to disk at all. Reporting should be explicitly opt-in (flag and/or config), off by default, and easy to disable entirely.
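One possible way to express "off unless explicitly enabled" is sketched below. The `reporting.enabled` config key and a `--no-report` flag are hypothetical names used only for illustration; `--report` is one of the candidate flags listed under the open questions.

```python
from typing import Optional


def reporting_enabled(cli_flag: Optional[bool], config: dict) -> bool:
    """Return True only when reporting was explicitly turned on.

    cli_flag: True for --report, False for a hypothetical --no-report,
              None when neither was passed.
    config:   parsed config, with a hypothetical `reporting.enabled` key.
    """
    if cli_flag is not None:
        return cli_flag  # an explicit flag always wins, including "off"
    section = config.get("reporting")
    if not isinstance(section, dict):
        return False  # no reporting block (or a malformed one) means off
    # Only a literal True enables reporting; anything else stays off.
    return section.get("enabled") is True
```

Treating anything other than a literal `True` as "off" means a typo in the config cannot silently start writing findings to disk.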
Storage location: global default, repo-local option
- Default to a global store (e.g. ~/.diffwarden/reports/ or an XDG-style path) so history spans all repos a user works in.
- Allow a repo-local store as well (e.g. .diffwarden/reports/ in the repo) for users who want history committed/colocated with the project. This should be configurable per-repo.
- Define the on-disk format (likely append-only JSONL or one JSON file per run) so external scripts can consume it without parsing diffwarden internals.
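The storage options above could be sketched roughly as follows, assuming the append-only JSONL variant. The `runs.jsonl` file name, the `scope` values, and the use of `XDG_DATA_HOME` for the XDG-style path are all assumptions, not decisions.

```python
import json
import os
from pathlib import Path
from typing import Optional


def resolve_report_dir(scope: str, repo_root: Optional[Path] = None) -> Path:
    """Pick the report directory for a scope of "global" or "repo"."""
    if scope == "repo":
        if repo_root is None:
            raise ValueError("repo scope needs a repo root")
        return repo_root / ".diffwarden" / "reports"
    if scope == "global":
        # Assumption: prefer XDG_DATA_HOME when set, else ~/.diffwarden/reports.
        xdg = os.environ.get("XDG_DATA_HOME")
        if xdg:
            return Path(xdg) / "diffwarden" / "reports"
        return Path.home() / ".diffwarden" / "reports"
    raise ValueError(f"unknown scope: {scope!r}")


def append_record(report_dir: Path, record: dict) -> Path:
    """Append one run as a single JSON line to runs.jsonl (hypothetical name)."""
    report_dir.mkdir(parents=True, exist_ok=True)
    path = report_dir / "runs.jsonl"
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return path
```

With one JSON object per line, an external script can read the history using nothing more than a line loop and `json.loads`. Per-run files would trade that simplicity for easier pruning of individual runs.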
Session pairing (stretch / likely out of scope for v1)
The most interesting analysis would pair a report to the coding session that triggered the review (e.g. attribute findings to the agent run / change that produced them). That's likely too much to build into the CLI directly, since the CLI doesn't own session context.
Proposal: have the report capture enough raw context (timestamp, target, commit/branch, cwd, diff identity) that an external analysis script can reconstruct session association after the fact, rather than trying to thread live session info through the CLI.
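To make the proposal concrete, here is a rough sketch of capturing that raw context. It assumes "diff identity" can be a SHA-256 hash of the diff text and that the caller already knows the commit and branch; both function names and the `diff_sha256` key are hypothetical.

```python
import hashlib
import os
from datetime import datetime, timezone


def diff_identity(diff_text: str) -> str:
    """Stable fingerprint of a diff: SHA-256 over its UTF-8 bytes."""
    return hashlib.sha256(diff_text.encode("utf-8")).hexdigest()


def capture_context(target: str, commit: str, branch: str, diff_text: str) -> dict:
    """Collect the raw fields an external script could later join on."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "target": target,
        "commit": commit,
        "branch": branch,
        "cwd": os.getcwd(),
        "diff_sha256": diff_identity(diff_text),
    }
```

An external script could then match a report to a session by comparing the timestamp, working directory, and diff hash against whatever the session tooling logged, without the CLI knowing anything about sessions.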
Open questions
- Config surface: a reporting block in config vs. CLI flags (--report, --report-dir, --report-scope global|repo)?
- File format: JSONL append vs. per-run files? What's the stable schema (versioned)?
- Redaction: do we store finding text/snippets verbatim, or offer a metadata-only mode for privacy-sensitive users?
- Retention: any pruning/rotation, or leave that to the user/external tooling?
- Is there a diffwarden report subcommand for basic aggregation, or do we ship only the raw store + document the external-script path?
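On the redaction question, a metadata-only mode might amount to dropping free-text fields from each finding before the record is written. The sketch below assumes hypothetical field names (`message`, `snippet`, `suggestion`) and a hypothetical `redacted` marker; the real finding schema is still undecided.

```python
# Hypothetical names for fields that may echo source/diff content.
TEXT_FIELDS = {"message", "snippet", "suggestion"}


def redact_record(record: dict) -> dict:
    """Return a copy of the record with free-text finding fields removed."""
    redacted = dict(record)
    redacted["findings"] = [
        {k: v for k, v in finding.items() if k not in TEXT_FIELDS}
        for finding in record.get("findings", [])
    ]
    # Mark the record so later analysis can tell text was stripped on purpose.
    redacted["redacted"] = True
    return redacted
```

Counts by reviewer, severity, or category would still work on redacted records, which may be enough for questions like whether a reviewer is pulling its weight.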
Out of scope (for this issue)
- The analysis/aggregation tooling itself (separate effort/script).
- Live coding-session integration.