@@ -20,19 +20,23 @@ sample-runs/
   2026-03-15T10-00-00-000Z/index.jsonl
 ```
 
-These are canonical run directories with `index.jsonl`, so you can point `agentv trend` at them directly.
+These are canonical run directories with `index.jsonl`.
 
 ## End-User Flow
 
-From this directory, run:
+Most real users will run `trend` against their latest eval history with `--last`.
+
+To reproduce that flow from this example directory, first copy the sample runs into the normal runtime layout:
 
 ```bash
-bun ../../../apps/cli/src/cli.ts trend \
-  sample-runs/2026-03-01T10-00-00-000Z \
-  sample-runs/2026-03-08T10-00-00-000Z \
-  sample-runs/2026-03-15T10-00-00-000Z \
-  --dataset code-review \
-  --target claude-sonnet
+mkdir -p .agentv/results/runs
+cp -R sample-runs/* .agentv/results/runs/
+```
+
+Then run:
+
+```bash
+bun ../../../apps/cli/src/cli.ts trend --last 3 --dataset code-review --target claude-sonnet
 ```
 
 Expected output:
@@ -56,9 +60,22 @@ Regression Gate: threshold=0.010 fail_on_degrading=false triggered=false
 
 Interpretation:
 
-- The command uses the matched intersection of test IDs across all runs.
-- Mean score declines each run, so the slope is negative.
-- The verdict is `degrading`.
+- The command auto-discovers the most recent three runs.
+- It filters to `dataset=code-review` and `target=claude-sonnet`.
+- It intersects matched test IDs across runs and detects a steady downward score trend.
+
+## Explicit Inputs
+
+If you want to see the same analysis without copying files into `.agentv/results/runs/`, point `trend` at the sample runs directly:
+
+```bash
+bun ../../../apps/cli/src/cli.ts trend \
+  sample-runs/2026-03-01T10-00-00-000Z \
+  sample-runs/2026-03-08T10-00-00-000Z \
+  sample-runs/2026-03-15T10-00-00-000Z \
+  --dataset code-review \
+  --target claude-sonnet
+```
 
 ## CI Gate Example
 
@@ -76,18 +93,3 @@ bun ../../../apps/cli/src/cli.ts trend \
 ```
 
 This exits with code `1` because the degrading slope magnitude exceeds `0.01`.
-
-## `--last` Workflow
-
-If you want to test the exact runtime layout used by `agentv eval`, copy the sample runs into `.agentv/results/runs/` first:
-
-```bash
-mkdir -p .agentv/results/runs
-cp -R sample-runs/* .agentv/results/runs/
-```
-
-Then run:
-
-```bash
-bun ../../../apps/cli/src/cli.ts trend --last 3 --dataset code-review --target claude-sonnet
-```
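
The interpretation bullets in this diff describe the analysis only in words: intersect the test IDs shared by every run, take the mean score per run, and read the trend from the slope. A minimal sketch of that idea follows. It is an illustration, not the `agentv` implementation. The function name `matched_trend`, the `{test_id: score}` input shape, the least-squares fit over run index, and the `improving` and `stable` labels are all assumptions; only the `degrading` verdict and the `0.01` threshold appear in the README. The scores are invented and are not the contents of `sample-runs/`.

```python
from statistics import mean


def matched_trend(runs, threshold=0.01):
    """Summarise the score trend across runs, oldest first.

    `runs` is a list of {test_id: score} dicts. This shape is a
    hypothetical stand-in for whatever index.jsonl actually stores.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to fit a slope")

    # Keep only test IDs present in every run so the means are comparable.
    matched = set(runs[0])
    for run in runs[1:]:
        matched &= set(run)
    if not matched:
        raise ValueError("no test IDs are shared by all runs")

    means = [mean(run[test_id] for test_id in matched) for run in runs]

    # Least-squares slope of mean score against run index 0, 1, 2, ...
    # (assumed method; the README only says the slope is negative).
    n = len(means)
    x_bar = (n - 1) / 2
    y_bar = mean(means)
    denom = sum((x - x_bar) ** 2 for x in range(n))
    slope = sum((x - x_bar) * (y - y_bar) for x, y in enumerate(means)) / denom

    # "degrading" comes from the README; the other labels are assumed.
    if slope < -threshold:
        verdict = "degrading"
    elif slope > threshold:
        verdict = "improving"
    else:
        verdict = "stable"
    return means, slope, verdict


# Invented scores for illustration only.
example_runs = [
    {"a": 0.9, "b": 0.8, "c": 0.7},  # "c" is dropped: not in every run
    {"a": 0.8, "b": 0.7},
    {"a": 0.7, "b": 0.6, "d": 1.0},  # "d" is dropped for the same reason
]

means, slope, verdict = matched_trend(example_runs)
print(means, slope, verdict)
```

With these made-up inputs only tests `a` and `b` are matched, the per-run mean drops by roughly 0.1 each run, and the verdict is `degrading`. Restricting to the intersection keeps a run that merely added or removed tests from looking like a score change.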