
Commit 0a1ae10

docs: document test pyramid for ways matching system
- tests/README.md: Full guide covering all three test layers (fixture, integration, activation), baselines, /test-way skill, and when-to-run
- docs/hooks-and-ways.md: Add Testing section with summary table
- tools/way-match/test-integration.sh: Fix stale match: field check (use description+vocabulary presence instead), fix default threshold from NCD 0.58 to BM25 2.0
1 parent 40317be commit 0a1ae10

3 files changed

Lines changed: 81 additions & 13 deletions


docs/hooks-and-ways.md

Lines changed: 16 additions & 0 deletions
@@ -409,3 +409,19 @@ flowchart TD
 ```
 
 Checked by `show-way.sh` before outputting any way.
+
+## Testing
+
+Three test layers verify the matching and injection pipeline. See [tests/README.md](../tests/README.md) for full details.
+
+| Layer | Command | What it tests |
+|-------|---------|---------------|
+| **Fixture** | `bash tools/way-match/test-harness.sh` | BM25 vs NCD scorer accuracy (32 prompts, fixed corpus) |
+| **Integration** | `bash tools/way-match/test-integration.sh` | Real way files, frontmatter extraction, multi-way discrimination |
+| **Activation** | `read and run the activation test at tests/way-activation-test.md` | Live hook pipeline: regex, BM25, negative control, subagent injection |
+
+The `/test-way` skill provides ad-hoc scoring for vocabulary tuning:
+
+```
+/test-way "write some unit tests for this module"
+```

tests/README.md

Lines changed: 61 additions & 7 deletions
@@ -1,17 +1,71 @@
-# Integration Tests
+# Testing the Ways System
 
-## Way Activation Test
+Three test layers, from fast/automated to slow/interactive.
 
-Tests that contextual hooks fire correctly for parent agents and subagents.
+## 1. Fixture Tests (BM25 vs NCD scorer comparison)
+
+Runs 32 test prompts against a fixed 7-way corpus. Compares BM25 binary against gzip NCD baseline. Reports TP/FP/TN/FN for each scorer.
+
+```bash
+bash tools/way-match/test-harness.sh
+```
+
+Options: `--bm25-only`, `--ncd-only`, `--verbose`
+
+**What it covers**: Scorer accuracy, false positive rate, head-to-head comparison. Tests direct vocabulary matches, synonym/paraphrase variants, and negative controls.
+
+**Current baseline**: BM25 26/32, NCD 24/32, 0 FP for both.
+
+## 2. Integration Tests (real way files)
+
+Scores 31 test prompts against actual `way.md` files extracted from the live ways directory. Tests the real frontmatter extraction pipeline.
+
+```bash
+bash tools/way-match/test-integration.sh
+```
+
+**What it covers**: End-to-end scoring with real way vocabulary, multi-way discrimination (does the right way win?), threshold behavior with actual threshold values.
+
+**Current baseline**: BM25 27/31 (0 FP), NCD 15/31 (3 FP).
+
+## 3. Activation Test (live agent + subagent)
+
+Interactive test protocol that verifies the full hook pipeline in a running Claude Code session. Tests regex matching, BM25 semantic matching, negative controls, and subagent injection.
 
 **To run**: Start a fresh session from `~/.claude/` and type:
 
 ```
 read and run the activation test at tests/way-activation-test.md
 ```
 
-Claude reads the test protocol, then walks you through 7 steps:
-- Steps 1, 5, 6, 7: Claude-only (just watch)
-- Steps 2, 3, 4: You type specific phrases when prompted
+Claude reads the test file (avoiding prompt-hook contamination), then walks you through 7 steps:
+
+| Step | Who | Tests |
+|------|-----|-------|
+| 1 | Claude | Session baseline (no premature domain activation) |
+| 2 | User types prompt | Regex pattern matching (commits way) |
+| 3 | User types prompt | BM25 semantic matching (security way) |
+| 4 | User types prompt | Negative control (no false positives) |
+| 5 | Claude | Subagent injection (Testing Way via SubagentStart) |
+| 6 | Claude | Subagent negative (no injection on irrelevant prompt) |
+| 7 | Claude | Summary table |
+
+Takes about 3 minutes. **Current baseline**: 6/6 PASS.
+
+## Ad-Hoc Vocabulary Testing
+
+The `/test-way` skill scores a prompt against all semantic ways and reports BM25 scores. Use it during vocabulary tuning to check discrimination between ways.
+
+```
+/test-way "write some unit tests for this module"
+```
+
+## When to Run Which
 
-Takes about 3 minutes.
+| Scenario | Test |
+|----------|------|
+| Changed `way-match.c` or rebuilt binary | Fixture tests + integration tests |
+| Changed a way's vocabulary or threshold | Integration tests + `/test-way` |
+| Changed hook scripts (check-*.sh, inject-*.sh, match-way.sh) | Activation test |
+| Added a new way | Integration tests + `/test-way` + activation test |
+| Sanity check after merge | All three |
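
The "When to Run Which" table is effectively a lookup from the kind of file changed to the test layers worth running. A minimal sketch of that lookup follows. The `suggest_tests` function and the example path are hypothetical and not part of this commit. Only the file-name patterns and layer names come from the table.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): map a changed file to the
# test layers recommended by the "When to Run Which" table.
suggest_tests() {
  local file="${1##*/}"   # strip leading directories, keep the file name
  case "$file" in
    way-match.c)
      echo "fixture integration" ;;
    way.md)
      # The path alone cannot distinguish an edited way from a new one;
      # per the table, a new way also calls for the activation test.
      echo "integration test-way" ;;
    check-*.sh|inject-*.sh|match-way.sh)
      echo "activation" ;;
    *)
      echo "none" ;;
  esac
}

# Example call; the directory is an assumption for illustration.
suggest_tests "tools/way-match/way-match.c"   # prints: fixture integration
```

The activation test is interactive, so a helper like this can only remind you to run it. The fixture and integration layers could be invoked directly with the commands given in the README.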

tools/way-match/test-integration.sh

Lines changed: 4 additions & 6 deletions
@@ -33,9 +33,6 @@ echo "Scanning for semantic ways..."
 echo ""
 
 while IFS= read -r wayfile; do
-match_mode=$(sed -n 's/^match: *//p' "$wayfile")
-[[ "$match_mode" != "semantic" ]] && continue
-
 # Derive way ID from path
 rel=$(echo "$wayfile" | sed "s|$WAYS_DIR/||;s|/way\.md$||")
 way_id=$(echo "$rel" | tr '/' '-')
@@ -44,14 +41,15 @@ while IFS= read -r wayfile; do
 vocab=$(sed -n 's/^vocabulary: *//p' "$wayfile")
 thresh=$(sed -n 's/^threshold: *//p' "$wayfile")
 
-[[ -z "$desc" ]] && continue
+# Skip ways without semantic matching fields
+[[ -z "$desc" || -z "$vocab" ]] && continue
 
 WAY_DESC[$way_id]="$desc"
 WAY_VOCAB[$way_id]="$vocab"
-WAY_THRESH[$way_id]="${thresh:-0.58}"
+WAY_THRESH[$way_id]="${thresh:-2.0}"
 WAY_PATH[$way_id]="$wayfile"
 
-printf " %-30s thresh=%-5s %s\n" "$way_id" "${thresh:-0.58}" "$(echo "$desc" | cut -c1-60)"
+printf " %-30s thresh=%-5s %s\n" "$way_id" "${thresh:-2.0}" "$(echo "$desc" | cut -c1-60)"
 done < <(find "$WAYS_DIR" -name "way.md" -type f | sort)
 
 echo ""
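
The two behavioral changes in this file, skipping ways that lack semantic fields and defaulting the threshold to 2.0, can be tried in isolation. The sketch below wraps the same `sed` extractions in a hypothetical `way_threshold` function. The `description:` field name is an assumption drawn from the commit message, because the line that assigns `desc` falls outside the hunks shown.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the commit) around the extraction logic.
# Prints the effective threshold for a way file, or returns 1 if the way
# lacks the fields needed for semantic matching.
way_threshold() {
  local wayfile="$1" desc vocab thresh
  desc=$(sed -n 's/^description: *//p' "$wayfile")   # field name assumed
  vocab=$(sed -n 's/^vocabulary: *//p' "$wayfile")
  thresh=$(sed -n 's/^threshold: *//p' "$wayfile")

  # Skip ways without semantic matching fields
  [[ -z "$desc" || -z "$vocab" ]] && return 1

  # Fall back to the BM25 default when no threshold is set
  printf '%s\n' "${thresh:-2.0}"
}
```

A way file with `description:` and `vocabulary:` lines but no `threshold:` line yields `2.0`, and a file missing either of the first two is skipped. This matches the new check replacing the stale `match:` test.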
