Expand matcher regression fixtures by koinsaari · Pull Request #95 · InWheelOrg/inwheel-api

koinsaari · 2026-06-09T13:04:27Z

Summary

Grows internal/identity/testdata/match_fixtures.json from 8 to 18 entries, covering distance boundaries, name variations, genuine ambiguity, category distractor, and address mismatch.
fakeRepo now applies the category compat filter, so fixtures can include incompatible distractors and verify blocking excludes them.
Drops confidence-band assertions (MinConfidence / MaxConfidence). Specific scores shift on every weight tweak; the regression suite asserts Kind + MatchedPlaceID only.
New internal/identity/testdata/README.md explains the entry format, assertion policy, what is covered, and how to add a new entry.

Closes #73. Per-source realism fixtures are tracked separately under #94.

Test plan

go test ./... — 264 passed across 17 packages
go vet ./... — clean
go test ./internal/identity/ -run TestMatch_Fixtures -v — all 18 fixtures pass

🤖 Generated with Claude Code

Grow the source-agnostic regression set from 8 to 18 entries and document it. Drop confidence-band assertions: specific scores shift on any weight tweak and are noise for a regression suite. - fakeRepo now applies the compat filter so fixtures can include category-incompatible distractors and verify blocking excludes them. - Added coverage for distance boundary (5/35/49/51 m), name variations (business suffix, word reorder, partial overlap), genuine ambiguity, category distractor, and address mismatch. - New testdata/README.md explains the entry format, assertion policy, and how to add a new entry. Closes #73. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

claude · 2026-06-09T13:04:43Z

Claude finished @koinsaari's task in 4m 28s —— View job

PR #95 Review: Expand matcher regression fixtures

Read REVIEW.md and CLAUDE.md for instructions
Gather context (diff, changed files, scorer/normalizer source)
Verify fixture math against scoring constants
Post findings
Commit status — skipped (403: token lacks statuses:write scope)

No blocking issues.

What was checked

fakeRepo category filter (matcher_test.go:30–45): The old fakeRepo returned f.candidates unfiltered, meaning the old "category incompatible" fixture had to hand-supply an empty candidate slice. The new implementation builds the allowed set from the cats argument (which Match populates via Compatible(r.Category)) and filters before returning — mirroring exactly what the real FindCandidates DB call does. All six pre-existing unit tests continue to pass this change because they all use compatible category pairs. The TestMatch_PropagatesRepoError early-return is also correct.

Confidence-band removal (matcher_test.go:42–51): Dropping MinConfidence/MaxConfidence from the fixture struct is intentional and sound. The unit tests (TestMatch_LowConfidenceAttachInMiddleBand, TestMatch_ConfidentAttachOnHighScore) still assert concrete confidence ranges, so regressions in score magnitude will still be caught; the fixture suite focuses on the decision band. The README explains this clearly.

Fixture math verification (score.go constants: RadiusM=50, ConfidentThreshold=0.80, LowConfidenceThreshold=0.55, weights 0.5/0.4/0.1; redistribution gives wDist≈0.5556, wName≈0.4444):

Fixture	Offset	distScore	nameScore (Jaccard)	combined	Expected ✓
5 m boundary	−0.000045°→5.0 m	0.90	1.0	0.944	confident ✓
35 m boundary	−0.000315°→35.0 m	0.30	1.0	0.611	low_confidence ✓
49 m boundary	−0.000441°→49.0 m	0.021	1.0	0.456	no_match ✓
51 m boundary	−0.000459°→51.0 m	0 (clamped)	1.0	0.444	no_match ✓
partial token overlap	coincident	1.0	1/3≈0.333	0.704	low_confidence ✓
ambiguity p1 ("Pascal")	−0.00018°→20 m	0.60	1.0	0.778	low_confidence ✓
ambiguity p2 ("Pascal Cafe")	−0.000045°→5 m	0.90	0.5	0.722	(loses to p1) ✓
category distractor p2	−0.00009°→10 m	0.80	1.0	0.889	confident ✓
address mismatch	0 m	1.0	1.0, addr=0 (present)	0.90	confident ✓

All 18 fixtures are mathematically consistent with the scorer.

Normalization correctness: businessSuffixes contains {inc, ltd, llc, co, corp, gmbh, oy, ab}. "cafe", "bistro", and "pharmacy" are NOT in the list — so "Café Pascal Bistro" normalizes to {cafe, pascal, bistro} giving Jaccard 1/3 vs {pascal}, and "Pascal Cafe" normalizes to {pascal, cafe} giving Jaccard 1/2 vs {pascal}. Both correct for their fixture expectations.

README (testdata/README.md): Content is accurate and actionable. The guidance to compute expected outcomes by hand from score.go constants, use the same coordinate neighbourhood, and run go test ./internal/identity/ -run TestMatch_Fixtures -v is the right workflow to document.

🟡 Nit

The "ambiguity" fixture name (match_fixtures.json:126) says "stronger name signal beats slightly-closer noisier candidate" but doesn't hint that both candidates land in low_confidence (neither reaches 0.80). A reader must do the math to see why the overall Kind is low_confidence and not confident. Consider appending something like "→ both low_confidence, p1 wins argmax" to the name, or adding a note to the README coverage list.

koinsaari merged commit ef60b78 into main Jun 9, 2026
9 checks passed

koinsaari deleted the tests/expand-matcher-regression-fixtures branch June 9, 2026 13:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Expand matcher regression fixtures#95

Expand matcher regression fixtures#95
koinsaari merged 1 commit into
mainfrom
tests/expand-matcher-regression-fixtures

koinsaari commented Jun 9, 2026

Uh oh!

claude Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

koinsaari commented Jun 9, 2026

Summary

Test plan

Uh oh!

claude Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR #95 Review: Expand matcher regression fixtures

What was checked

🟡 Nit

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude Bot commented Jun 9, 2026 •

edited

Loading