test+refactor: adapter contract testkit, registry allowlist, test coverage (#8, #18, #19, #20) #30
Conversation
…yFl0wer#19) The ABC default returning `[]` was added in 3754e8a as part of Cat 9 scaffolding, but the upstream issue tracks two follow-throughs that were still missing:
- Module docstring + class docstring in `base.py` claimed "Two optional" methods; promote to "Three optional" so readers see `get_harness_manifest` as part of the contract.
- Spec § Adapter Interface (sme_spec_v8.md) listed only `get_flat_retrieval` and `get_ontology_source` as optional. Add `get_harness_manifest` alongside them with its default-list rationale.

Adds `tests/test_adapter_harness_manifest_contract.py`: every shipped adapter must resolve `get_harness_manifest` (either inherited or overridden) and the ABC default must return `[]`. Class-level checks avoid instantiating adapters with heavy constructors. Closes the silent-AttributeError class entirely for any future adapter that forgets to override.
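A minimal sketch of the class-level check this commit describes. The `SMEAdapter` below is a stand-in for the real ABC in `base.py`, and `HeavyAdapter` / `ManifestAdapter` are hypothetical adapters invented for illustration; the actual test file may be structured differently.

```python
from abc import ABC, abstractmethod


class SMEAdapter(ABC):
    """Stand-in for the real ABC; only the parts relevant to this check."""

    @abstractmethod
    def query(self, text: str):
        ...

    def get_harness_manifest(self) -> list:
        # Optional method: the default is an empty list, so callers never
        # hit AttributeError on adapters that do not override it.
        return []


class HeavyAdapter(SMEAdapter):
    """Hypothetical adapter with an expensive constructor; inherits the default."""

    def __init__(self):
        raise RuntimeError("constructor must not run in class-level checks")

    def query(self, text: str):
        return None


class ManifestAdapter(SMEAdapter):
    """Hypothetical adapter that overrides the optional method."""

    def query(self, text: str):
        return None

    def get_harness_manifest(self) -> list:
        return [{"name": "example-harness"}]


def resolves_harness_manifest(adapter_cls: type) -> bool:
    # Look the attribute up on the class, so no instance is ever constructed.
    return callable(getattr(adapter_cls, "get_harness_manifest", None))


def abc_default_manifest() -> list:
    # Call the ABC's plain function directly; the default body never touches
    # `self`, so a placeholder object is enough.
    return SMEAdapter.get_harness_manifest(object())
```

Because the lookup happens on the class, `HeavyAdapter` passes the check even though its constructor would raise.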
…Fl0wer#8) Parametric test module verifying adapters conform to the SMEAdapter ABC: query returns QueryResult, graph snapshot is internally consistent, ingest accepts corpus shape. Covers FlatBaselineAdapter and a MockAdapter; other adapters opt in by registering a factory.
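The opt-in factory pattern could look roughly like the sketch below. Everything here is illustrative: `QueryResult` is a stand-in type, the method name `ingest_corpus`, the `(nodes, edges)` snapshot shape, and the `register_factory` helper are assumptions, not the repo's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class QueryResult:
    """Stand-in result type; the real QueryResult has its own fields."""
    answer: str
    context: List[str] = field(default_factory=list)


class MockAdapter:
    """Hypothetical in-memory adapter used only for this sketch."""

    def __init__(self) -> None:
        self.docs: List[dict] = []

    def ingest_corpus(self, corpus: List[dict]) -> None:
        self.docs.extend(corpus)

    def query(self, text: str) -> QueryResult:
        hits = [d["text"] for d in self.docs if text in d["text"]]
        return QueryResult(answer=hits[0] if hits else "", context=hits)

    def get_graph_snapshot(self):
        # Assumed shape: a list of nodes and a list of edges between them.
        nodes = [{"id": d["id"]} for d in self.docs]
        edges = [{"source": a["id"], "target": b["id"]}
                 for a, b in zip(self.docs, self.docs[1:])]
        return nodes, edges


# Adapters opt in to the contract checks by registering a zero-arg factory.
ADAPTER_FACTORIES: Dict[str, Callable[[], object]] = {}


def register_factory(name: str, factory: Callable[[], object]) -> None:
    ADAPTER_FACTORIES[name] = factory


register_factory("mock", MockAdapter)


def check_contract(factory: Callable[[], object]) -> List[str]:
    """Return a list of contract violations; an empty list means conforming."""
    violations: List[str] = []
    adapter = factory()
    adapter.ingest_corpus([{"id": "a", "text": "alpha"},
                           {"id": "b", "text": "beta"}])

    if not isinstance(adapter.query("alpha"), QueryResult):
        violations.append("query() must return QueryResult")

    # Internal consistency: every edge endpoint must be a known node id.
    nodes, edges = adapter.get_graph_snapshot()
    node_ids = {n["id"] for n in nodes}
    for edge in edges:
        if edge["source"] not in node_ids or edge["target"] not in node_ids:
            violations.append(f"dangling edge: {edge}")
    return violations


def run_all() -> Dict[str, List[str]]:
    # One contract run per registered adapter.
    return {name: check_contract(f) for name, f in ADAPTER_FACTORIES.items()}
```

In a pytest suite the same check would typically be driven by `pytest.mark.parametrize` over the registered factories; the plain loop keeps this sketch dependency-free.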
…keyFl0wer#18) Three new test files for measurement-critical modules that had zero direct tests. Covers hop-bucket grouping, Cat 8 sub-tests, and the shared graph projection helper. Prevents silent regressions in published category readings.
Remove HindsightAdapter, Mem0Adapter, OmegaAdapter, RlmAdapter imports since those adapter files don't exist in upstream repo. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tested locally in a worktree: ruff clean, 338 passed / 15 skipped, the 146-new-test claim verifies exactly. One forward-looking note (not blocking): when new adapters land in subsequent PRs (#32 adds two: random + oracle), the contract testkit needs them parametrized in or the "27 conformance tests" claim quietly stops being true. Worth a one-line note in CONTRIBUTING about that. I'll add it. Merging this first. Everything else in the stack will benefit from rebasing onto post-#30 main.
One coordination note as I work through the stack: five PRs in this wave (#31, #37, #38, #39, #40) touch `sme/categories/ingestion_integrity.py` or `sme/categories/gap_detection.py`. If they're all branched from pre-#30 main, serial merge will hit conflicts from the third PR onward. Would you be up for rebasing the stack onto post-#30 main in one pass once this lands, so the rest land cleanly? Happy to do it from this end if easier; your call. Also flagging: when #32 lands (RandomRetrievalAdapter + OracleRetrievalAdapter), the contract testkit parametrize list needs both adapters added or the "27 conformance tests" invariant silently degrades. Mentioned it on #32 too; flagging here since this is the foundation that enforces the invariant.
Following up on the coordination point I raised on Monday: now that the #31/#33/#34/#35/#36/#37 fixes are all green and ready, the file-overlap question is back on the front burner. Five of the seven open PRs touch overlapping files. I'm not precious about the order; mostly I want to confirm a sequence so we don't end up doing rebase coordination cold-cache next week. Tentative proposal, push back if you'd prefer a different one:
That keeps the rebase work bounded to the actual file overlaps (#37 onto #31, and the #38/#39/#40 stack). Open to a different order if you have a preference; happy to be wrong about which to land first. Separately, #32 (random+oracle) I'm leaving as its own conversation; see the comment I just left there.
Summary
Addresses four open code-quality issues from this repo:
- Adapter contract testkit: parametric tests verifying conformance to the `SMEAdapter` ABC (`query()` returns `QueryResult`, `get_graph_snapshot()` is internally consistent, etc.)
- `multi_hop`, `ontology_coherence`, `_graph_mapping`: 101 new tests covering hop-bucket grouping, A/B/C deltas, Cat 8a–8e sub-tests, and `project_graph()` edge cases
- `get_harness_manifest()` as ABC contract: docstring + spec update + 3 regression tests catching the silent-AttributeError class
- `_load_adapter()` drop-list to allowlist registry: frozen `_AdapterSpec` dataclass with `accepts: frozenset[str]` and `rename: dict[str, str]`; unknown kwargs are silently dropped, preventing the drop-list-drift regression class seen in PR #7 (feat(adapters): RlmAdapter + Qwen-7B/Llama-70B baselines). 15 new tests.

146 new tests total, all passing.
Test plan
- `python -m pytest tests/test_adapter_contract.py tests/test_adapter_harness_manifest_contract.py tests/test_cli_adapter_forwarding.py tests/test_graph_mapping.py tests/test_multi_hop.py tests/test_ontology_coherence.py -v`
- `python -m pytest tests/ -x -q`
- `sme-eval --adapter flat --db /tmp/test` still resolves through the new registry

🫏 Generated with Claude Code