Add edge-case tests for cross-table lookup (#136) #142
Merged
amc-corey-cox merged 1 commit into cross-table-lookup on Mar 9, 2026
Conversation
Cover four gaps in test coverage for the LookupIndex and transform_spec engine introduced in PR #136:

- Duplicate keys: verify LIMIT 1 first-match semantics and document that the returned row is non-deterministic for non-unique keys
- Empty secondary tables: headers-only TSV files register and query cleanly, returning None on lookup
- LookupIndex lifecycle: close() clears tables, operations after close() raise, double-close is safe
- Engine no-joins regression: transform_spec works correctly when class_derivations have no joins block (common case)
- Mixed derivations: joins and non-joins class_derivations coexist

All 11 tests pass on the cross-table-lookup branch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
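The duplicate-key behavior in the commit message can be sketched as follows. This is a minimal illustration, not the project's code: the standard-library `sqlite3` module stands in for DuckDB so the snippet runs without extra dependencies, and the table `demographics`, its columns, and the helper `lookup_one` are hypothetical names.

```python
import sqlite3

# Assumption: sqlite3 stands in for DuckDB; the real LookupIndex may build its query differently.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demographics (participant_id TEXT, age INTEGER)")
con.executemany(
    "INSERT INTO demographics VALUES (?, ?)",
    [("P001", 34), ("P001", 35), ("P002", 50)],  # P001 appears twice
)


def lookup_one(connection, key):
    """Return one row for `key`, or None when nothing matches.

    The query uses LIMIT 1 with no ORDER BY, so which duplicate
    comes back is not guaranteed by SQL.
    """
    return connection.execute(
        "SELECT participant_id, age FROM demographics "
        "WHERE participant_id = ? LIMIT 1",
        (key,),
    ).fetchone()


row = lookup_one(con, "P001")

# Assert only what is guaranteed: a row comes back and its key matches.
assert row is not None
assert row[0] == "P001"
# Deliberately no assertion on which of the two ages is returned.
```

Asserting on the key rather than on the full row keeps such a test stable even if the storage engine or query plan changes the order in which duplicates are scanned.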
Pull request overview
This PR adds targeted edge-case tests to document and lock in existing behavior for the DuckDB-backed LookupIndex and the transform_spec engine introduced with cross-table lookup support.
Changes:
- Add `LookupIndex` edge-case tests covering duplicate keys, empty tables, and lifecycle behavior after `close()`.
- Add `transform_spec` edge-case tests covering no-joins derivations, empty joined tables, and mixed (joins + non-joins) derivations in one spec.
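The lifecycle behavior listed above (state cleared on `close()`, later operations raising, double-close being safe) might look like the sketch below. Everything here is an assumption for illustration: `TinyLookupIndex`, its `register_table` and `lookup` methods, and the choice of `RuntimeError` are hypothetical, and `sqlite3` stands in for DuckDB. The real `LookupIndex` API may differ.

```python
import sqlite3


class TinyLookupIndex:
    """Hypothetical stand-in for LookupIndex, used only to show the lifecycle contract."""

    def __init__(self):
        self._con = sqlite3.connect(":memory:")
        self.tables = set()

    def _require_open(self):
        # Assumption: the PR only says post-close operations "raise"; RuntimeError is a guess.
        if self._con is None:
            raise RuntimeError("index is closed")

    def register_table(self, name, rows):
        """Create a two-column table and load (key, value) rows. `name` is trusted input here."""
        self._require_open()
        self._con.execute(f'CREATE TABLE "{name}" (key TEXT, value TEXT)')
        self._con.executemany(f'INSERT INTO "{name}" VALUES (?, ?)', rows)
        self.tables.add(name)

    def lookup(self, name, key):
        """Return one matching (key, value) row, or None."""
        self._require_open()
        return self._con.execute(
            f'SELECT key, value FROM "{name}" WHERE key = ? LIMIT 1', (key,)
        ).fetchone()

    def close(self):
        """Release the connection and forget registered tables; calling it again is a no-op."""
        if self._con is not None:
            self._con.close()
            self._con = None
        self.tables.clear()


index = TinyLookupIndex()
index.register_table("sites", [("S1", "Boston")])
found = index.lookup("sites", "S1")

index.close()
index.close()  # second close must not raise

try:
    index.lookup("sites", "S1")
    raised = False
except RuntimeError:
    raised = True
```

Guarding every public method with a single `_require_open` check is one simple way to make post-close failures explicit rather than relying on whatever error the underlying connection happens to throw.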
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `tests/test_utils/test_lookup_index_edge_cases.py` | Adds 7 tests for duplicate-key semantics, empty secondary tables, and post-`close()` behavior of `LookupIndex`. |
| `tests/test_transformer/test_engine_edge_cases.py` | Adds 4 tests to prevent regressions in `transform_spec` for no-joins, empty-join inputs, and mixed derivation specs. |
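For the empty-secondary-table case, a rough sketch of the expected behavior is shown below. It is not the project's implementation: the helper `load_tsv_index`, the file name `empty.tsv`, and the column names are hypothetical, and a plain dictionary built with the standard `csv` module replaces the DuckDB-backed index.

```python
import csv
import os
import tempfile


def load_tsv_index(path, key_column):
    """Read a TSV file into a dict keyed by `key_column` (hypothetical helper).

    A headers-only file yields an empty dict rather than an error.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row[key_column]: row for row in reader}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "empty.tsv")
    # Write a header row and nothing else.
    with open(path, "w", newline="") as handle:
        handle.write("participant_id\tsite\n")
    index = load_tsv_index(path, "participant_id")

# Looking up any key in the empty index gives None instead of raising.
result = index.get("P001")
```

The point mirrored from the tests is that an empty table is a valid input: registration succeeds and a lookup simply finds nothing.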
amc-corey-cox approved these changes on Mar 9, 2026
Yep, this looks good to me.
Summary
Adds 11 tests covering edge cases in the `LookupIndex` and `transform_spec` engine introduced in #136. All tests pass on the `cross-table-lookup` branch; these document and lock in existing behavior.

Why these tests
- Duplicate keys: `LIMIT 1` returns a row when multiple rows share a key … `ORDER BY`. Tests document this so any future change to query ordering is caught.
- `LookupIndex` lifecycle: `close()` clears state, post-close ops raise, double-close is safe.
- No-joins regression: `transform_spec` works when `class_derivation` has no `joins:` block … `transform_spec`.
- Mixed derivations: joins and non-joins `class_derivation` blocks coexist in one spec.

Test files
- `tests/test_utils/test_lookup_index_edge_cases.py`: 7 tests for `LookupIndex`
- `tests/test_transformer/test_engine_edge_cases.py`: 4 tests for `transform_spec`

Note on duplicate-key semantics
The `test_duplicate_keys_returns_a_row` test intentionally does NOT assert which duplicate row is returned; it only verifies that a row comes back and that the key matches. The docstring documents that DuckDB's storage order makes this deterministic in practice, but the API doesn't guarantee it. If you'd prefer to make this a documented constraint (e.g., "first row wins") or add a warning/error for non-unique keys, I'm happy to adjust the tests accordingly.

Test plan
… `cross-table-lookup`

🤖 Generated with Claude Code