Bug-fix pass: ~30 verified fixes from public-API stress test (prep for 0.1.3)#4
Open
jm-rivera wants to merge 6 commits into
…PI stress test (0.1.3)

Fixes ~30 verified bugs across:
- resolution behavior (dotted abbreviations, mixed-case routing, iso_numeric padding, snap label candidates, punctuation no-match)
- crashes (polars bulk, record output, pickling)
- eager parameter validation with did-you-mean across all enum-like kwargs
- candidate-aware ambiguity hints
- configure() sentinel semantics
- query-cache mutation isolation
- BYOD normalization alignment
- stale docs

Suite: 3717 passed. Geo eval gate: 0.8856 >= 0.8200.
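As an illustration of the eager validation with did-you-mean mentioned in this commit, here is a minimal sketch using the standard library's `difflib`. The helper name `validate_choice` and the `ON_ERROR_CHOICES` values are hypothetical, chosen for the example; the actual validator and its accepted values may differ.

```python
from difflib import get_close_matches

# Hypothetical set of allowed values, used only for this illustration.
ON_ERROR_CHOICES = ("raise", "null", "ignore")


def validate_choice(name, value, choices):
    """Return `value` if it is allowed; otherwise raise ValueError with a hint."""
    if value in choices:
        return value
    # Suggest the single closest allowed value, if any is similar enough.
    close = get_close_matches(value, choices, n=1, cutoff=0.6)
    hint = f" Did you mean {close[0]!r}?" if close else ""
    raise ValueError(
        f"{name}={value!r} is not valid; expected one of {list(choices)}.{hint}"
    )
```

Calling the validator at the top of a public function, before any work starts, is what makes the check eager: a typo such as `on_error="rasie"` fails immediately with a suggestion instead of surfacing deep inside a bulk operation.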
…arisons

Clean CI has no admin tiers, so 'D.C.' cannot reach geoId/11001 there; the org-suppression half of the assertion stays ungated. Cache-dir tests now compare Path objects, not POSIX string literals. Refresh committed benchmark results for the new matching behavior.
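The cache-dir change is easy to motivate with a small sketch. It uses `pathlib`'s pure path classes so the result is the same on any platform; `compiled_dir` is a hypothetical stand-in for whatever function the tests actually exercise.

```python
from pathlib import PurePosixPath, PureWindowsPath


def compiled_dir(base):
    """Hypothetical helper: the directory under test, built with the / operator."""
    return base / "compiled"


win = compiled_dir(PureWindowsPath("cache"))
posix = compiled_dir(PurePosixPath("cache"))

# Comparing against a POSIX string literal only holds on POSIX-style paths.
string_check_win = str(win) == "cache/compiled"
string_check_posix = str(posix) == "cache/compiled"

# Comparing Path objects holds regardless of the separator in the literal.
path_check_win = win == PureWindowsPath("cache/compiled")
path_check_posix = posix == PurePosixPath("cache/compiled")
```

Here the string comparison fails for the Windows path because `str()` renders it with backslashes, while both Path-object comparisons succeed.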
…s, accessor on_error, alpha-3 hints, error exports

- Scalar resolve()/resolve_id() coerce int/float through the shared bulk helper and raise the documented TypeError otherwise.
- ResolutionResult sequence fields are tuples (frozen contract now real).
- pandas/polars accessors expose on_error (default 'raise') and propagate validation errors un-garbled.
- ResolutionContext.country accepts alpha-3 via a static ISO table (no runtime pycountry).
- Eight commonly raised errors promoted to top-level __all__.
- configure() sentinel semantics unified across all parameters (omitted = unchanged, None = reset).

Two Opus review passes: pycountry-at-query-time blocker fixed, auto_download None inconsistency fixed, exception-path split documented. Suite: 3787 passed.
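The sentinel rule for configure() (omitted = unchanged, None = reset) can be sketched as follows. This is an illustrative stand-alone function, not the library's implementation: the `DEFAULTS` values and the module-level `_settings` dict are assumptions, and only the two parameter names are taken from the commit messages.

```python
# A private sentinel distinguishes "argument omitted" from "argument is None".
_UNSET = object()

# Hypothetical defaults, used only for this illustration.
DEFAULTS = {"cache_dir": "~/.cache/example", "auto_download": True}
_settings = dict(DEFAULTS)


def configure(*, cache_dir=_UNSET, auto_download=_UNSET):
    """Update settings: omitted leaves a value unchanged, None resets it."""
    supplied = {"cache_dir": cache_dir, "auto_download": auto_download}
    for key, value in supplied.items():
        if value is _UNSET:
            continue  # omitted: keep the current setting
        # None resets to the default; anything else overrides.
        _settings[key] = DEFAULTS[key] if value is None else value
    return dict(_settings)  # return a copy so callers cannot mutate state
```

Using one loop for every parameter is what keeps the semantics uniform: a parameter cannot accidentally treat None as "unchanged" while its neighbours treat it as "reset".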
…iled index cache

The first query past the exact-match tiers paid the lazy SymSpell build inline (~6s for the geo large tier on remote-data installs). Now:
- Resolver construction starts a daemon thread that pre-builds lazy indexes (warm=True default on all constructors; warm=False opts out)
- rk.warm() / Resolver.warm(): synchronous, idempotent readiness API
- The built large-tier index is cached as a locally-generated pickle under <cache-dir>/compiled/, keyed by dict files + symspellpy version; next process loads in ~1.4s instead of rebuilding for ~6s
- After-fork hook resets build locks so a fork during warm-up cannot deadlock the child; unique temp names + age-gated reaping make concurrent same-process cache writes safe

Opus-reviewed (two findings, both fixed and re-confirmed).
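The compiled-cache idea above (a pickle keyed by the dictionary files plus a library version, written through a uniquely named temp file) can be sketched with the standard library alone. All names here (`cache_key`, `load_or_build`, the `index-<key>.pkl` file name) are hypothetical, and the sketch leaves out the age-gated reaping and the after-fork lock reset described in the commit.

```python
import hashlib
import os
import pickle
import tempfile
from pathlib import Path


def cache_key(dict_files, lib_version):
    """Hash the version string and the contents of every dictionary file."""
    digest = hashlib.sha256(lib_version.encode())
    for path in sorted(Path(p) for p in dict_files):
        digest.update(path.name.encode())
        digest.update(path.read_bytes())
    return digest.hexdigest()[:16]


def load_or_build(cache_dir, dict_files, lib_version, build):
    """Load a cached index if its key matches; otherwise build and cache it."""
    compiled = Path(cache_dir) / "compiled"
    compiled.mkdir(parents=True, exist_ok=True)
    target = compiled / f"index-{cache_key(dict_files, lib_version)}.pkl"
    if target.exists():
        # Only unpickle files this code wrote itself; never untrusted input.
        with target.open("rb") as fh:
            return pickle.load(fh)
    index = build()
    # A unique temp name means concurrent writers never share a partial file.
    fd, tmp_name = tempfile.mkstemp(dir=compiled, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as fh:
            pickle.dump(index, fh, protocol=pickle.HIGHEST_PROTOCOL)
        os.replace(tmp_name, target)  # publish the finished file in one step
    except BaseException:
        Path(tmp_name).unlink(missing_ok=True)
        raise
    return index
```

Because the key covers both the input files and the version string, changing either yields a new file name and forces a rebuild, so a stale pickle is never loaded against newer data.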
Fixes small bugs