Merged
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main      #62   +/-   ##
=======================================
  Coverage   97.22%   97.22%
=======================================
  Files          33       33
  Lines        1117     1117
  Branches      319      319
=======================================
  Hits         1086     1086
  Misses         31       31
This PR was opened by the Changesets release GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.
Releases
eyecite-ts@0.5.0
Minor Changes
#70
9acdc81 Thanks @medelman17! - Add Illinois ILCS chapter-act citation extraction and remove legacy state-code pattern.
- Added: `chapter-act` pattern and `extractChapterAct` extractor for the "735 ILCS 5/2-1001" format
- Removed: `state-code` pattern, fully superseded by the named-code and abbreviated-code families
#69
4efd7fc Thanks @medelman17! - Add state statute citation extraction for 19 jurisdictions across three pattern families:
- Federal (PR #67): Enhanced USC/CFR patterns with subsection capture, et seq., and §§ ranges. Added prose-form "section X of title Y". Refactored `extractStatute` into a dispatcher architecture.
- Abbreviated-code (PR #68): Added `knownCodes` registry and extraction for 12 states using compact abbreviations: FL, OH, MI, UT, CO, WA, NC, GA, PA, IN, NJ, DE.
- Named-code (PR #69): Added extraction for 7 states using jurisdiction prefix + code name: NY (21 laws), CA (29 codes), TX (29 codes), MD (36 articles), VA, AL, MA (chapter-based).
New `StatuteCitation` fields: `subsection`, `jurisdiction`, `pincite`, `hasEtSeq`. Shared `parseBody` helper for section/subsection/et seq. splitting. ~970 tests (up from 528).
Patch Changes
#64
a128e50 Thanks @medelman17! - Fix two lookahead pattern bugs that prevented year extraction: #52 ("Pincite footnote reference (n.3) prevents year extraction") and #53 ("Multiple pincite ranges (152-53, 163-64) prevent year extraction").
These fixes improve year extraction accuracy for citations with complex pincite formats, promoting 2 test cases from known limitations to passing tests.
#65
62578d7 Thanks @medelman17! - Fix multi-word state reporters bug #45 ("Support multi-word state reporters (Ohio St. 3d, Md. App.)") that prevented matching reporters like "Ohio St. 3d" and "Md. App.":
- Negative lookahead `(?! L\.[JQR\s])` to prevent misclassifying journal citations like "Yale L.J." as case citations
This fix improves tokenization accuracy for state reporters with multi-word names and ensures journal citations remain correctly classified.
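To illustrate how a guard like this behaves, here is a minimal sketch. Only the lookahead `(?! L\.[JQR\s])` comes from the changelog; the surrounding volume/reporter/page pattern and the `matchReporter` helper are hypothetical simplifications, not the library's actual tokenizer.

```typescript
interface ReporterMatch {
  volume: number;
  reporter: string;
  page: number;
}

// Hypothetical, simplified pattern: volume, multi-word reporter, page.
// The negative lookahead sits right after the first reporter word, so
// "Yale L.J." or "Harv. L. Rev." is rejected before further words are consumed.
const reporterPattern =
  /\b(\d+)\s+([A-Z][a-z]+\.?(?! L\.[JQR\s])(?:\s(?:[A-Z][a-z]*\.|\d[a-z]{1,2}))*)\s+(\d+)\b/;

function matchReporter(text: string): ReporterMatch | null {
  const m = reporterPattern.exec(text);
  if (m === null) return null;
  return {
    volume: Number(m[1]),
    reporter: m[2] ?? "",
    page: Number(m[3]),
  };
}

console.log(matchReporter("75 Ohio St. 3d 123"));
console.log(matchReporter("100 Md. App. 1"));
console.log(matchReporter("100 Yale L.J. 1"));
```

In this sketch the first two inputs yield a match with the full multi-word reporter, while the journal citation yields `null`, leaving it available for a separate journal pattern.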
#66
4c26c7b Thanks @medelman17! - Fix neutral citation type classification bugs #50 and #51:
- Bug #50: State vendor-neutral citations like "2007 UT 49", "2017 WI 17", and "2013 IL 112116" are now correctly classified as "neutral" type instead of "case".
- Bug #51: U.S. App./Dist. LEXIS citations like "2021 U.S. App. LEXIS 12345" and "2021 U.S. Dist. LEXIS 67890" are now matched as neutral citations.
Promotes 7 test cases from known limitations to passing tests. Improves accuracy for state and federal neutral citation extraction.
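The classification rule described above can be sketched roughly as follows. The `classifyCitation` function, both regular expressions, and the three-entry state list are assumptions made for illustration (the list covers only the states named in the examples); the library's real patterns may differ.

```typescript
type CitationType = "neutral" | "case";

// Hypothetical list: only the postal codes that appear in the changelog examples.
const STATE_CODES = ["UT", "WI", "IL"];

// Year, state code, sequence number, e.g. "2007 UT 49".
const stateNeutral = new RegExp(`^\\d{4} (?:${STATE_CODES.join("|")}) \\d+$`);

// Year, "U.S. App. LEXIS" or "U.S. Dist. LEXIS", document number.
const lexisNeutral = /^\d{4} U\.S\. (?:App|Dist)\. LEXIS \d+$/;

function classifyCitation(text: string): CitationType {
  const t = text.trim();
  return stateNeutral.test(t) || lexisNeutral.test(t) ? "neutral" : "case";
}

console.log(classifyCitation("2007 UT 49"));
console.log(classifyCitation("2021 U.S. Dist. LEXIS 67890"));
console.log(classifyCitation("410 U.S. 113"));
```

Checking the neutral forms before falling back to "case" mirrors the fix as described: the year-first shape is what distinguishes these citations from volume-reporter-page case citations.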
#63
8485b1d Thanks @medelman17! - Fix four tokenization pattern bugs discovered during corpus testing.
These fixes promote 4 test cases from known limitations to passing tests, improving extraction coverage for federal citations with variant formats.
#61
61152d1 Thanks @medelman17! - Fix three quick-win bugs discovered during corpus testing:
- `§§` (double section symbol) crashing `extractStatute`, fixed by updating the regex to accept one or more section symbols
- HTML entity decoding (entity forms of §, &, etc.) added to the cleaning pipeline
These fixes promote 3 test cases from known limitations to passing tests, improving extraction accuracy for real-world legal documents with HTML entities and Unicode punctuation.
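A rough sketch of the first two fixes, under stated assumptions: `decodeEntities`, `sectionPattern`, and `firstSection` are hypothetical names, the entity table holds just two standard HTML named entities (`&sect;` and `&amp;`), and the section regex is far simpler than whatever `extractStatute` actually uses.

```typescript
// Hypothetical entity table; a real cleaning pipeline would cover many more.
const ENTITIES: Record<string, string> = {
  "&sect;": "§",
  "&amp;": "&",
};

// Decode the known entities in a single pass, leaving anything else untouched.
function decodeEntities(text: string): string {
  return text.replace(/&(?:sect|amp);/g, (entity) => ENTITIES[entity] ?? entity);
}

// "§+" accepts one or more section symbols, so "§" and "§§" both match.
const sectionPattern = /§+\s*(\d+[\w.-]*)/;

function firstSection(text: string): string | null {
  const m = sectionPattern.exec(decodeEntities(text));
  return m === null ? null : m[1] ?? null;
}

console.log(decodeEntities("42 U.S.C. &sect;&sect; 1983 &amp; 1985"));
console.log(firstSection("42 U.S.C. &sect;&sect; 1983 &amp; 1985"));
console.log(firstSection("42 U.S.C. § 1983"));
```

Decoding before matching means the section pattern only ever has to deal with the literal § character, which keeps the extraction regex independent of how the source document was encoded.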