Release v0.6 #150
Merged
17 commits
aefc8b8 Preliminary combined date parser (rlskoeser)
acda05f Merge branch 'hotfix/0.5.2' into develop (rlskoeser)
ffd70a1 Set develop version back to 0.6.dev0 (rlskoeser)
09d90f3 chore: update workflow actions (rettinghaus)
d155d25 Merge pull request #147 from rettinghaus/actions (rlskoeser)
b5cb6e3 docs: update CONTRIBUTORS.md [skip ci] (allcontributors[bot])
4c57267 docs: update .all-contributorsrc [skip ci] (allcontributors[bot])
ceec78b Merge pull request #148 from dh-tech/all-contributors/add-rettinghaus (rlskoeser)
2fc55f7 Move parser grammar files to common location; simplify combined parser (rlskoeser)
25137cb Add, document, & test omnibus converter (rlskoeser)
7a99c5c Add test case for unsupported serialization (rlskoeser)
e655936 Merge branch 'develop' into feature/combined-parser (rlskoeser)
0691b12 Add tests for error cases (rlskoeser)
ce5baaa Add brief overview docstring for converter module (rlskoeser)
96cb8d6 Merge pull request #112 from dh-tech/feature/combined-parser (rlskoeser)
9e9da0c Set version to 0.6 and document changes (rlskoeser)
1f0f7c0 Fix typo flagged by @coderabbitai (rlskoeser)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1 +1,29 @@ | ||
| from undate.converters.base import BaseDateConverter as BaseDateConverter | ||
| """ | ||
| Converter classes add support for parsing and serializing dates | ||
| in a variety of formats. A subset of these are calendar converters | ||
| (:mod:`undate.converters.calendars`), which means they support both parsing | ||
| and conversion from an alternate calendar to a common Gregorian calendar | ||
| for comparison across dates. | ||
|
|
||
| To parse a date with a supported converter, use the ``Undate`` class method | ||
| :meth:`~undate.undate.Undate.parse` and specify the date as a string | ||
| with the desired format or calendar, e.g. | ||
|
|
||
| .. code-block:: | ||
|
|
||
| Undate.parse("2001-05", "EDTF") | ||
| Undate.parse("7 Heshvan 5425", "Hebrew") | ||
|
|
||
| For converters that support it, you can also serialize a date in a specified | ||
| format with the ``Undate`` method :meth:`~undate.undate.Undate.format`: | ||
|
|
||
| .. code-block:: | ||
|
|
||
| Undate.parse("Rabīʿ ath-Thānī 343", "Islamic").format("EDTF") | ||
|
|
||
|
|
||
| """ | ||
|
|
||
| from undate.converters.base import BaseDateConverter, GRAMMAR_FILE_PATH | ||
|
|
||
| __all__ = ["BaseDateConverter", "GRAMMAR_FILE_PATH"] |
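
The module docstring above describes selecting a converter by name when parsing or formatting. As a rough illustration of that idea only, the sketch below looks up a converter class by a case-insensitive name and delegates parsing to it. Everything here (`SketchConverter`, `YearMonthConverter`, `parse_with`, and the dict of date parts) is hypothetical and is not undate's actual implementation.

```python
from abc import ABC, abstractmethod


class SketchConverter(ABC):
    """Hypothetical stand-in for a converter base class (not undate's code)."""

    name: str = "base"

    @abstractmethod
    def parse(self, value: str) -> dict:
        """Parse a string into a simple dict of date parts."""

    @abstractmethod
    def to_string(self, parts: dict) -> str:
        """Serialize a dict of date parts back to a string."""

    @classmethod
    def available(cls) -> dict:
        # Map lowercase converter names to direct subclasses
        # so that lookup by name is case-insensitive.
        return {sub.name.lower(): sub for sub in cls.__subclasses__()}


class YearMonthConverter(SketchConverter):
    """Toy converter for strings such as "2001-05"."""

    name = "YearMonth"

    def parse(self, value: str) -> dict:
        year, month = value.split("-")
        return {"year": int(year), "month": int(month)}

    def to_string(self, parts: dict) -> str:
        return f"{parts['year']:04d}-{parts['month']:02d}"


def parse_with(value: str, converter_name: str) -> dict:
    """Look up a converter by name and use it to parse the value."""
    converters = SketchConverter.available()
    try:
        converter_cls = converters[converter_name.lower()]
    except KeyError:
        raise ValueError(f"Unsupported format: {converter_name}") from None
    return converter_cls().parse(value)


parts = parse_with("2001-05", "yearmonth")
print(parts)
print(YearMonthConverter().to_string(parts))
```

Keeping the name on the class means a new converter only has to subclass the base and set `name` to become available by lookup.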
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| """ | ||
| **Experimental** combined parser. Supports EDTF, Hebrew, and Hijri | ||
| where dates are unambiguous. (Year-only dates are parsed as EDTF in | ||
| Gregorian calendar.) | ||
| """ | ||
|
|
||
| from typing import Union | ||
|
|
||
| from lark import Lark | ||
| from lark.exceptions import UnexpectedCharacters | ||
| from lark.visitors import Transformer, merge_transformers | ||
|
|
||
| from undate import Undate, UndateInterval | ||
| from undate.converters import BaseDateConverter, GRAMMAR_FILE_PATH | ||
| from undate.converters.edtf.transformer import EDTFTransformer | ||
| from undate.converters.calendars.hebrew.transformer import HebrewDateTransformer | ||
| from undate.converters.calendars.islamic.transformer import IslamicDateTransformer | ||
|
|
||
|
|
||
| class CombinedDateTransformer(Transformer): | ||
| def start(self, children): | ||
| # trigger the transformer for the appropriate part of the grammar | ||
| return children | ||
|
|
||
|
|
||
| # NOTE: currently year-only dates in combined parser are interpreted as | ||
| # EDTF and use Gregorian calendar. | ||
| # In future, we could refine by adding calendar names & abbreviations | ||
| # to the parser in order to recognize years from other calendars. | ||
|
|
||
| combined_transformer = merge_transformers( | ||
| CombinedDateTransformer(), | ||
| edtf=EDTFTransformer(), | ||
| hebrew=HebrewDateTransformer(), | ||
| islamic=IslamicDateTransformer(), | ||
| ) | ||
|
|
||
|
|
||
| # open based on filename so we can specify relative import path based on grammar file | ||
| parser = Lark.open( | ||
| str(GRAMMAR_FILE_PATH / "combined.lark"), rel_to=__file__, strict=True | ||
| ) | ||
|
|
||
|
|
||
| class OmnibusDateConverter(BaseDateConverter): | ||
| """ | ||
| Combination parser that aggregates existing parser grammars. | ||
| Currently supports EDTF, Hebrew, and Hijri where dates are unambiguous. | ||
| (Year-only dates are parsed as EDTF in Gregorian calendar.) | ||
|
|
||
| Does not support serialization. | ||
|
|
||
| Example usage:: | ||
|
|
||
| Undate.parse("Tammuz 4816", "omnibus") | ||
|
|
||
| """ | ||
|
|
||
| #: converter name: omnibus | ||
| name: str = "omnibus" | ||
|
|
||
| def __init__(self): | ||
| self.transformer = combined_transformer | ||
|
|
||
| def parse(self, value: str) -> Union[Undate, UndateInterval]: | ||
| """ | ||
| Parse a string in a supported format and return an :class:`~undate.undate.Undate` | ||
| or :class:`~undate.undate.UndateInterval`. | ||
| """ | ||
| if not value: | ||
| raise ValueError("Parsing empty/unset string is not supported") | ||
|
|
||
| # parse the input string, then transform to undate object | ||
| try: | ||
| parsetree = parser.parse(value) | ||
| # transform returns a list; we want the first item in the list | ||
| return self.transformer.transform(parsetree)[0] | ||
| except UnexpectedCharacters: | ||
| raise ValueError( | ||
| "Parsing failed: '%s' is not in a recognized date format" % value | ||
| ) | ||
|
|
||
| def to_string(self, undate: Union[Undate, UndateInterval]) -> str: | ||
| "Not supported by this converter. Will raise :class:`ValueError`" | ||
| raise ValueError("Omnibus converter does not support serialization") |
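
The converter above accepts only inputs whose calendar can be inferred, treats a bare year as EDTF/Gregorian, and turns parser failures into `ValueError`. The sketch below mimics those three rules with regular expressions instead of Lark grammars. It is a simplified illustration under stated assumptions: the month-name sets are small, hypothetical subsets with ASCII spellings, and `detect_calendar` is not part of undate.

```python
import re

# Hypothetical, deliberately incomplete month lists (ASCII spellings only);
# the real converter relies on the Hebrew and Islamic Lark grammars instead.
HEBREW_MONTHS = {"tishri", "heshvan", "kislev", "tevet", "nisan", "tammuz", "elul"}
ISLAMIC_MONTHS = {"muharram", "safar", "rajab", "ramadan", "shawwal"}

YEAR_ONLY = re.compile(r"^\d{4}$")
ISO_LIKE = re.compile(r"^\d{4}-\d{2}(-\d{2})?$")
# Optional day, then a month name, then a year, e.g. "7 Heshvan 5425".
NAMED_MONTH = re.compile(r"^(?:(\d{1,2})\s+)?([A-Za-z]+)\s+(\d{1,4})$")


def detect_calendar(value: str) -> str:
    """Return a format/calendar label for the input, or raise ValueError."""
    if not value:
        raise ValueError("Parsing empty/unset string is not supported")
    text = value.strip()
    # A year on its own carries no calendar information,
    # so it falls through to the EDTF (Gregorian) interpretation.
    if YEAR_ONLY.match(text) or ISO_LIKE.match(text):
        return "EDTF"
    match = NAMED_MONTH.match(text)
    if match:
        month = match.group(2).lower()
        if month in HEBREW_MONTHS:
            return "Hebrew"
        if month in ISLAMIC_MONTHS:
            return "Islamic"
    # Anything unrecognized is reported the same way, regardless of cause.
    raise ValueError(f"Parsing failed: '{value}' is not in a recognized date format")


for example in ["Tammuz 4816", "1950", "10 Ramadan 343"]:
    print(example, "->", detect_calendar(example))
```

The month name is what disambiguates the calendar here, which is why a year-only input cannot be assigned to Hebrew or Islamic without an explicit calendar label.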
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,8 +1,8 @@ | ||
| import pathlib | ||
|
|
||
| from lark import Lark | ||
|
|
||
| grammar_path = pathlib.Path(__file__).parent / "edtf.lark" | ||
| from undate.converters import GRAMMAR_FILE_PATH | ||
|
|
||
| grammar_path = GRAMMAR_FILE_PATH / "edtf.lark" | ||
|
|
||
| with open(grammar_path) as grammar: | ||
| edtf_parser = Lark(grammar.read(), start="edtf") |
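
The change above replaces a path computed next to the parser module with a shared `GRAMMAR_FILE_PATH`, so that all grammar files live in one directory. A minimal sketch of that pattern follows, using a temporary directory as a stand-in for the shared location. The `load_grammar` helper and the one-line grammar text are hypothetical; the actual value of `GRAMMAR_FILE_PATH` is defined in `undate.converters.base` and is not shown in this diff.

```python
import pathlib
import tempfile


def load_grammar(grammar_dir: pathlib.Path, name: str) -> str:
    """Read a grammar file such as "edtf.lark" from a shared directory."""
    grammar_path = grammar_dir / name
    with open(grammar_path, encoding="utf-8") as grammar:
        return grammar.read()


# Demonstration: a temporary directory stands in for the shared grammar location.
with tempfile.TemporaryDirectory() as tmp:
    grammar_dir = pathlib.Path(tmp)
    (grammar_dir / "edtf.lark").write_text("year: /\\d{4}/\n", encoding="utf-8")
    text = load_grammar(grammar_dir, "edtf.lark")
    print(text)
```

With every grammar resolved from the same directory, a combined grammar can refer to its siblings with relative imports.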
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| %import common.WS | ||
| %ignore WS | ||
|
|
||
| start: (edtf__start | hebrew__hebrew_date | islamic__islamic_date ) | ||
|
|
||
| // Renaming of the import variables is required, as they receive the namespace of this file. | ||
| // See: https://github.com/lark-parser/lark/pull/973#issuecomment-907287565 | ||
|
|
||
| // All grammars are in the same directory, so we can use relative imports | ||
|
|
||
| // relative import from edtf.lark | ||
| %import .edtf.edtf -> edtf__start | ||
|
|
||
| // relative import from hebrew.lark | ||
| %import .hebrew.hebrew_date -> hebrew__hebrew_date | ||
| %import .hebrew.day -> hebrew__day | ||
| %import .hebrew.month -> hebrew__month | ||
| %import .hebrew.year -> hebrew__year | ||
|
|
||
| // relative import from islamic.lark | ||
| %import .islamic.islamic_date -> islamic__islamic_date | ||
| %import .islamic.day -> islamic__day | ||
| %import .islamic.month -> islamic__month | ||
| %import .islamic.year -> islamic__year | ||
|
|
||
|
|
||
| // override hebrew date to omit year-only, since year without calendar is ambiguous | ||
| // NOTE: potentially support year with calendar label | ||
| %override hebrew__hebrew_date: hebrew__day hebrew__month hebrew__year | hebrew__month hebrew__year | ||
|
|
||
| // same for islamic date, year alone is ambiguous | ||
| %override islamic__islamic_date: islamic__day islamic__month islamic__year | islamic__month islamic__year |
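
The `prefix__rule` aliases used in this grammar match the keyword names passed to `merge_transformers` in the converter module, which exposes each transformer's methods under the same kind of prefixed name. The plain-Python sketch below illustrates only that naming convention, without depending on Lark. `NamespacedDispatcher` and the two handler classes are hypothetical and do not reproduce Lark's implementation.

```python
class NamespacedDispatcher:
    """Route prefixed rule names such as "hebrew__year" to per-grammar handlers."""

    def __init__(self, **handlers):
        # Build a flat table of "prefix__method" -> bound method,
        # so two grammars can both define a "year" rule without colliding.
        self.table = {}
        for prefix, handler in handlers.items():
            for attr in dir(handler):
                if attr.startswith("_"):
                    continue
                method = getattr(handler, attr)
                if callable(method):
                    self.table[f"{prefix}__{attr}"] = method

    def call(self, rule_name, children):
        """Invoke the handler registered for a prefixed rule name."""
        return self.table[rule_name](children)


class HebrewHandler:
    def year(self, children):
        return ("hebrew year", int(children[0]))


class IslamicHandler:
    def year(self, children):
        return ("islamic year", int(children[0]))


dispatcher = NamespacedDispatcher(hebrew=HebrewHandler(), islamic=IslamicHandler())
print(dispatcher.call("hebrew__year", ["5425"]))
print(dispatcher.call("islamic__year", ["343"]))
```

Because both grammars define rules named `day`, `month`, and `year`, the prefix is what keeps the Hebrew and Islamic handlers apart once the grammars are combined.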
File renamed without changes.