-
Notifications
You must be signed in to change notification settings - Fork 4
Fix/update docs #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Fix/update docs #17
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,43 +1,76 @@ | ||
| ## What does this PR do? | ||
|
|
||
| <!-- One-sentence summarizing of the change. --> | ||
| <!-- One sentence: what changed and why. --> | ||
|
|
||
| ## Type of change | ||
|
|
||
| - [ ] New language (`languages/{lang}/`) | ||
| - [ ] New step (`steps/text/` or `steps/word/`) | ||
| - [ ] New preset version (`presets/`) | ||
| - [ ] Bug fix | ||
| - [ ] Refactor / internal cleanup | ||
| - [ ] Docs / CI | ||
| - [ ] New language | ||
| - [ ] Edit existing language (fix a replacement, tweak config, …) | ||
| - [ ] New normalization step | ||
| - [ ] Edit existing step (bug fix, behaviour change) | ||
| - [ ] New preset version | ||
| - [ ] Bug fix (other) | ||
| - [ ] Refactor / docs / CI | ||
|
|
||
| --- | ||
|
|
||
| ## Checklist | ||
|
|
||
| **Only fill in the section(s) that match your change — delete the rest.** | ||
|
|
||
| --- | ||
|
|
||
| ### New language | ||
|
|
||
| - [ ] Created `languages/{lang}/` with `operators.py`, `replacements.py`, `__init__.py` | ||
| - [ ] All word-level substitutions are in `replacements.py`, not inline in `operators.py` | ||
| - [ ] Decorated operators class with `@register_language` | ||
| - [ ] Added one import line to `languages/__init__.py` | ||
| - [ ] Added unit tests in `tests/unit/languages/` | ||
| - [ ] Added a per-language CSV in `tests/e2e/files/{preset}/` (e.g. `tests/e2e/files/gladia-3/fr.csv`) | ||
| - [ ] Created `normalization/languages/{lang}/` with `operators.py`, `replacements.py`, `__init__.py` | ||
| - [ ] Word substitutions are in `replacements.py` (not hardcoded in `operators.py`) | ||
| - [ ] `LanguageConfig` is filled in with the language's data (separators, currency words, digit words, …) | ||
| - [ ] Subclassed `LanguageOperators` — only override methods where the **logic** changes, not just the data | ||
| - [ ] Class is decorated with `@register_language` and imported in `normalization/languages/__init__.py` | ||
| - [ ] Unit tests added in `tests/unit/languages/` | ||
| - [ ] E2e CSV added in `tests/e2e/files/{preset}/{lang}.csv` (e.g. `tests/e2e/files/gladia-3/fr.csv`) | ||
|
|
||
| --- | ||
|
|
||
| ### Edit existing language | ||
|
|
||
| - [ ] New/changed word substitutions go in `replacements.py`, not inline in `operators.py` | ||
| - [ ] If you changed a config field that can be `None`: the step reading it still handles `None` gracefully | ||
| - [ ] Unit tests updated or added | ||
| - [ ] E2e CSV updated if the expected output changed | ||
|
|
||
| --- | ||
|
|
||
| ### New step | ||
|
|
||
| - [ ] `name` class attribute is unique and matches the YAML key | ||
| - [ ] Decorated with `@register_step` | ||
| - [ ] Added one import line to `steps/text/__init__.py` or `steps/word/__init__.py` | ||
| - [ ] Algorithm reads data from `operators.config.*`, no hardcoded language-specific values | ||
| - [ ] Optional config fields are guarded with `if operators.config.field is None: return text` | ||
| - [ ] Placeholder protect/restore pairs are both in `steps/text/placeholders.py` and `pipeline/base.py`'s `validate()` is updated | ||
| - [ ] Added unit tests in `tests/unit/steps/` | ||
| - [ ] Added step name to relevant preset YAMLs (new preset file if existing presets are affected) | ||
| - [ ] If the class docstring was added or changed, ran `uv run scripts/generate_step_docs.py` to regenerate `docs/steps.md` | ||
| - [ ] Unique `name` class attribute set (this is the key used in YAML presets) | ||
| - [ ] Decorated with `@register_step` and imported in `steps/text/__init__.py` or `steps/word/__init__.py` | ||
| - [ ] No hardcoded language values — read data from `operators.config.*` instead | ||
| - [ ] If placeholder-based: protect + restore are both in `steps/text/placeholders.py` and `pipeline/base.py`'s `validate()` is updated | ||
| - [ ] Unit tests added in `tests/unit/steps/` | ||
| - [ ] Step name added to the relevant preset YAML — or a **new preset file** created if existing presets are affected | ||
| - [ ] If the docstring changed: ran `uv run scripts/generate_step_docs.py` | ||
|
|
||
| --- | ||
|
|
||
| ### Edit existing step | ||
|
|
||
| - [ ] Step `name` is unchanged — if the output changes, create a new step name + new preset instead | ||
| - [ ] No language-specific logic or string literals added inside the step | ||
| - [ ] Unit tests updated or added | ||
| - [ ] If the docstring changed: ran `uv run scripts/generate_step_docs.py` | ||
|
|
||
| --- | ||
|
|
||
| ### Preset change | ||
|
|
||
| - [ ] Existing preset files are not modified — new behavior uses a new preset version file | ||
| - [ ] Existing preset files are **not modified** — new behaviour goes in a new preset file | ||
| - [ ] `pipeline.validate()` passes (runs automatically via `loader.py`) | ||
|
|
||
| --- | ||
|
|
||
| ## Tests | ||
| ## How was this tested? | ||
|
|
||
| <!-- Describe what was tested and how. --> | ||
| ``` | ||
| uv run pytest tests/ | ||
| ``` | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a language to the fenced code block (markdownlint MD040).
Line 74 should specify a fence language to satisfy lint and improve readability.
Proposed fix
Verify each finding against the current code and only fix it if needed.
In @.github/pull_request_template.md around lines 74 - 76, Add a fence language
to the Markdown code block containing "uv run pytest tests/" by changing the
opening triple backticks to include a language identifier (e.g., bash) so the
fenced block becomes "```bash" to satisfy markdownlint MD040 and improve
readability; update the fenced block near the existing "uv run pytest tests/"
snippet accordingly.