feat: add validate-text-file subcommand and fix --strict for unresolvable CURIEs #13
Open
github-actions[bot] wants to merge 1 commit into main
…able CURIEs

Closes #12.

Changes:
- New `validate-text-file` CLI subcommand that reads a text/markdown file, extracts CURIE+label pairs via a user-specified regex (default: `@term CURIE "label"`), resolves each CURIE via OAK, and reports label mismatches or unresolvable identifiers. Supports --regex, --curie-group, --label-group, --strict, --config, --no-cache, --cache-dir, and --verbose options.
- New `EnumValidator.validate_curie_label_pairs()` method that encapsulates the CURIE+label validation logic independently of a LinkML schema.
- Bug fix: `--strict` now also treats unresolvable CURIEs with unconfigured prefixes as errors in both `validate-schema` and `validate-text-file` (previously they silently passed as INFO).
- 11 new tests covering: valid terms, label mismatch, unresolvable CURIEs (configured and unconfigured prefixes), strict mode, custom regex, verbose output, no-matches warning, invalid regex, and the --strict fix for validate-schema.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes #12.
Summary
- `validate-text-file` subcommand: validates ontology term CURIEs and labels embedded in a text/markdown file using a regex, without requiring an intermediate LinkML schema.
- `EnumValidator.validate_curie_label_pairs()` method: reusable core logic for CURIE+label pair validation, independent of a schema.
- `--strict` now errors on unresolvable CURIEs with unconfigured prefixes: previously these silently passed as `INFO`; with `--strict` they are now `ERROR`. Affects both `validate-schema` and `validate-text-file`.

Usage
Example markdown document:
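The sketch below embeds a small illustrative document in the default `@term CURIE "label"` shape and shows how CURIE+label pairs could be pulled out of it. The regex, the sample text, and the `extract_pairs` helper are assumptions made for illustration; the tool's actual default pattern and internals are not shown in this description. The `curie_group` and `label_group` parameters mirror the roles of `--curie-group` and `--label-group`.

```python
import re
from typing import Iterator, Tuple

# Hypothetical pattern for the `@term CURIE "label"` shape; the tool's real
# default regex is not given in this PR description.
DEFAULT_PATTERN = r'@term\s+(\S+)\s+"([^"]*)"'

# Illustrative document only.
SAMPLE_DOC = '''\
# Methods

Samples were annotated with @term GO:0008150 "biological_process"
and @term CL:0000540 "neuron".
'''


def extract_pairs(
    text: str,
    pattern: str = DEFAULT_PATTERN,
    curie_group: int = 1,
    label_group: int = 2,
) -> Iterator[Tuple[str, str]]:
    """Yield (curie, label) for every regex match in the text."""
    for match in re.finditer(pattern, text):
        yield match.group(curie_group), match.group(label_group)


pairs = list(extract_pairs(SAMPLE_DOC))

# A custom regex where the label comes first, so the group numbers are swapped.
swapped = list(
    extract_pairs(
        '"neuron" (CL:0000540)',
        pattern=r'"([^"]*)"\s+\((\S+?)\)',
        curie_group=2,
        label_group=1,
    )
)
```

Each extracted pair would then be resolved against the ontology and its label compared; that step depends on OAK and is left out here.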
Behaviour
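The handling of unresolvable CURIEs described in the summary can be sketched as a small decision function. This is an illustration of the stated rules, not the PR's code: `unresolvable_severity` and `exit_code` are hypothetical names, and only the unresolvable-CURIE case is modelled.

```python
from typing import Iterable


def unresolvable_severity(prefix_configured: bool, strict: bool) -> str:
    """Severity for a CURIE that could not be resolved.

    - Configured prefix: the ontology is known, so a missing term is an ERROR.
    - Unconfigured prefix: INFO by default, ERROR when --strict is set.
    """
    if prefix_configured:
        return "ERROR"
    return "ERROR" if strict else "INFO"


def exit_code(severities: Iterable[str]) -> int:
    """Exit 1 if any finding is an ERROR, otherwise 0."""
    return 1 if "ERROR" in set(severities) else 0
```

Under these assumptions, an unconfigured prefix without `--strict` yields `INFO` and exit 0, while the same input with `--strict` yields `ERROR` and exit 1, matching the two `unresolvable_unconfigured` tests in the test plan.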
Test plan
- `test_validate_text_file_help`: help text contains all new flags
- `test_validate_text_file_valid_terms`: 3 valid terms → exit 0
- `test_validate_text_file_label_mismatch`: wrong label → exit 1 with mismatch message
- `test_validate_text_file_unresolvable_configured_prefix`: nonexistent CURIE in configured ontology → exit 1
- `test_validate_text_file_unresolvable_unconfigured_no_strict`: unconfigured prefix without `--strict` → exit 0 (skip)
- `test_validate_text_file_unresolvable_unconfigured_strict`: unconfigured prefix with `--strict` → exit 1
- `test_validate_text_file_custom_regex`: custom regex with swapped group order → exit 0
- `test_validate_text_file_no_matches`: file with no `@term` lines → exit 0 with warning
- `test_validate_text_file_invalid_regex`: bad regex → exit 1 with message
- `test_validate_text_file_verbose`: verbose mode shows each CURIE
- `test_validate_schema_strict_unresolvable_unconfigured`: `--strict` fix for `validate-schema`

All 24 CLI tests pass, along with the full existing test suite (57 tests total).
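The two edge cases in the test plan, an invalid regex and a file with no matches, could be handled along the following lines. This is a minimal sketch: `check_pattern` is a hypothetical helper and the message wording is an assumption, chosen only to show the exit codes listed above.

```python
import re
from typing import Tuple


def check_pattern(text: str, pattern: str) -> Tuple[int, str]:
    """Return (exit_code, message) for the regex-related edge cases."""
    try:
        compiled = re.compile(pattern)
    except re.error as exc:
        # Bad regex: fail fast with exit code 1 and an explanatory message.
        return 1, f"Invalid regex: {exc}"

    match_count = sum(1 for _ in compiled.finditer(text))
    if match_count == 0:
        # Nothing to validate is a warning, not a failure.
        return 0, "Warning: no matches found"
    return 0, f"{match_count} match(es) found"
```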
🤖 Generated with Claude Code