Context
schemata v0.3.2 introduced 9 new regex patterns that the classifier cannot map to a native op, so they fall back to regex. 2 are already covered by #38 (hex_alternation, base64_2pad). The remaining 7 need new native ops.
Unclassified patterns
From the classifier-gate CI failure on #39:
| # | Pattern | Proposed op | Description |
| --- | --- | --- | --- |
| 1 | `^[a-z][a-z0-9]*$` | `identifier` | lowercase identifier |
| 2 | `^[A-Z][a-zA-Z0-9]*$` | `identifier` | PascalCase identifier |
| 3 | `^[a-z][a-z0-9-]*$` | `identifier` | lowercase kebab identifier |
| 4 | `^!?[a-z][a-z0-9]*$` | `identifier` | optional ! prefix + lowercase |
| 5 | `^!?[0-9]+$` | `optional_prefix_digits` | optional ! prefix + digits |
| 6 | `^[a-z_]+( [a-z_]+)*$` | `space_separated_charset` | space-separated lowercase+underscore tokens |
| 7 | `^[A-Za-z][A-Za-z0-9+.-]*://` | `uri_scheme` | URI scheme prefix (no end anchor) |
Approach
Patterns 1-4 share a common shape: ^[optional_prefix][first_char_class][rest_char_class]*$. A single identifier op with configurable fields could cover all four.
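A rough sketch of what such a configurable op might look like, to make the shared shape concrete. The `IdentifierOp` class, its field names, and the `matches` method are hypothetical illustrations, not the project's actual op interface. It also assumes the optional prefix character never appears in the first-character class, which holds for patterns 1-4.

```python
import string
from dataclasses import dataclass

LOWER = frozenset(string.ascii_lowercase)
UPPER = frozenset(string.ascii_uppercase)
DIGITS = frozenset(string.digits)


@dataclass(frozen=True)
class IdentifierOp:
    """Hypothetical native op for ^prefix?[first][rest]*$ patterns."""

    first: frozenset           # characters allowed in position 0
    rest: frozenset            # characters allowed in every later position
    optional_prefix: str = ""  # literal that may precede the identifier

    def matches(self, s: str) -> bool:
        # Strip the prefix at most once. This is only equivalent to the
        # regex when the prefix is not itself in `first` (assumption).
        if self.optional_prefix and s.startswith(self.optional_prefix):
            s = s[len(self.optional_prefix):]
        # The body needs at least one character from `first`.
        if not s or s[0] not in self.first:
            return False
        return all(c in self.rest for c in s[1:])


# One configuration per pattern in the table above.
P1 = IdentifierOp(LOWER, LOWER | DIGITS)                       # ^[a-z][a-z0-9]*$
P2 = IdentifierOp(UPPER, LOWER | UPPER | DIGITS)               # ^[A-Z][a-zA-Z0-9]*$
P3 = IdentifierOp(LOWER, LOWER | DIGITS | {"-"})               # ^[a-z][a-z0-9-]*$
P4 = IdentifierOp(LOWER, LOWER | DIGITS, optional_prefix="!")  # ^!?[a-z][a-z0-9]*$
```

With this shape, the four patterns differ only in configuration, so the classifier would need to recognise one structure and extract three fields from it.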
Pattern 5 is similar but digit-only body.
Pattern 6 extends the existing space_separated_tokens concept.
Pattern 7 is a URI scheme prefix check (unanchored end).
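For patterns 5-7, the checks could be sketched as plain string functions. The function names reuse the proposed op names from the table, but the signatures and default parameters are assumptions made for illustration, not an existing API.

```python
import string

DIGITS = frozenset(string.digits)
ALPHA = frozenset(string.ascii_letters)
LOWER_UNDERSCORE = frozenset(string.ascii_lowercase) | {"_"}
SCHEME_REST = ALPHA | DIGITS | frozenset("+.-")


def optional_prefix_digits(s: str, prefix: str = "!") -> bool:
    """Pattern 5: ^!?[0-9]+$ (ASCII digits only, at least one)."""
    body = s[len(prefix):] if s.startswith(prefix) else s
    return bool(body) and all(c in DIGITS for c in body)


def space_separated_charset(s: str, charset: frozenset = LOWER_UNDERSCORE) -> bool:
    """Pattern 6: ^[a-z_]+( [a-z_]+)*$ (single spaces, no empty tokens)."""
    # Splitting on a literal space yields an empty token for leading,
    # trailing, or doubled spaces, and each of those must be rejected.
    return all(bool(t) and all(c in charset for c in t) for t in s.split(" "))


def uri_scheme(s: str) -> bool:
    """Pattern 7: ^[A-Za-z][A-Za-z0-9+.-]*:// (prefix check, no end anchor)."""
    head, sep, _ = s.partition("://")
    if not sep or not head:
        return False
    return head[0] in ALPHA and all(c in SCHEME_REST for c in head[1:])
```

Explicit ASCII sets are used instead of `str.isdigit()` or `str.isalpha()`, since those also accept non-ASCII characters that the original character classes would reject.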
Blocked by
Related