Update validation for graphml parser + tests #151

Open
Brqndon1 wants to merge 2 commits into lab-v2:pyreason_4_changes from Brqndon1:graphml_parser_validation

Brqndon1 (Contributor) commented May 8, 2026

Summary

Added input validation to parse_graph_attributes in GraphmlParser, mirroring the validation style of fact_parser.py but emitting warnings.warn(...) and skipping the offending attribute (or whole node/edge) instead of raising.

Validation rules implemented

  1. Node/edge ID validation against _COMPONENT_RE. One warning per bad node/edge, all attributes on it skipped (not one warning per attribute).
  2. Attribute key validation against _PREDICATE_RE.
  3. Bound parsing: fixed the int(...) → float(...) bug in the comma-separated interval string parser. Added range and order checks (0 <= lower <= 1, 0 <= upper <= 1, lower <= upper).
  4. Single-numeric out-of-range values (and out-of-range numeric strings like "1.5") now warn-and-skip instead of silently falling through to a categorical label.
  5. Composed key-value labels are validated against _PREDICATE_RE — values containing spaces, commas, parens etc. now warn-and-skip rather than producing malformed labels.
  6. Empty / whitespace-only keys and values are rejected.
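
For reviewers who want a concrete picture of the warn-and-skip behaviour described in rules 2, 3 and 6, here is a minimal sketch. It is not the code in this PR: the helper names `parse_interval` and `is_valid_key` are hypothetical, and `ASSUMED_PREDICATE_RE` is a stand-in pattern, since the actual `_PREDICATE_RE` is not shown in this description.

```python
import re
import warnings

# Assumption: the real _PREDICATE_RE is not shown here; this is a stand-in.
ASSUMED_PREDICATE_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def parse_interval(raw):
    """Return (lower, upper) for a string like "0.2,0.8", or None after warning."""
    parts = [p.strip() for p in raw.split(",")]
    if len(parts) != 2:
        warnings.warn(f"Skipping interval {raw!r}: expected 'lower,upper'")
        return None
    try:
        # float(...), not int(...): int("0.2") raises ValueError.
        lower, upper = float(parts[0]), float(parts[1])
    except ValueError:
        warnings.warn(f"Skipping interval {raw!r}: bounds are not numeric")
        return None
    # Range check (this also rejects NaN and infinities).
    if not (0 <= lower <= 1 and 0 <= upper <= 1):
        warnings.warn(f"Skipping interval {raw!r}: bounds must lie in [0, 1]")
        return None
    # Order check.
    if lower > upper:
        warnings.warn(f"Skipping interval {raw!r}: lower exceeds upper")
        return None
    return lower, upper


def is_valid_key(key):
    """Return True if the attribute key is usable; otherwise warn and return False."""
    if not key or not key.strip():
        warnings.warn("Skipping attribute: empty or whitespace-only key")
        return False
    if not ASSUMED_PREDICATE_RE.match(key):
        warnings.warn(f"Skipping attribute {key!r}: key does not match the predicate pattern")
        return False
    return True
```

Returning `None` or `False` after warning lets the caller skip the offending attribute and keep parsing the rest of the graph, which is the difference from the raising style of fact_parser.py.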

Tests

New file tests/unit/dont_disable_jit/test_graphml_parser.py mirroring the structure of test_fact_parser.py:

  • TestValidGraphmlParsing (10 tests) — numeric in range, single-string numeric, interval string (regression), edge with numeric, edge with interval, categorical labels, boundary values 0 / 1 / 0.0 / 1.0, multiple valid attributes, static_facts=True propagation.
  • TestInvalidGraphmlParsing (12 tests) — every bullet from the task spec's "Invalid" list: bad keys, empty keys, out-of-range numerics (string and type), bad node/edge IDs, all four interval-string failure modes, composed-label regex failures, whitespace values.
  • TestEdgeCasesConditions (4 tests) — empty graph, node with no attributes, label accumulation across multiple nodes, mixed-case keys preserved.
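
As an illustration of the test style listed above (not a copy of the tests in this PR), the sketch below checks boundary values and out-of-range values using only the standard library. `parse_numeric` is a hypothetical stand-in for the parser's single-numeric handling, and the class name is invented for the example; the real tests exercise `parse_graph_attributes` on a graph.

```python
import warnings


def parse_numeric(value):
    """Hypothetical stand-in: return the value as a float in [0, 1], else None."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        # Not numeric at all; a real parser would treat this as a categorical label.
        return None
    if not 0 <= number <= 1:
        warnings.warn(f"Skipping out-of-range value {value!r}")
        return None
    return number


class TestNumericValues:
    def test_boundary_values_accepted(self):
        for raw in (0, 1, 0.0, 1.0, "0", "1"):
            with warnings.catch_warnings(record=True) as caught:
                warnings.simplefilter("always")
                assert parse_numeric(raw) == float(raw)
            # Valid input must not emit any warning.
            assert caught == []

    def test_out_of_range_warns_and_skips(self):
        for raw in (1.5, "1.5", -0.1):
            with warnings.catch_warnings(record=True) as caught:
                warnings.simplefilter("always")
                assert parse_numeric(raw) is None
            # Exactly one warning per rejected value.
            assert len(caught) == 1
```

Recording warnings with `warnings.catch_warnings(record=True)` makes it possible to assert both that a warning fired and that valid input stays silent; under pytest, `pytest.warns` is a common alternative for the first of those checks.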

@ColtonPayne ColtonPayne changed the base branch from main to pyreason_4_changes May 11, 2026 17:53

2 participants