python/tests: Add pytest-based tests #51

Merged
jplevyak merged 2 commits into jplevyak:main from JYOuyang:pytest-basic-arithmetic on Apr 16, 2026
@JYOuyang
Collaborator

@JYOuyang JYOuyang commented Mar 6, 2026

Summary

  • Adds test_basic_arithmetic.py, a pytest conversion of the existing test.py
  • Uses the modules= parameter to explicitly pass grammar actions, avoiding fragile inspect.currentframe() introspection
  • Uses pytest's tmp_path fixture to isolate generated grammar files
  • Existing test scripts are left untouched

Plan for remaining tests

| Existing file | Planned pytest file | Coverage |
| --- | --- | --- |
| test.py | test_basic_arithmetic.py | Done (this PR) |
| test2.py | test_expression_eval.py | Precedence, ambiguity resolution, whitespace handling |
| test3.py | test_expression_eval.py | Parenthesized expressions, nodes argument; needs pytest-forked or subprocess isolation due to SWIG cleanup SEGV |
| test4.py | test_speculative_parsing.py | spec argument, dparser.Reject, syntax_error_fn |
| test5.py | test_speculative_parsing.py | spec_only argument, speculative-only actions |
| test6.py | test_string_replace.py | s argument, getStringLeft() |
| test7.py | test_parse_nodes.py | nodes/this arguments, start_loc, buf, end, end_skip |

Each file will follow the same pattern: d_* functions at module scope, a parser fixture using modules=sys.modules[__name__] and tmp_path, and standard assert statements.
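
To illustrate why passing the module explicitly is more predictable than frame introspection, here is a minimal sketch of collecting d_* functions from a module object. The helper `collect_actions` and the stand-in module `grammar_actions` are hypothetical names made up for this example; the real dparser.Parser may gather actions differently, and in an actual test file the module would be sys.modules[__name__].

```python
import types

def collect_actions(module):
    """Gather d_* functions from an explicitly supplied module, in definition order.

    Hypothetical helper: models the idea behind modules=, not dparser's own code.
    """
    funcs = [
        value
        for name, value in vars(module).items()
        if name.startswith("d_") and isinstance(value, types.FunctionType)
    ]
    return sorted(funcs, key=lambda f: f.__code__.co_firstlineno)

def d_S(t):
    '''S : d '+' d'''
    return t[0] + t[2]

def d_number(t):
    '''d : "[0-9]+" '''
    return int(t[0])

# Stand-in for a test module; a real test would pass sys.modules[__name__].
grammar = types.ModuleType("grammar_actions")
grammar.d_number = d_number   # attribute order is deliberately shuffled
grammar.d_S = d_S
grammar.helper = len          # not a d_* function, so it is ignored

names = [f.__name__ for f in collect_actions(grammar)]
print(names)
```

Because the caller names the module, the result does not depend on which stack frame happens to construct the parser, which is the property the fixture pattern relies on.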

Test plan

  • python -m pytest python/tests/test_basic_arithmetic.py -v passes locally

🤖 Generated with Claude Code

@gemini-code-assist
Contributor

Summary of Changes

This pull request initiates the migration of existing test scripts to a pytest framework. The primary goal is to enhance test structure, improve isolation, and simplify future test development by converting the test.py script into a pytest-compatible format. This change sets a pattern for converting other legacy tests, making the testing suite more robust and maintainable.

Highlights

  • Test Conversion: Converted the existing test.py script to a new pytest-based test file, test_basic_arithmetic.py, for basic arithmetic parsing.
  • Grammar Action Handling: Implemented explicit passing of grammar actions using the modules= parameter in dparser.Parser, replacing fragile inspect.currentframe() introspection.
  • Test Isolation: Utilized pytest's tmp_path fixture to ensure generated grammar files are isolated for each test run.

Changelog

  • python/tests/test_basic_arithmetic.py
    • Added new pytest-based tests for basic arithmetic parsing.

Activity

  • The author confirmed that python -m pytest python/tests/test_basic_arithmetic.py -v passes locally.
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces pytest-based tests, which is a significant improvement for the project's testing infrastructure. The use of pytest features like tmp_path and the explicit passing of modules are commendable. I've provided a couple of suggestions to further enhance the new test code by improving its readability and effectiveness.

Comment thread python/tests/test_basic_arithmetic.py
Comment thread python/tests/test_basic_arithmetic.py
@JYOuyang JYOuyang force-pushed the pytest-basic-arithmetic branch from 85888b7 to 138bb99 Compare April 16, 2026 20:13
@JYOuyang
Collaborator Author

Okay. @jplevyak has conditionally approved this; James to have Claude do the needful later. 👍

@JYOuyang JYOuyang force-pushed the pytest-basic-arithmetic branch from 138bb99 to 46e1d66 Compare April 16, 2026 21:56
@JYOuyang
Collaborator Author

Refresh: diagnosis & fix for the Python binding

A quick overview of what we landed, since the prior write-ups were spread across
a couple of documents.

The problem (primary bug)

Upstream commit bb27184 ("Fix warnings and add SAFE_RUST.md") changed the
binary encoding of D_Reduction.action_index in write_tables.c:1391 from a
flat sequential counter to a composite key:

r->prod->internal ? -1 : (r->prod->index * 10000 + r->index)

The Cython binding in python/dparser.pyx still assumed the flat encoding and
did p.actions[action_index]. With the composite encoding the index is
e.g. 10001 / 20002 for a 2-action grammar, so the guard
action_index < len(p.actions) always fails, action_tuple stays None, and
the parser falls through to returning the raw children tree.
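
The mismatch can be reproduced with a few lines of plain Python. This is a model of the arithmetic only, not the C or Cython source: `composite_action_index` mirrors the expression quoted above, `flat_lookup` is a hypothetical stand-in for the old guard, and the action names are placeholders.

```python
COMPOSITE_STRIDE = 10000  # multiplier taken from the expression quoted above

def composite_action_index(prod_index, rule_index, internal=False):
    """Mirror of: r->prod->internal ? -1 : (r->prod->index * 10000 + r->index)."""
    return -1 if internal else prod_index * COMPOSITE_STRIDE + rule_index

def flat_lookup(actions, action_index):
    """Old-style lookup that assumes a small sequential counter.

    The lower bound is added here only so that -1 does not wrap around
    to the last list element in Python.
    """
    if 0 <= action_index < len(actions):
        return actions[action_index]
    return None  # guard fails, so no user action is found

actions = ["d_S", "d_number"]  # placeholder names for a 2-action grammar
indices = [composite_action_index(1, 1), composite_action_index(2, 2)]
lookups = [flat_lookup(actions, i) for i in indices]
print(indices, lookups)
```

Both composite indices are far larger than the length of the action list, so every lookup comes back empty, which is the behaviour described above.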

Concretely, for the canonical test grammar:

from dparser import Parser

def d_S(t):      '''S : d '+' d'''    ; return t[0] + t[2]
def d_number(t): '''d : "[0-9]+" '''  ; return int(t[0])
Parser().parse('87+5').getStructure()
  • Expected: 92
  • Observed pre-fix: [['87'], '+', ['5']] (no user actions fire at all)

Why we fixed it on the Cython side (not by reverting write_tables.c)

Reverting the encoding to r->action_index is the smaller diff, but the Rust
binding now depends on the composite prod_index * 10000 + rule_index form, so
a revert would just move the breakage. We patched the Python side instead.

The fix (commit 46e1d66)

In python/dparser.pyx:

  1. Composite-aware action resolution in my_action. Instead of indexing
    p.actions directly, we resolve the action by the nonterminal's name,
    which we read from the parser tables:

    symbol_index = pnode.reduction.symbol
    bt = <BinaryTables*>p.tables.getTables()
    symbol_name_c = bt.parser_tables_gram.symbols[symbol_index].name
    nonterm_name = symbol_name_c.decode('utf-8')
    pending = p._pending.get(nonterm_name)
    if pending:
        action_tuple = pending.pop(0)
        p._action_cache[action_index] = action_tuple

    Parser.__init__ now maintains self._pending: dict[str, list[tuple]]
    keyed by nonterminal name (populated in declaration order from the
    d_<name> docstrings) and self._action_cache: dict[int, tuple] keyed by
    the composite action_index. On first encounter of a given composite
    index, we pop the next pending action for that nonterm and cache it;
    subsequent reductions for the same rule hit the cache.

  2. Secondary gate fix. The speculative/final gate was inverted:

    # before — ran default actions in the speculative pass only
    if takes_speculative == -1 and not speculative:
        return 0
    # after — `-1` means final-only, so skip when we *are* speculative
    if takes_speculative == -1 and speculative:
        return 0

    This was latent while the primary bug kept us from ever reaching the gate;
    it became load-bearing once the resolver started returning real tuples.
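
The pending-list and cache interplay from item 1 can be modelled in pure Python. This is a sketch under assumptions: the class `ActionResolver` is a hypothetical name, actions are represented by strings rather than (function, arg_types, speculative) tuples, and the composite indices are the example values from earlier in this comment.

```python
from collections import defaultdict

class ActionResolver:
    """Pure-Python model of the name-keyed pending queue plus composite-index cache."""

    def __init__(self, declared):
        # declared: (nonterminal_name, action) pairs in declaration order
        self._pending = defaultdict(list)
        for nonterm_name, action in declared:
            self._pending[nonterm_name].append(action)
        self._action_cache = {}

    def resolve(self, action_index, nonterm_name):
        cached = self._action_cache.get(action_index)
        if cached is not None:
            return cached                      # later reductions of the same rule
        pending = self._pending.get(nonterm_name)
        if pending:
            action = pending.pop(0)            # first encounter: take next declared action
            self._action_cache[action_index] = action
            return action
        return None                            # no user action left for this nonterminal

resolver = ActionResolver([("S", "d_S"), ("d", "d_number")])
first = resolver.resolve(20002, "d")   # first encounter pops from the pending list
second = resolver.resolve(20002, "d")  # second encounter is served from the cache
third = resolver.resolve(10001, "S")
print(first, second, third)
```

With one rule per nonterminal, as in the canonical grammar, the first-encounter order cannot go wrong; the scheme only becomes order-sensitive when a nonterminal has several rules.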

Regression test (test_speculative_gate.py)

The primary bug is already covered by test_basic_arithmetic.py (a wrong
action_index lookup means the parse result is the raw tree, not 92).

The secondary gate is trickier to observe because the inversion still returns
the correct numeric answer — actions just fire on the wrong pass. The new
test uses the spec_only argument binding (which hands the action the
speculative flag and is governed by the same gate) to directly assert that
default actions only fire in the final pass:

def d_S(t, spec_only):
    '''S : d '+' d'''
    _calls.append(('S', spec_only))
    return t[0] + t[2]

def test_default_actions_run_in_final_pass_only(parser):
    assert parser.parse('87+5').getStructure() == 92
    assert all(flag == 0 for _, flag in _calls), \
        f'expected every action to fire in the final pass (spec=0), got: {_calls}'

Against the buggy gate, this fails with got: [('d', 1), ('d', 1), ('S', 1)]
(all speculative). With the fix, every call has flag == 0.
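
The two gate variants can be compared without the parser. The sketch below assumes a simplified run in which the same three reductions are offered once in a speculative pass and once in the final pass; `gate_skips` and `run_passes` are hypothetical names, and only the `takes_speculative == -1` branch quoted above is modelled.

```python
def gate_skips(takes_speculative, speculative, fixed):
    """Return True when the action should be skipped for this pass."""
    if takes_speculative != -1:
        return False  # other values are outside this sketch
    # fixed gate: skip while speculative; buggy gate: skip while final
    return bool(speculative) if fixed else not speculative

def run_passes(fixed):
    """Offer the same reductions in a speculative pass (1), then a final pass (0)."""
    calls = []
    for speculative in (1, 0):
        for name in ("d", "d", "S"):
            if not gate_skips(-1, speculative, fixed):
                calls.append((name, speculative))
    return calls

buggy_calls = run_passes(fixed=False)
fixed_calls = run_passes(fixed=True)
print(buggy_calls)
print(fixed_calls)
```

In this model the buggy gate records every call with flag 1 and the fixed gate records every call with flag 0, matching the failure message and the passing case described above.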

Verification

cd python
rm -f tests/*.dat tests/*.md5
uv pip install -e .
uv run pytest tests/

All four tests pass: test_addition, test_addition_with_skip_space,
test_partial_parse_with_offset_and_skip, test_default_actions_run_in_final_pass_only.

@JYOuyang
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the action resolution logic in dparser.pyx to handle composite action indices using a new caching and pending action mechanism. It also fixes a bug where non-speculative actions were incorrectly skipped during the final pass and introduces regression tests for basic arithmetic and the speculative gate. Feedback focuses on the potential for incorrect action mapping due to the non-deterministic order of reductions when using pop(0), performance overhead in the cache for rules without actions, and the fragility of extracting nonterminal names from docstrings.

Comment thread python/dparser.pyx Outdated
Comment on lines +443 to +446
pending = p._pending.get(nonterm_name)
if pending:
    action_tuple = pending.pop(0)
    p._action_cache[action_index] = action_tuple
Contributor

critical

The logic for resolving actions using pending.pop(0) is non-deterministic and potentially incorrect. It assumes that the order in which rules are first encountered during parsing matches the order in which functions are declared in the Python module. However, the order of reductions is determined by the input string and the parser's state machine, not the grammar declaration order. If a nonterminal has multiple rules and they are encountered out of order, the wrong Python action will be associated with the rule and cached. A more robust mapping based on the actual rule index within the production (e.g., using action_index % 10000) should be established during parser initialization.
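
The ordering concern can be shown with a small model. It rests on assumptions that the thread does not confirm: that `action_index % 10000` yields a rule index counted from 0 within one nonterminal, and that this index matches declaration order. The function names, the action names, and the production index 3 are all hypothetical.

```python
STRIDE = 10000

def resolve_by_encounter(pending, cache, action_index):
    """pop(0) scheme: the first rule seen gets the first declared action."""
    if action_index not in cache and pending:
        cache[action_index] = pending.pop(0)
    return cache.get(action_index)

def resolve_by_rule_index(declared, action_index):
    """Alternative: decode the rule index from the composite key (assumed layout)."""
    rule_index = action_index % STRIDE
    return declared[rule_index] if rule_index < len(declared) else None

# One nonterminal with two rules, in declaration order: rule 0, then rule 1.
declared = ["d_add", "d_literal"]
# An input such as "1+2" reduces the literal rule (index 1) before the addition rule.
encounter_order = [3 * STRIDE + 1, 3 * STRIDE + 0]

pending, cache = list(declared), {}
by_encounter = [resolve_by_encounter(pending, cache, i) for i in encounter_order]
by_rule = [resolve_by_rule_index(declared, i) for i in encounter_order]
print(by_encounter)
print(by_rule)
```

Under these assumptions the encounter-order scheme hands the addition action to the literal rule, while the index-based lookup is unaffected by the order of reductions.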

Comment thread python/dparser.pyx Outdated
Comment on lines +428 to +431
cached = p._action_cache.get(action_index)
if cached is not None:
    action_tuple = <tuple>cached
if action_tuple is None:
Contributor

medium

The cache lookup does not distinguish between a missing key and a key associated with None. If a rule has no associated Python action, action_tuple remains None, and the resolution logic (including string decoding and dictionary lookups) will be re-executed every time that action_index is encountered. Consider using a sentinel value or checking for key existence in the cache to improve performance for rules without actions.
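
A minimal sketch of the sentinel approach, in plain Python rather than Cython. `CountingResolver`, `_MISSING`, and the nonterminal name "ws" are hypothetical; the counter exists only to make the repeated work visible.

```python
_MISSING = object()  # unique marker meaning "this index was never resolved"

class CountingResolver:
    def __init__(self, actions_by_name):
        self.actions_by_name = actions_by_name
        self.cache = {}
        self.slow_lookups = 0

    def _slow_resolve(self, nonterm_name):
        # Stands in for the name decoding and dictionary lookups in the binding.
        self.slow_lookups += 1
        return self.actions_by_name.get(nonterm_name)

    def resolve(self, action_index, nonterm_name):
        cached = self.cache.get(action_index, _MISSING)
        if cached is _MISSING:
            cached = self._slow_resolve(nonterm_name)
            self.cache[action_index] = cached  # a None result is cached as well
        return cached

resolver = CountingResolver({"S": "d_S"})
results = [resolver.resolve(40000, "ws") for _ in range(3)]
print(results, resolver.slow_lookups)
```

Checking `action_index in self.cache` would work equally well; the sentinel just keeps it to a single dictionary lookup.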

Comment thread python/dparser.pyx Outdated
- self.actions.append((f, arg_types, speculative))
+ action_tuple = (f, arg_types, speculative)
+ self.actions.append(action_tuple)
+ nonterm_name = f.__doc__.strip().split(':', 1)[0].strip()
Contributor

medium

Extracting the nonterminal name by splitting the docstring on the first colon is fragile. It may fail or produce incorrect results if the docstring contains comments, leading whitespace, or multiple rules before the first colon. A more robust parsing of the docstring or an explicit way to associate functions with nonterminals would be more reliable.
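
One possible hardening, sketched in plain Python: accept the rule head only when the docstring starts with an identifier followed by a colon, and return None otherwise instead of a garbage name. The helper `nonterminal_name` and the identifier pattern are assumptions for illustration; the grammar syntax dparser actually accepts in docstrings may be broader than this.

```python
import re

# Leading whitespace, then an identifier, then a colon (assumed rule-head shape).
_RULE_HEAD = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s*:")

def nonterminal_name(docstring):
    """Return the rule's left-hand side, or None when the docstring has no clear head."""
    if not docstring:
        return None
    match = _RULE_HEAD.match(docstring)
    return match.group(1) if match else None

def naive_name(docstring):
    """The split-based extraction from the diff, kept for comparison."""
    return docstring.strip().split(":", 1)[0].strip()

samples = ["S : d '+' d", '\n    d : "[0-9]+" ', "'x:y' stuff", "not a rule"]
robust = [nonterminal_name(s) for s in samples]
naive = [naive_name(s) for s in samples]
print(robust)
print(naive)
```

Returning None makes a malformed docstring detectable at parser construction time rather than silently registering an action under the wrong name.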

JYOuyang and others added 2 commits April 16, 2026 16:05
Converts test.py to pytest using explicit `modules=` parameter
to avoid frame introspection, `tmp_path` for generated grammar
files, and standard assertions.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@JYOuyang JYOuyang force-pushed the pytest-basic-arithmetic branch from 46e1d66 to 756c658 Compare April 16, 2026 23:07
@jplevyak jplevyak merged commit ff0f0fb into jplevyak:main Apr 16, 2026
@JYOuyang JYOuyang deleted the pytest-basic-arithmetic branch April 16, 2026 23:20