python/tests: Add pytest-based tests by JYOuyang · Pull Request #51 · jplevyak/dparser

JYOuyang · 2026-03-06T23:17:53Z

Summary

- Adds test_basic_arithmetic.py, a pytest conversion of the existing test.py
- Uses the modules= parameter to explicitly pass grammar actions, avoiding fragile inspect.currentframe() introspection
- Uses pytest's tmp_path fixture to isolate generated grammar files
- Existing test scripts are left untouched

Plan for remaining tests

| Existing file | Planned pytest file | Coverage |
| --- | --- | --- |
| `test.py` | `test_basic_arithmetic.py` | Done (this PR) |
| `test2.py` | `test_expression_eval.py` | Precedence, ambiguity resolution, whitespace handling |
| `test3.py` | `test_expression_eval.py` | Parenthesized expressions, `nodes` argument; needs `pytest-forked` or subprocess isolation due to SWIG cleanup SEGV |
| `test4.py` | `test_speculative_parsing.py` | `spec` argument, `dparser.Reject`, `syntax_error_fn` |
| `test5.py` | `test_speculative_parsing.py` | `spec_only` argument, speculative-only actions |
| `test6.py` | `test_string_replace.py` | `s` argument, `getStringLeft()` |
| `test7.py` | `test_parse_nodes.py` | `nodes`/`this` arguments, `start_loc`, `buf`, `end`, `end_skip` |

Each file will follow the same pattern: d_* functions at module scope, a parser fixture using modules=sys.modules[__name__] and tmp_path, and standard assert statements.

Test plan

python -m pytest python/tests/test_basic_arithmetic.py -v passes locally

🤖 Generated with Claude Code

gemini-code-assist · 2026-03-06T23:18:54Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request initiates the migration of existing test scripts to a pytest framework. The primary goal is to enhance test structure, improve isolation, and simplify future test development by converting the test.py script into a pytest-compatible format. This change sets a pattern for converting other legacy tests, making the testing suite more robust and maintainable.

Highlights

- Test Conversion: Converted the existing test.py script to a new pytest-based test file, test_basic_arithmetic.py, for basic arithmetic parsing.
- Grammar Action Handling: Implemented explicit passing of grammar actions using the modules= parameter in dparser.Parser, replacing fragile inspect.currentframe() introspection.
- Test Isolation: Utilized pytest's tmp_path fixture to ensure generated grammar files are isolated for each test run.

Changelog

python/tests/test_basic_arithmetic.py
- Added new pytest-based tests for basic arithmetic parsing.

Activity

The author confirmed that python -m pytest python/tests/test_basic_arithmetic.py -v passes locally.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces pytest-based tests, which is a significant improvement for the project's testing infrastructure. The use of pytest features like tmp_path and the explicit passing of modules are commendable. I've provided a couple of suggestions to further enhance the new test code by improving its readability and effectiveness.

JYOuyang · 2026-04-16T20:14:56Z

Okay. @jplevyak has conditionally approved this; James to have Claude do the needful later. 👍

JYOuyang · 2026-04-16T21:59:59Z

Refresh: diagnosis & fix for the Python binding

A quick overview of what we landed, since the prior write-ups were spread across
a couple of documents.

The problem (primary bug)

Upstream commit bb27184 ("Fix warnings and add SAFE_RUST.md") changed the
binary encoding of D_Reduction.action_index in write_tables.c:1391 from a
flat sequential counter to a composite key:

r->prod->internal ? -1 : (r->prod->index * 10000 + r->index)

The Cython binding in python/dparser.pyx still assumed the flat encoding and
indexed p.actions[action_index] directly. With the composite encoding the
index is, for example, 10001 or 20002 for a 2-action grammar, so the guard
action_index < len(p.actions) always fails, action_tuple stays None, and
the parser falls through to returning the raw children tree.
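
The mismatch can be reproduced in plain Python. This is a minimal sketch, not the binding code: `composite_action_index` mirrors the C expression quoted above, while `flat_lookup` is a hypothetical stand-in for the old guarded `p.actions[action_index]` access (the lower bound check is added here only to keep the sketch safe for `-1`).

```python
def composite_action_index(prod_index, rule_index, internal=False):
    """Mirror of the C expression: -1 for internal productions, else a composite key."""
    return -1 if internal else prod_index * 10000 + rule_index


def flat_lookup(actions, action_index):
    """Hypothetical stand-in for the old lookup, which assumed a flat 0..n-1 index."""
    if 0 <= action_index < len(actions):
        return actions[action_index]
    return None  # guard fails, so no user action is selected


# Two user actions, as in the 2-action grammar mentioned above.
actions = ["d_S", "d_number"]
indices = [composite_action_index(1, 1), composite_action_index(2, 2)]
resolved = [flat_lookup(actions, i) for i in indices]
print(indices, resolved)
```

Both composite indices are far larger than `len(actions)`, so neither lookup selects an action.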

Concretely, for the canonical test grammar:

def d_S(t):     '''S : d '+' d'''   ; return t[0] + t[2]
def d_number(t):'''d : "[0-9]+" ''' ; return int(t[0])
Parser().parse('87+5').getStructure()

Expected: 92
Observed pre-fix: [['87'], '+', ['5']] (no user actions fire at all)

Why we fixed it on the Cython side (not by reverting `write_tables.c`)

Reverting the encoding to r->action_index is the smaller diff, but the Rust
binding now depends on the composite prod_index * 10000 + rule_index form, so
a revert would just move the breakage. We patched the Python side instead.

The fix (commit `46e1d66`)

In python/dparser.pyx:

Composite-aware action resolution in my_action. Instead of indexing
p.actions directly, we resolve the action by the nonterminal's name,
which we read from the parser tables:
```
symbol_index = pnode.reduction.symbol
bt = <BinaryTables*>p.tables.getTables()
symbol_name_c = bt.parser_tables_gram.symbols[symbol_index].name
nonterm_name = symbol_name_c.decode('utf-8')
pending = p._pending.get(nonterm_name)
if pending:
    action_tuple = pending.pop(0)
    p._action_cache[action_index] = action_tuple
```
Parser.__init__ now maintains self._pending: dict[str, list[tuple]]
keyed by nonterminal name (populated in declaration order from the
d_<name> docstrings) and self._action_cache: dict[int, tuple] keyed by
the composite action_index. On first encounter of a given composite
index, we pop the next pending action for that nonterm and cache it;
subsequent reductions for the same rule hit the cache.
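
The pending/cache scheme described above can be modeled in pure Python. This is a simplified sketch under assumptions: the class name `ActionResolver`, its `resolve` method, and the placeholder action tuples are hypothetical, and the real logic lives inline in the Cython `my_action` callback.

```python
class ActionResolver:
    """Simplified pure-Python model of the pending/cache scheme (not the Cython code)."""

    def __init__(self, declared):
        # declared: iterable of (nonterminal_name, action_tuple) in declaration order
        self._pending = {}        # nonterminal name -> list of unclaimed action tuples
        self._action_cache = {}   # composite action_index -> claimed action tuple
        for name, action_tuple in declared:
            self._pending.setdefault(name, []).append(action_tuple)

    def resolve(self, action_index, nonterm_name):
        cached = self._action_cache.get(action_index)
        if cached is not None:
            return cached  # later reductions of the same rule hit the cache
        pending = self._pending.get(nonterm_name)
        if pending:
            # First encounter of this composite index: claim the next pending action.
            action_tuple = pending.pop(0)
            self._action_cache[action_index] = action_tuple
            return action_tuple
        return None  # no user action declared for this nonterminal


resolver = ActionResolver([("S", ("d_S",)), ("d", ("d_number",))])
first = resolver.resolve(20002, "d")   # pops from the pending list for "d"
again = resolver.resolve(20002, "d")   # served from the cache
```

The second call returns the same tuple object without touching the pending list, which is the behavior the cache is meant to guarantee.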

Secondary gate fix. The speculative/final gate was inverted:

# before — ran default actions in the speculative pass only
if takes_speculative == -1 and not speculative:
    return 0
# after — `-1` means final-only, so skip when we *are* speculative
if takes_speculative == -1 and speculative:
    return 0

This was latent while the primary bug kept us from ever reaching the gate;
it became load-bearing once the resolver started returning real tuples.
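
To make the inversion concrete, the two conditions can be compared side by side. This is an illustrative sketch only: the function names are hypothetical, and each function returns True when the action would be skipped (the `return 0` branch).

```python
def skips_before_fix(takes_speculative, speculative):
    """Old gate: skipped final-only actions in the final pass."""
    return bool(takes_speculative == -1 and not speculative)


def skips_after_fix(takes_speculative, speculative):
    """Fixed gate: skips final-only actions while the parse is still speculative."""
    return bool(takes_speculative == -1 and speculative)


# Rows: (speculative pass?, skipped before fix, skipped after fix) for a final-only action.
table = [
    (speculative, skips_before_fix(-1, speculative), skips_after_fix(-1, speculative))
    for speculative in (False, True)
]
print(table)
```

For a final-only action (`takes_speculative == -1`) the two gates disagree on every row, which is why the bug only surfaced once real action tuples reached this check.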

Regression test (`test_speculative_gate.py`)

The primary bug is already covered by test_basic_arithmetic.py (a wrong
action_index lookup means the parse result is the raw tree, not 92).

The secondary gate is trickier to observe because the inversion still returns
the correct numeric answer — actions just fire on the wrong pass. The new
test uses the spec_only argument binding (which hands the action the
speculative flag and is governed by the same gate) to directly assert that
default actions only fire in the final pass:

def d_S(t, spec_only):
    '''S : d '+' d'''
    _calls.append(('S', spec_only))
    return t[0] + t[2]

def test_default_actions_run_in_final_pass_only(parser):
    assert parser.parse('87+5').getStructure() == 92
    assert all(flag == 0 for _, flag in _calls), \
        f'expected every action to fire in the final pass (spec=0), got: {_calls}'

Against the buggy gate, this fails with got: [('d', 1), ('d', 1), ('S', 1)]
(all speculative). With the fix, every call has flag == 0.

Verification

cd python
rm -f tests/*.dat tests/*.md5
uv pip install -e .
uv run pytest tests/

All four tests pass: test_addition, test_addition_with_skip_space,
test_partial_parse_with_offset_and_skip, test_default_actions_run_in_final_pass_only.

JYOuyang · 2026-04-16T22:03:44Z

/gemini review

gemini-code-assist

Code Review

This pull request updates the action resolution logic in dparser.pyx to handle composite action indices using a new caching and pending action mechanism. It also fixes a bug where non-speculative actions were incorrectly skipped during the final pass and introduces regression tests for basic arithmetic and the speculative gate. Feedback focuses on the potential for incorrect action mapping due to the non-deterministic order of reductions when using pop(0), performance overhead in the cache for rules without actions, and the fragility of extracting nonterminal names from docstrings.

gemini-code-assist · 2026-04-16T22:05:49Z

+                        pending = p._pending.get(nonterm_name)
+                        if pending:
+                            action_tuple = pending.pop(0)
+                            p._action_cache[action_index] = action_tuple


The logic for resolving actions using pending.pop(0) is non-deterministic and potentially incorrect. It assumes that the order in which rules are first encountered during parsing matches the order in which functions are declared in the Python module. However, the order of reductions is determined by the input string and the parser's state machine, not the grammar declaration order. If a nonterminal has multiple rules and they are encountered out of order, the wrong Python action will be associated with the rule and cached. A more robust mapping based on the actual rule index within the production (e.g., using action_index % 10000) should be established during parser initialization.
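
The ordering concern can be illustrated with a small model. This sketch is hypothetical: the action names `d_E_add` and `d_E_mul` and the index values are made up, and it does not establish whether dparser actually produces such an encounter order. It only shows that binding by first encounter gives different results for different orders, and that the composite index can be decoded with `divmod`.

```python
def resolve_by_first_encounter(declared, encounter_order):
    """Model of pop(0): bind actions to composite indices in the order first seen."""
    pending = list(declared)
    cache = {}
    for action_index in encounter_order:
        if action_index not in cache and pending:
            cache[action_index] = pending.pop(0)
    return cache


def split_composite(action_index):
    """Decode prod_index * 10000 + rule_index back into (prod_index, rule_index)."""
    return divmod(action_index, 10000)


# Hypothetical nonterminal with two rules, declared as d_E_add then d_E_mul.
declared = ["d_E_add", "d_E_mul"]
in_order = resolve_by_first_encounter(declared, [10001, 10002])
reversed_order = resolve_by_first_encounter(declared, [10002, 10001])
print(in_order, reversed_order, split_composite(20002))
```

When index 10002 is seen first it is bound to `d_E_add`, so the same grammar ends up with swapped actions depending only on the input.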

gemini-code-assist · 2026-04-16T22:05:49Z

+        cached = p._action_cache.get(action_index)
+        if cached is not None:
+            action_tuple = <tuple>cached
+        if action_tuple is None:


The cache lookup does not distinguish between a missing key and a key associated with None. If a rule has no associated Python action, action_tuple remains None, and the resolution logic (including string decoding and dictionary lookups) will be re-executed every time that action_index is encountered. Consider using a sentinel value or checking for key existence in the cache to improve performance for rules without actions.
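
One way to address this is a sentinel that distinguishes "never looked up" from "looked up, no action". The following is a sketch under assumptions: `CachingResolver`, `_MISSING`, and the `slow_resolve` callback are hypothetical names, and caching a negative result is only safe if the outcome for a given index cannot change later.

```python
_MISSING = object()  # unique sentinel: distinguishes "not cached" from a cached None


class CachingResolver:
    """Caches every lookup result, including None for rules without an action."""

    def __init__(self, slow_resolve):
        self._slow_resolve = slow_resolve
        self._cache = {}
        self.slow_calls = 0  # counts how often the expensive path runs

    def resolve(self, action_index):
        cached = self._cache.get(action_index, _MISSING)
        if cached is not _MISSING:
            return cached
        self.slow_calls += 1
        result = self._slow_resolve(action_index)
        self._cache[action_index] = result  # store None too
        return result


# A rule with no Python action: the slow path returns None.
resolver = CachingResolver(lambda action_index: None)
first = resolver.resolve(10001)
second = resolver.resolve(10001)
```

After the two calls the slow path has run once, whereas a `cached is not None` check would have run it twice.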

gemini-code-assist · 2026-04-16T22:05:49Z

-            self.actions.append((f, arg_types, speculative))
+            action_tuple = (f, arg_types, speculative)
+            self.actions.append(action_tuple)
+            nonterm_name = f.__doc__.strip().split(':', 1)[0].strip()


Extracting the nonterminal name by splitting the docstring on the first colon is fragile. It may fail or produce incorrect results if the docstring contains comments, leading whitespace, or multiple rules before the first colon. A more robust parsing of the docstring or an explicit way to associate functions with nonterminals would be more reliable.
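
A slightly more defensive extraction could look like the sketch below. It is an assumption-laden illustration: `nonterminal_name` is a hypothetical helper, and the identifier pattern is a guess at what nonterminal names look like, so real grammars may need a broader pattern. It returns None instead of raising or returning unrelated text when the docstring is missing or does not start with a rule head.

```python
import re

# Assumed shape of a rule head: optional whitespace, an identifier, then a colon.
_RULE_HEAD = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*)\s*:")


def nonterminal_name(func):
    """Return the nonterminal named at the start of func's docstring, or None."""
    doc = func.__doc__
    if not doc:
        return None  # no docstring: nothing to register
    match = _RULE_HEAD.match(doc)
    return match.group(1) if match else None


def d_S(t):
    '''S : d '+' d'''
    return t[0] + t[2]


def d_no_doc(t):
    return t


def d_not_a_rule(t):
    '''just a comment'''
    return t


names = [nonterminal_name(f) for f in (d_S, d_no_doc, d_not_a_rule)]
```

By contrast, the split-on-colon expression would raise `AttributeError` for `d_no_doc` and return the whole string for `d_not_a_rule`.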

gemini-code-assist Bot reviewed Mar 6, 2026

JYOuyang force-pushed the pytest-basic-arithmetic branch from 85888b7 to 138bb99 · April 16, 2026 20:13

JYOuyang force-pushed the pytest-basic-arithmetic branch from 138bb99 to 46e1d66 · April 16, 2026 21:56

gemini-code-assist Bot reviewed Apr 16, 2026

JYOuyang and others added 2 commits April 16, 2026 16:05

python/tests: Add pytest version of basic arithmetic test

3c787e8

Converts test.py to pytest using explicit `modules=` parameter to avoid frame introspection, `tmp_path` for generated grammar files, and standard assertions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

python/tests: Add test_speculative_gate.py

756c658

JYOuyang force-pushed the pytest-basic-arithmetic branch from 46e1d66 to 756c658 · April 16, 2026 23:07

jplevyak merged commit ff0f0fb into jplevyak:main Apr 16, 2026

JYOuyang deleted the pytest-basic-arithmetic branch April 16, 2026 23:20

JYOuyang mentioned this pull request Apr 16, 2026

python/tests: Convert remaining legacy test scripts to pytest #59

Closed

5 tasks
