Out-of-class `= default;` special-member definitions silently lose the `function_definition` shape on `master` (commit 8b5b49e). `= delete;` is unaffected.
Repro
Foo::~Foo() = default; // BUG — see AST below
Foo::Foo() = default; // BUG — same shape
Foo::~Foo() = delete; // OK — still function_definition
Foo::Foo() = delete; // OK — still function_definition
Bad parse for `Foo::~Foo() = default;`:
translation_unit
  expression_statement
    assignment_expression
      call_expression (left)
        qualified_identifier (function)
          namespace_identifier (scope) "Foo"
          destructor_name (name)
            identifier "Foo"
        argument_list (arguments)
      "=" (operator)
      identifier (right) "default"   ;; ← `default` lexed as identifier
Expected (and what 0.23.4 / pre-regen master produce):
translation_unit
  function_definition (aliased from constructor_or_destructor_definition)
    function_declarator (declarator)
      qualified_identifier (declarator)
        namespace_identifier (scope) "Foo"
        destructor_name (name)
          identifier "Foo"
      parameter_list (parameters)
    default_method_clause
Bisect
`grammar.js` is identical between the last good and first bad commit. Only the regenerated artefacts differ.

| commit | tree-sitter-cli | parses `= default;` as |
|---|---|---|
| cacfb40 (2025-07-06) | 0.25.6 | `function_definition` ✓ |
| 12bd6f7 (2025-09-19) | 0.25.9 | `expression_statement` ✗ |

The cli bump in 1832dd7 (0.25.6 → 0.25.9) is the only intervening change that touches generator behaviour.
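
To automate the bisect, each commit's parse output can be classified by the first node under `translation_unit`. The sketch below is a hypothetical helper, not part of the repository. It assumes its input is the S-expression dump printed by `tree-sitter parse` for a file containing only the repro line. The names `classify` and `BISECT_EXIT` are invented here for illustration.

```python
import re

# First node that follows the root in an S-expression dump, e.g.
#   (translation_unit [0, 0] - [1, 0]
#     (expression_statement [0, 0] - [0, 22] ...
_FIRST_CHILD = re.compile(r"\(translation_unit\b[^()]*\(\s*([A-Za-z_]+)")

# Exit codes in the convention used by `git bisect run`.
BISECT_EXIT = {"good": 0, "bad": 1, "unknown": 125}


def classify(sexp: str) -> str:
    """Return 'good', 'bad', or 'unknown' for one parse dump."""
    match = _FIRST_CHILD.search(sexp)
    if match is None:
        return "unknown"
    first_child = match.group(1)
    if first_child == "function_definition":
        return "good"   # expected shape
    if first_child == "expression_statement":
        return "bad"    # regressed shape
    return "unknown"    # anything else: let bisect skip this commit
```

A small wrapper that reads the dump and exits with `BISECT_EXIT[classify(dump)]` could then be handed to `git bisect run`, which treats 0 as good, 125 as skip, and the other codes from 1 to 127 as bad.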
Likely root cause
`tree-sitter` v0.25.7 shipped PR #4586 (`fix(generate): use topological sort for subtype map`), which changed the visit order used when building the LR conflict-resolution table. At the ambiguity point for `Foo::~Foo() = default;`:
- candidate A: `constructor_or_destructor_definition` + `default_method_clause`
- candidate B: `expression_statement > assignment_expression > call_expression` (the `Foo::~Foo()` call assigned to identifier `default`)
Old visit order picked A. Topologically sorted order picks B.
`= delete;` survives because `delete` is a keyword token in tree-sitter-cpp and can't slot into the `right:` of `assignment_expression`. `default` lexes as `(identifier)` in expression position (e.g. for switch labels), so the alternative reduction stays viable.
PR #4586 itself is a legitimate cli fix — the grammar is what needs a precedence hint.
Suggested fix
`prec.dynamic` on the `default_method_clause` branch should be enough to force candidate A back to winning:
  constructor_or_destructor_definition: $ => seq(
    repeat($._constructor_specifiers),
    field('declarator', $.function_declarator),
    choice(
      seq(
        optional($.field_initializer_list),
        field('body', $.compound_statement),
      ),
      alias($.constructor_try_statement, $.try_statement),
-     $.default_method_clause,
+     prec.dynamic(1, $.default_method_clause),
      $.delete_method_clause,
      $.pure_virtual_clause,
    ),
  ),
Untested — happy to verify against real corpora (nlohmann/json, Fuzzer, LLVM) if a candidate patch lands.
Workaround for downstream consumers stuck on master
If you're indexing C++ with tree-sitter-cpp ABI 15 and need to recover the lost destructor/constructor symbols, this query pattern catches the regressed shape:
(expression_statement
  (assignment_expression
    left: (call_expression
      function: (qualified_identifier
        name: [
          (destructor_name) @name.method
          (identifier) @name.method
        ]))
    right: (identifier) @_default
    (#eq? @_default "default"))) @method
False-positive risk is essentially zero — the only inputs that match are exactly the regressed AST shape, and real C++ never has a non-special-member `expr() = default;`.
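
For consumers that walk the tree instead of running queries, the same shape check can be written as a plain function. This is a minimal sketch under stated assumptions: it relies only on `type`, `text`, `children`, and `child_by_field_name`, which are assumed to behave like the node attributes in the py-tree-sitter bindings. `FakeNode` and `regressed_name` are hypothetical names introduced so the example runs without a parser installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in exposing only the node attributes used below."""
    type: str
    text: bytes = b""
    fields: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def child_by_field_name(self, name):
        return self.fields.get(name)


def regressed_name(stmt):
    """Return the name node if `stmt` has the regressed shape, else None."""
    if stmt.type != "expression_statement":
        return None
    assign = next(
        (c for c in stmt.children if c.type == "assignment_expression"), None
    )
    if assign is None:
        return None
    left = assign.child_by_field_name("left")
    right = assign.child_by_field_name("right")
    if left is None or left.type != "call_expression":
        return None
    # Mirrors (#eq? @_default "default") in the query.
    if right is None or right.type != "identifier" or right.text != b"default":
        return None
    func = left.child_by_field_name("function")
    if func is None or func.type != "qualified_identifier":
        return None
    name = func.child_by_field_name("name")
    if name is not None and name.type in ("destructor_name", "identifier"):
        return name
    return None


# Hand-built tree for the bad parse of `Foo::~Foo() = default;`.
name = FakeNode("destructor_name", b"~Foo")
func = FakeNode("qualified_identifier", b"Foo::~Foo", fields={"name": name})
call = FakeNode("call_expression", b"Foo::~Foo()", fields={"function": func})
rhs = FakeNode("identifier", b"default")
assign = FakeNode("assignment_expression", fields={"left": call, "right": rhs})
stmt = FakeNode("expression_statement", children=[assign])
```

With a real parser, `stmt` would be each child of the root node, and a non-`None` result would be recorded as a constructor or destructor definition.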