Description
Did you check existing issues?
- I have read all the tree-sitter docs if it relates to using the parser
- I have searched the existing issues of tree-sitter-cpp
Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)
tree-sitter-cpp 0.23.4, tree-sitter 0.25.2 (Python bindings)
Describe the bug
Error recovery can destroy a function_definition node entirely depending on the ordering of statements that follow a misparsed template method call.
This is related to #346 (template method calls parsed as comparison operators) but is a distinct bug. #346 reports that obj.method<T>(args) produces a wrong-but-complete AST (a binary_expression instead of a template call). This issue is about tree-sitter's error recovery producing inconsistent results: the same misparsed expression can either preserve or destroy the enclosing function_definition depending on which statements follow it.
Steps To Reproduce/Bad Parse Tree
The following two files are both valid C++ (both compile with g++ -c) and differ only in whether g = 0; precedes or follows using R2 = int;:
bug_yes.cpp (function_definition is lost):
struct C {};
template<typename T> struct S {
template<typename U> S& m(U) { return *this; }
void a(int) {}
};
int g;
void foo(int, int) {
S<void(C*)> s;
using R = int;
s.m <R (S<void(C*)>::*)(C*, bool)> ((R (S<void(C*)>::*)(C*, bool))0).a(0);
g = 0;
using R2 = int;
}

bug_no.cpp (function_definition is preserved):
struct C {};
template<typename T> struct S {
template<typename U> S& m(U) { return *this; }
void a(int) {}
};
int g;
void foo(int, int) {
S<void(C*)> s;
using R = int;
s.m <R (S<void(C*)>::*)(C*, bool)> ((R (S<void(C*)>::*)(C*, bool))0).a(0);
using R2 = int;
g = 0;
}

In bug_yes.cpp, the top-level children include primitive_type, function_declarator, {, and } as separate fragments; there is no function_definition node. In bug_no.cpp, the entire function is wrapped in a single function_definition node spanning [6,0]-[12,1], with a compound_statement body.
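Since the two files differ only in the order of the last two statements, they can be generated from one shared prefix, which rules out accidental differences between them. The following is a minimal sketch: the names PREFIX, TAIL_STATEMENTS, make_variant, and write_variants are made up for illustration, and the source lines are copied as they appear in the listings above (without indentation).

```python
from pathlib import Path

# Shared part of both repro files, copied from the listings above.
PREFIX = """\
struct C {};
template<typename T> struct S {
template<typename U> S& m(U) { return *this; }
void a(int) {}
};
int g;
void foo(int, int) {
S<void(C*)> s;
using R = int;
s.m <R (S<void(C*)>::*)(C*, bool)> ((R (S<void(C*)>::*)(C*, bool))0).a(0);
"""

# The two statements whose relative order is the only difference.
TAIL_STATEMENTS = ("g = 0;", "using R2 = int;")


def make_variant(assignment_first: bool) -> str:
    """Build one repro file; assignment_first=True gives the bug_yes.cpp order."""
    ordered = TAIL_STATEMENTS if assignment_first else TAIL_STATEMENTS[::-1]
    first, second = ordered
    return f"{PREFIX}{first}\n{second}\n}}\n"


VARIANTS = {
    "bug_yes.cpp": make_variant(assignment_first=True),
    "bug_no.cpp": make_variant(assignment_first=False),
}


def write_variants(directory: str = ".") -> None:
    """Write both variants into the given directory."""
    for name, source in VARIANTS.items():
        Path(directory, name).write_text(source, encoding="utf-8")
```

Calling write_variants() creates both files in the current directory, ready to be passed to a parser script.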
You can verify this with the following Python script:
import tree_sitter_cpp as tscpp
import tree_sitter
import sys

language = tree_sitter.Language(tscpp.language())
parser = tree_sitter.Parser(language)

for f in sys.argv[1:]:
    with open(f) as fh:
        tree = parser.parse(bytes(fh.read(), "utf-8"))
    # Only direct children of the root node are inspected.
    func_defs = [c for c in tree.root_node.children if c.type == "function_definition"]
    has_error = tree.root_node.has_error
    print(f"{f}: function_definitions={len(func_defs)}, has_error={has_error}")

Output:
bug_yes.cpp: function_definitions=0, has_error=True
bug_no.cpp: function_definitions=1, has_error=True
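To see the separate top-level fragments rather than just a count, the root's direct children can be listed with their types and positions. This is a sketch only: top_level_summary and dump_files are hypothetical helper names, and dump_files assumes the same tree_sitter and tree_sitter_cpp Python packages used in the script above.

```python
def top_level_summary(root):
    """Return (type, start_point, end_point) for each direct child of root."""
    return [
        (child.type, tuple(child.start_point), tuple(child.end_point))
        for child in root.children
    ]


def dump_files(paths):
    """Parse each file as C++ and print its top-level node types and spans."""
    # Imported lazily so top_level_summary can be used without the bindings.
    import tree_sitter
    import tree_sitter_cpp as tscpp

    parser = tree_sitter.Parser(tree_sitter.Language(tscpp.language()))
    for path in paths:
        with open(path, "rb") as fh:
            root = parser.parse(fh.read()).root_node
        print(path)
        for node_type, start, end in top_level_summary(root):
            print(f"  {node_type} {start}-{end}")
```

For example, dump_files(["bug_yes.cpp", "bug_no.cpp"]) prints one line per top-level node of each file, which makes the difference between the two trees visible side by side.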
Expected Behavior/Parse Tree
Both files should produce a function_definition node for foo. Both have errors (the template call is misparsed in both cases), but the error recovery should not destroy the enclosing function definition based on the ordering of unrelated statements within the body.
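This expectation can be phrased as a check that the number of function_definition nodes is the same for both statement orders. The sketch below counts nodes anywhere in the tree, not only among the root's direct children, so it would also detect a function_definition nested under some other node. The helper names count_nodes_of_type and same_count are hypothetical; the functions only rely on the type and children attributes of tree-sitter nodes. Note that any function_definition nodes inside S (for m and a) would be included in this count as well.

```python
def count_nodes_of_type(root, wanted="function_definition"):
    """Count nodes of the given type in the whole subtree, root included."""
    count = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.type == wanted:
            count += 1
        # Visit every child; traversal order does not matter for counting.
        stack.extend(node.children)
    return count


def same_count(roots, wanted="function_definition"):
    """Return True if every tree contains the same number of wanted nodes."""
    return len({count_nodes_of_type(root, wanted) for root in roots}) <= 1
```

Passing the root_node of each parsed file to same_count gives a single boolean that should be True once the ordering of the trailing statements no longer affects recovery.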
Repro
struct C {};
template<typename T> struct S {
template<typename U> S& m(U) { return *this; }
void a(int) {}
};
int g;
void foo(int, int) {
S<void(C*)> s;
using R = int;
s.m <R (S<void(C*)>::*)(C*, bool)> ((R (S<void(C*)>::*)(C*, bool))0).a(0);
g = 0;
using R2 = int;
}