bug: error recovery destroys function_definition node depending on statement ordering #352

@aytey

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-cpp

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

tree-sitter-cpp 0.23.4, tree-sitter 0.25.2 (Python bindings)

Describe the bug

Error recovery can destroy a function_definition node entirely depending on the ordering of statements that follow a misparsed template method call.

This is related to #346 (template method calls parsed as comparison operators), but is a distinct bug. #346 reports that obj.method<T>(args) produces a wrong-but-complete AST (a binary_expression instead of a template call). This issue is about tree-sitter's error recovery producing inconsistent results: the same misparsed expression can either preserve or destroy the enclosing function_definition depending on which statements follow it.

Steps To Reproduce/Bad Parse Tree

The following two files are both valid C++ (each compiles with g++ -c) and differ only in whether g = 0; precedes or follows using R2 = int;:

bug_yes.cpp (function_definition is lost):

struct C {};
template<typename T> struct S {
    template<typename U> S& m(U) { return *this; }
    void a(int) {}
};
int g;
void foo(int, int) {
    S<void(C*)> s;
    using R = int;
    s.m <R (S<void(C*)>::*)(C*, bool)> ((R (S<void(C*)>::*)(C*, bool))0).a(0);
    g = 0;
    using R2 = int;
}

bug_no.cpp (function_definition is preserved):

struct C {};
template<typename T> struct S {
    template<typename U> S& m(U) { return *this; }
    void a(int) {}
};
int g;
void foo(int, int) {
    S<void(C*)> s;
    using R = int;
    s.m <R (S<void(C*)>::*)(C*, bool)> ((R (S<void(C*)>::*)(C*, bool))0).a(0);
    using R2 = int;
    g = 0;
}

In bug_yes.cpp, the top-level children include primitive_type, function_declarator, {, and } as separate fragments — there is no function_definition node. In bug_no.cpp, the entire function is wrapped in a single function_definition [6,0]-[12,1] node with a compound_statement body.

You can verify this with the following Python script:

import sys

import tree_sitter
import tree_sitter_cpp as tscpp

language = tree_sitter.Language(tscpp.language())
parser = tree_sitter.Parser(language)

for f in sys.argv[1:]:
    # The parser expects bytes, so read the file in binary mode.
    with open(f, "rb") as fh:
        tree = parser.parse(fh.read())
    # Only direct children of the root are counted, so the methods
    # nested inside struct S do not contribute to the total.
    func_defs = [c for c in tree.root_node.children if c.type == "function_definition"]
    has_error = tree.root_node.has_error
    print(f"{f}: function_definitions={len(func_defs)}, has_error={has_error}")

Output:

bug_yes.cpp: function_definitions=0, has_error=True
bug_no.cpp: function_definitions=1, has_error=True
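
To see which top-level fragments take the place of the function_definition in bug_yes.cpp, a small helper along the following lines may be useful. It is only a sketch: the names top_level_summary, parse_file and main are made up for illustration, and it assumes the same Python bindings as the script above. It prints each direct child of the root node with its type and range, in the same [row,col]-[row,col] form used earlier in this report.

```python
def top_level_summary(root):
    """Return one 'type [row,col]-[row,col]' string per direct child of root."""
    lines = []
    for child in root.children:
        # start_point / end_point are (row, column) pairs, both zero-based.
        (r0, c0), (r1, c1) = child.start_point, child.end_point
        lines.append(f"{child.type} [{r0},{c0}]-[{r1},{c1}]")
    return lines


def parse_file(path):
    """Parse a C++ file with tree-sitter-cpp and return the resulting tree."""
    # Imported lazily so top_level_summary can be reused without the bindings.
    import tree_sitter
    import tree_sitter_cpp as tscpp

    parser = tree_sitter.Parser(tree_sitter.Language(tscpp.language()))
    with open(path, "rb") as fh:
        return parser.parse(fh.read())


def main(paths):
    """Print the top-level node summary for each file in paths."""
    for path in paths:
        print(f"== {path}")
        for line in top_level_summary(parse_file(path).root_node):
            print("  " + line)
```

Calling main(["bug_yes.cpp", "bug_no.cpp"]) prints the two top-level listings one after the other, which makes it easier to compare them by eye or with diff.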

Expected Behavior/Parse Tree

Both files should produce a function_definition node for foo. Both have errors (the template call is misparsed in both cases), but the error recovery should not destroy the enclosing function definition based on the ordering of unrelated statements within the body.
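
Since both files contain errors, it may also help to see where the ERROR and MISSING nodes end up in each tree. The sketch below is a diagnostic aid rather than part of the original repro: error_nodes and describe_errors are hypothetical helper names, and the code relies only on the type, children, start_point and is_missing attributes of a node. It walks the tree recursively, which is fine for files this small.

```python
def error_nodes(node, parent_type=None):
    """Yield (node, parent_type) for every ERROR or MISSING node, in document order."""
    if node.type == "ERROR" or getattr(node, "is_missing", False):
        yield node, parent_type
    for child in node.children:
        yield from error_nodes(child, node.type)


def describe_errors(root):
    """Return one human-readable line per ERROR or MISSING node under root."""
    lines = []
    for node, parent_type in error_nodes(root):
        kind = "MISSING" if getattr(node, "is_missing", False) else "ERROR"
        # start_point is zero-based; report a one-based line number.
        line_no = node.start_point[0] + 1
        lines.append(f"{kind} {node.type} at line {line_no} (parent: {parent_type})")
    return lines
```

Passing tree.root_node from either parse to describe_errors shows whether the error nodes sit inside a compound_statement or directly under the translation unit, which is the difference this issue is about.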

Repro

struct C {};
template<typename T> struct S {
    template<typename U> S& m(U) { return *this; }
    void a(int) {}
};
int g;
void foo(int, int) {
    S<void(C*)> s;
    using R = int;
    s.m <R (S<void(C*)>::*)(C*, bool)> ((R (S<void(C*)>::*)(C*, bool))0).a(0);
    g = 0;
    using R2 = int;
}
