bug: Incorrectly parses typeid and C++-style casts. #348

@ShimmerFairy


Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-cpp

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

No response

Describe the bug

In C++ there are a few alternatives of the postfix-expression rule which look like function calls, but are actually built-in expressions. These are casts like dynamic_cast as well as the typeid operator.

Currently, with tree-sitter-cpp version 0.23.4, these all parse as call_expressions, with the cast operations in particular parsed as template function invocations. The parenthesized operand should be parsed by C++'s expression rule (with typeid additionally allowing the type-id rule), but because of this bug it is misparsed as an optional expression-list. This most significantly affects typeid used with a type-id, since a type-id is unlikely to coincidentally look like function arguments (see below for an example of this).

This is a serious issue because it causes consumers of the parse tree to misinterpret these expressions as ordinary function calls, which can affect highlighting and indentation. (In current emacs, for example, typeid gets mis-highlighted as a function call, because tree-sitter says it's one.) While at first glance parsing them as function calls may seem like no big deal (any editor would likely need hardcoded checks for these special expressions regardless), the misparse still causes problems for anything that needs to understand these expressions:

  • typeid will very quickly fail to parse if it contains a type-id that doesn't coincidentally look like a plausible function argument. Thus things like typeid(const int) or typeid(double) will fail.
  • A subtle issue lies with comma expressions, such as the one in reinterpret_cast<double>(a, b). Here a, b should be parsed as a single expression, but since the whole thing gets misinterpreted as a function call, tree-sitter instead parses it as an expression-list, which ultimately means it gets parsed as two assignment-expressions separated by a comma.
  • Finally, this bug can sometimes accept expressions that it shouldn't. The expression-list rule for function calls allows braced initializers and parameter packs, which means things like typeid({42}) or typeid(*c++...) are allowed when they shouldn't be.
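
To make the language rules behind the first two points concrete, here is a small C++ sketch. The function names (type_id_forms_work, comma_in_cast, check_operand_examples) are made up for illustration and are not part of tree-sitter-cpp. It relies only on standard behaviour: typeid accepts a type-id and ignores top-level cv-qualifiers, and the parentheses of a cast hold a single expression, which may be a comma expression.

```cpp
#include <cassert>
#include <typeinfo>

// typeid's operand may be a type-id, which need not look like a call argument.
// Top-level cv-qualifiers are ignored, so typeid(const int) == typeid(int).
bool type_id_forms_work() {
    return typeid(const int) == typeid(int)
        && typeid(unsigned long long) != typeid(double);
}

// The parentheses of a cast hold ONE expression. Here it is a comma
// expression: ++calls is evaluated first, then b supplies the value to cast.
int comma_in_cast(int &calls, double b) {
    return static_cast<int>(++calls, b);
}

// Hypothetical helper that exercises both functions; call it from main().
void check_operand_examples() {
    assert(type_id_forms_work());
    int calls = 0;
    assert(comma_in_cast(calls, 7.9) == 7);  // value comes from b, truncated
    assert(calls == 1);                      // left operand ran exactly once
}
```

A parser that treats comma_in_cast's operand as a two-element argument list loses the fact that the cast yields only the right-hand value.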

Steps To Reproduce/Bad Parse Tree

typeid(unsigned long long) parses as:

(call_expression function: (identifier)
 arguments: 
  (argument_list ( (identifier)
   (ERROR (identifier) long)
   )))

dynamic_cast<derived *>(foo, bar) parses as:

(call_expression
 function: 
  (template_function name: (identifier)
   arguments: 
    (template_argument_list <
     (type_descriptor type: (type_identifier)
      declarator: (abstract_pointer_declarator *))
     >))
 arguments: (argument_list ( (identifier) , (identifier) )))

static_cast<int>(y...) parses as:

(call_expression
 function: 
  (template_function name: (identifier)
   arguments: 
    (template_argument_list <
     (type_descriptor type: (primitive_type))
     >))
 arguments: 
  (argument_list (
   (parameter_pack_expansion pattern: (identifier) ...)
   )))

(Trees are as represented by emacs's treesit explorer.)

Expected Behavior/Parse Tree

These typeid operations and C++-style casts should be parsed as something other than call_expressions, so that they can't get misinterpreted as ordinary function calls. Beyond that shared problem, each of the examples above highlights an additional issue:

  • typeid(unsigned long long) should be valid, instead of causing an error.
  • dynamic_cast<derived *>(foo, bar) should parse foo, bar as a single expression, not as a list of two arguments.
  • static_cast<int>(y...) should be rejected, since y... is not a valid expression.
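
For contrast with the last point, the following C++ sketch shows where a pack expansion may legitimately appear next to a cast: outside the cast's parentheses, so that it expands the whole cast expression. The names sum_as_ints and check_pack_example are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Valid: the expansion applies to the whole cast expression, producing one
// static_cast per element of ys inside the braced initializer list.
// The leading 0 keeps the array non-empty when the pack is empty.
template <typename... Ts>
int sum_as_ints(Ts... ys) {
    int values[] = {0, static_cast<int>(ys)...};
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// Not valid according to the grammar described in this issue, because the
// operand of a cast must be a single expression and `ys...` is not one:
//     static_cast<int>(ys...)

// Hypothetical helper that exercises the valid form; call it from main().
void check_pack_example() {
    assert(sum_as_ints(1.9, 2.5, 3) == 6);  // 1 + 2 + 3 after truncation
    assert(sum_as_ints() == 0);             // empty pack leaves only the 0
}
```

In the valid form the ellipsis belongs to the enclosing initializer list, not to the cast's operand, which is why a parser can reject it inside the cast without losing any legal code.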
