Fix!: don't consume connected tokens when parsing quoted identifiers by georgesittas · Pull Request #5357 · SQLMesh/sqlmesh

georgesittas · 2025-09-11T19:14:32Z

This PR addresses two bugs:

1. In SQLMesh's override of _parse_id_var, once we parse an identifier we check whether the next token in the stream is "connected" to it, i.e., there's no whitespace between the two tokens. This is done to support our custom macro syntax (example). Unfortunately, the condition was too lax, so we consumed both tokens in cases like "foo"bar, where we should have consumed only the first.
2. There is a subtle bug where _schema_tmp can be appended to a quoted identifier like "a"."b"."c", resulting in "a"."b"."c"_schema_tmp. Because the parser was lax, we were somehow able to parse this, so we succeeded by failing in this case :)
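
To make the first point concrete, here is a minimal, self-contained sketch of the "connected token" check and the stricter condition. It is not the actual SQLMesh code: the `Token` class, `is_connected`, and this toy `parse_id_var` are hypothetical stand-ins, and the only behavior taken from this PR is that a quoted identifier must not absorb the token glued to it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    """Hypothetical token: raw text plus inclusive character offsets."""
    text: str
    start: int
    end: int
    quoted: bool = False


def is_connected(prev: Token, nxt: Token) -> bool:
    # Two tokens are "connected" when no character separates them.
    return nxt.start == prev.end + 1


def parse_id_var(tokens: List[Token], index: int) -> Tuple[str, int]:
    """Return (identifier text, number of tokens consumed).

    Assumption for illustration: an unquoted identifier may absorb a
    directly adjacent token, a quoted identifier never does.
    """
    ident = tokens[index]
    name, consumed = ident.text, 1
    nxt = tokens[index + 1] if index + 1 < len(tokens) else None
    if nxt is not None and not ident.quoted and is_connected(ident, nxt):
        name += nxt.text
        consumed += 1
    return name, consumed


# "foo"bar: the quoted identifier is consumed alone, bar is left in the stream.
quoted_case = [Token('"foo"', 0, 4, quoted=True), Token("bar", 5, 7)]
quoted_result = parse_id_var(quoted_case, 0)

# An unquoted identifier with a token glued to it is still merged.
glued_case = [Token("foo", 0, 2), Token("_bar", 3, 6)]
glued_result = parse_id_var(glued_case, 0)

# Whitespace between the tokens means they are not connected.
spaced_case = [Token("foo", 0, 2), Token("bar", 4, 6)]
spaced_result = parse_id_var(spaced_case, 0)
```

With the `not ident.quoted` guard removed, the first case would return both tokens as one identifier, which is the lax behavior described above.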

georgesittas · 2025-09-11T19:48:57Z

sqlmesh/core/macros.py

        >>> sql = "SELECT date_day, @PIVOT(status, ['cancelled', 'completed']) FROM rides GROUP BY 1"
        >>> MacroEvaluator().transform(parse_one(sql)).sql()
-        'SELECT date_day, SUM(CASE WHEN status = \\'cancelled\\' THEN 1 ELSE 0 END) AS "\\'cancelled\\'", SUM(CASE WHEN status = \\'completed\\' THEN 1 ELSE 0 END) AS "\\'completed\\'" FROM rides GROUP BY 1'
+        'SELECT date_day, SUM(CASE WHEN status = \\'cancelled\\' THEN 1 ELSE 0 END) AS "cancelled", SUM(CASE WHEN status = \\'completed\\' THEN 1 ELSE 0 END) AS "completed" FROM rides GROUP BY 1'


Not sure what led to this decision, i.e. including the SQL-representation of the value as-is in the alias.
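
For reference, the difference between the two doctest outputs can be reproduced with plain string building. This is only a sketch: `sql_literal`, `quote_identifier`, and `pivot_column` are hypothetical helpers, not the macro's implementation, and the `alias_from_sql` flag exists purely to contrast the old and new alias.

```python
def sql_literal(value: str) -> str:
    # Render a Python string as a single-quoted SQL string literal.
    return "'" + value.replace("'", "''") + "'"


def quote_identifier(name: str) -> str:
    # Render a name as a double-quoted SQL identifier.
    return '"' + name.replace('"', '""') + '"'


def pivot_column(column: str, value: str, alias_from_sql: bool = False) -> str:
    # alias_from_sql=True mimics the old output, where the alias was built
    # from the SQL representation of the value (quotes included).
    alias_source = sql_literal(value) if alias_from_sql else value
    return (
        f"SUM(CASE WHEN {column} = {sql_literal(value)} THEN 1 ELSE 0 END) "
        f"AS {quote_identifier(alias_source)}"
    )


old_alias = pivot_column("status", "cancelled", alias_from_sql=True)
new_alias = pivot_column("status", "cancelled")
```

Here `old_alias` ends in `AS "'cancelled'"` and `new_alias` ends in `AS "cancelled"`, matching the removed and added doctest lines.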

georgesittas · 2025-09-11T19:49:15Z

tests/core/engine_adapter/test_clickhouse.py

    "test_valid_to",
    TRUE AS "_exists"
-  FROM ""__temp_target_efgh""
+  FROM "__temp_target_efgh"


This test was incorrect: the quotes here are off.

georgesittas · 2025-09-11T19:49:38Z

tests/core/test_format.py

        tmp_path,
        pathlib.Path(audits_dir, "audit_1.sql"),
-        "AUDIT(name assert_positive_id, dialect 'duckdb'); SELECT  * FROM @this_model WHERE  \"CaseSensitive\"_item_id < 0;",
+        "AUDIT(name assert_positive_id, dialect 'duckdb'); SELECT  * FROM @this_model WHERE  \"CaseSensitive_item_id\" < 0;",


This test was also incorrect: it's like trying to do SELECT * FROM t WHERE "foo"_bar < 0.

erindru

Wow, this PR highlights quite some jankiness!

georgesittas requested review from a team, izeigerman and treysp September 11, 2025 19:14

Fix: don't consume connected tokens when parsing quoted identifiers

84238cf

georgesittas force-pushed the jo/fix_identifier_parsing branch from 77ad2e0 to 84238cf Compare September 11, 2025 19:15

Fix pivot macro test

5017899

georgesittas changed the title ~~Fix: don't consume connected tokens when parsing quoted identifiers~~ Fix!: don't consume connected tokens when parsing quoted identifiers Sep 11, 2025

erindru approved these changes Sep 11, 2025

georgesittas merged commit 0e9c8a0 into main Sep 12, 2025
36 checks passed

georgesittas deleted the jo/fix_identifier_parsing branch September 12, 2025 14:22
