Fix case-sensitive VISUALISE column reference validation #143

Open
cpsievert wants to merge 1 commit into posit-dev:main from cpsievert:fix/case-insensitive-visualise-columns

cpsievert (Collaborator) commented Feb 19, 2026

Summary

SQL treats unquoted identifiers as case-insensitive — SELECT REGION FROM t works even when the column is stored as region. However, ggsql's VISUALISE clause matches column references case-sensitively against the result schema. Since DuckDB lowercases unquoted identifiers in query results, writing VISUALISE REGION AS x fails when the result column is region:

aesthetic 'x' references non-existent column 'REGION'

Fix

Adds a normalize_column_names() step early in prepare_data_with_reader (before merge, validation, and query building). It uses case-insensitive matching to resolve VISUALISE column references to the actual column names in the result schema. This is reader-agnostic — it normalizes to whatever the reader returns, not specifically to lowercase.
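
For illustration only, the matching idea can be sketched in Python. This is not the code from this PR: the function name `resolve_column_refs` and its list-based signature are hypothetical, and the choice to prefer an exact match and to leave unknown names untouched (so validation can still report them) is an assumption.

```python
def resolve_column_refs(references, schema_columns):
    """Map each reference to the schema column it matches, ignoring case.

    Hypothetical helper for illustration; not ggsql's actual API.
    """
    # Case-folded name -> actual schema name. If two schema columns differ
    # only by case, the first one wins (an assumption of this sketch).
    lookup = {}
    for name in schema_columns:
        lookup.setdefault(name.casefold(), name)

    exact = set(schema_columns)
    resolved = []
    for ref in references:
        if ref in exact:
            # An exact match always takes priority.
            resolved.append(ref)
        else:
            # Fall back to a case-insensitive match; keep the reference
            # unchanged if nothing matches so validation can flag it later.
            resolved.append(lookup.get(ref.casefold(), ref))
    return resolved


print(resolve_column_refs(["CATEGORY", "VALUE"], ["category", "value"]))
```

Because the result is whatever spelling the schema holds, the same sketch also resolves `REGION` to `Region` for a reader that returns mixed-case column names.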

Reproducing

CLI:

# Create a DuckDB file with lowercase-stored columns
duckdb repro.duckdb "CREATE TABLE t AS SELECT 'A' as category, 10 as value UNION ALL SELECT 'B', 20;"

# SQL resolves CATEGORY case-insensitively, but VISUALISE fails
ggsql exec --reader "duckdb://repro.duckdb" \
  "SELECT category, value FROM t VISUALISE CATEGORY AS x, VALUE AS y DRAW bar"
# => Validation error: Layer 1: aesthetic 'x' references non-existent column 'CATEGORY'

Python:

import ggsql

reader = ggsql.DuckDBReader("duckdb://memory")
reader.execute_sql(
    "CREATE TABLE t AS SELECT 'A' as category, 10 as value "
    "UNION ALL SELECT 'B', 20"
)

# SQL resolves CATEGORY case-insensitively, but VISUALISE fails
reader.execute(
    "SELECT category, value FROM t "
    "VISUALISE CATEGORY AS x, VALUE AS y DRAW bar"
)
# => Validation error: Layer 1: aesthetic 'x' references non-existent column 'CATEGORY'

Test plan

  • Added test_case_insensitive_column_references test
  • All 874 existing tests pass

@cpsievert cpsievert force-pushed the fix/case-insensitive-visualise-columns branch from c495a46 to eece4f2 Compare February 19, 2026 17:01
@cpsievert cpsievert marked this pull request as draft February 19, 2026 20:42
@cpsievert cpsievert force-pushed the fix/case-insensitive-visualise-columns branch from eece4f2 to b949f97 Compare February 19, 2026 21:29
@cpsievert cpsievert requested a review from Copilot February 19, 2026 21:43
@cpsievert cpsievert marked this pull request as ready for review February 19, 2026 21:43

This comment was marked as resolved.

The VISUALISE parser preserves the exact casing from the user's query
(e.g., `VISUALISE ROOM_TYPE AS x` stores "ROOM_TYPE"). However, DuckDB
lowercases unquoted identifiers in query results, so the schema column
is "room_type". Since ggsql quotes column names in generated SQL
(making them case-sensitive in DuckDB), this mismatch caused validation
errors like:

  "aesthetic 'x' references non-existent column 'ROOM_TYPE'"

Add a normalize_column_names() step early in the prepare_data pipeline
that resolves VISUALISE column references to match the actual schema
column names using case-insensitive matching. This is reader-agnostic:
it normalizes to whatever the reader returns, not specifically to
lowercase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cpsievert cpsievert force-pushed the fix/case-insensitive-visualise-columns branch from b949f97 to b467129 Compare February 19, 2026 22:56
Comment on lines +51 to +52
// Normalize global mappings using the first layer's schema (global mappings
// are merged into all layers, so any layer's schema suffices for normalization)

This assertion is wrong. Each layer only takes from the global mapping what its data source and aesthetic requirements dictate. It's better to wait until the global mapping has been merged into the layers and then do the normalisation only for the layers.
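
A rough Python sketch of the ordering suggested here, assuming a simplified merge rule in which a layer's own mappings override the global ones. The dict-based layer structure and the name `merge_then_normalize` are hypothetical and do not reflect ggsql's internals.

```python
def merge_then_normalize(global_mapping, layers):
    """Merge global mappings into each layer, then normalise per layer.

    Hypothetical sketch: each layer is a dict with a "mapping"
    (aesthetic -> column reference) and a "schema" (list of column names).
    """
    normalized_layers = []
    for layer in layers:
        # Assumed merge rule: the layer's own mappings win over global ones.
        merged = {**global_mapping, **layer["mapping"]}

        # Resolve against this layer's own schema, not the first layer's.
        lookup = {name.casefold(): name for name in layer["schema"]}
        normalized = {
            aesthetic: lookup.get(column.casefold(), column)
            for aesthetic, column in merged.items()
        }
        normalized_layers.append(normalized)
    return normalized_layers


layers = [
    {"mapping": {"y": "SALES"}, "schema": ["region", "sales"]},
    {"mapping": {"y": "SALES"}, "schema": ["Region", "Sales"]},
]
print(merge_then_normalize({"x": "REGION"}, layers))
```

In this example the two layers come from sources with different column casing, so the same global reference resolves to `region` in one layer and `Region` in the other, which normalising against only the first layer's schema would miss.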

3 participants