Summary
Follow-up from the same NYC TLC Yellow Taxi exercise as StefanSteiner#1, this time a head-to-head benchmark of hyperdb-mcp vs DuckDB over the full normalized dataset (299,214,146 rows, 34 Parquet files, Jan + Jul × 2009–2025). hyperdb performed very well — it won 8 of 10 compute queries against DuckDB's own pre-loaded native table, by 1.25–3.9× — but the suite surfaced one dialect blocker and one perf papercut worth filing.
Environment: hyperdb-mcp (this repo), driven from Claude Code. Dialect is Salesforce Data Cloud SQL / Tableau Hyper. Data was warm in the daemon (one bench table). Timings are engine-side stats.elapsed_ms.
1. (Dialect gap) QUALIFY is not supported — 42601 syntax error
A standard "top-N per group" query using QUALIFY on a window function fails:
WITH z AS (
SELECT data_year, "PULocationID", COUNT(*) trips
FROM bench WHERE "PULocationID" IS NOT NULL
GROUP BY data_year, "PULocationID"
)
SELECT data_year, "PULocationID", trips,
ROW_NUMBER() OVER (PARTITION BY data_year ORDER BY trips DESC) rnk
FROM z
QUALIFY rnk <= 5;
-- server error (42601): ERROR: syntax error: got identifier, expected end-of-file
The documented subquery fallback works fine:
SELECT * FROM (
WITH z AS (...) SELECT ..., ROW_NUMBER() OVER (...) rnk FROM z
) s
WHERE rnk <= 5;
Impact: Minor functionally (the fallback is mechanical), but QUALIFY is a very common idiom and is supported by DuckDB, Snowflake, BigQuery, and Databricks. The MCP server instructions / dialect card advertise extensive window-function support (row_number, rank, modified_rank, IGNORE NULLS, frame modes, etc.) but do not flag QUALIFY as absent, so a user reasonably expects it to work.
Suggestion: Support QUALIFY, or explicitly note its absence (and the subquery-wrap workaround) in the dialect card alongside the window-function list.
2. (Perf) APPROX_COUNT_DISTINCT gives almost no speedup when the argument is a constructed string
Exact vs approximate distinct over a 299M-row concatenated key were essentially the same:
SELECT COUNT(DISTINCT "PULocationID" || '-' || "DOLocationID") FROM bench;
-- ~2.31 s (median of 3)
SELECT APPROX_COUNT_DISTINCT("PULocationID" || '-' || "DOLocationID") FROM bench;
-- ~2.10 s (median of 3) — only ~9% faster
For comparison, on DuckDB the approximate path is dramatically cheaper than exact. Here the cost is dominated by materializing the BIGINT || '-' || BIGINT string for every row before the distinct/sketch step, so swapping exact for approximate barely moves the needle. Hashing the integer pair directly (e.g. "PULocationID" * 1000 + "DOLocationID") sidesteps it.
Impact: Low — this is a "shape your key as a number, not a string" lesson more than a bug. Filing because APPROX_COUNT_DISTINCT being ~equal to exact is a surprising result that suggests the string concat, not the cardinality estimation, dominates; worth a docs note or a look at whether string construction can be lazier under approximate aggregation.
Notes / context (not issues — positive findings)
- hyperdb beat DuckDB's pre-loaded native table on Q2/Q3/Q4/Q5/Q6/Q7/Q9/Q10 (global agg, low- and high-card GROUP BY, FILTER aggregates, PERCENTILE_CONT, window rank, EXTRACT grouping, GROUPING SETS) by 1.25–3.9×. GROUPING SETS (3.9×) and FILTER aggregates (2.6×) were the standouts.
- The only losses were Q1 (
COUNT(*), ~28 ms vs ~8 ms — trivial fixed overhead, not throughput) and Q8 exact distinct (the string-key issue above).
PERCENTILE_CONT ... WITHIN GROUP, FILTER (WHERE ...), GROUPING SETS, APPROX_COUNT_DISTINCT, and all window functions worked as documented.
- Build path: 34
load_files at concurrency 8 into distinct per-file tables, then one CREATE TABLE bench AS ... UNION ALL ... (= 61 s), was stable. (The unsafe pattern is concurrent appends to a single table — avoided here.)
- Mixed-case TLC column names (
VendorID, PULocationID, DOLocationID, RatecodeID) require double-quoting (42703 otherwise) — expected for a PostgreSQL-family dialect, noted for completeness.
Summary
Follow-up from the same NYC TLC Yellow Taxi exercise as StefanSteiner#1, this time a head-to-head benchmark of
hyperdb-mcpvs DuckDB over the full normalized dataset (299,214,146 rows, 34 Parquet files, Jan + Jul × 2009–2025). hyperdb performed very well — it won 8 of 10 compute queries against DuckDB's own pre-loaded native table, by 1.25–3.9× — but the suite surfaced one dialect blocker and one perf papercut worth filing.Environment:
hyperdb-mcp(this repo), driven from Claude Code. Dialect is Salesforce Data Cloud SQL / Tableau Hyper. Data was warm in the daemon (onebenchtable). Timings are engine-sidestats.elapsed_ms.1. (Dialect gap)
QUALIFYis not supported —42601 syntax errorA standard "top-N per group" query using
QUALIFYon a window function fails:The documented subquery fallback works fine:
Impact: Minor functionally (the fallback is mechanical), but
QUALIFYis a very common idiom and is supported by DuckDB, Snowflake, BigQuery, and Databricks. The MCP server instructions / dialect card advertise extensive window-function support (row_number, rank, modified_rank, IGNORE NULLS, frame modes, etc.) but do not flagQUALIFYas absent, so a user reasonably expects it to work.Suggestion: Support
QUALIFY, or explicitly note its absence (and the subquery-wrap workaround) in the dialect card alongside the window-function list.2. (Perf)
APPROX_COUNT_DISTINCTgives almost no speedup when the argument is a constructed stringExact vs approximate distinct over a 299M-row concatenated key were essentially the same:
For comparison, on DuckDB the approximate path is dramatically cheaper than exact. Here the cost is dominated by materializing the
BIGINT || '-' || BIGINTstring for every row before the distinct/sketch step, so swapping exact for approximate barely moves the needle. Hashing the integer pair directly (e.g."PULocationID" * 1000 + "DOLocationID") sidesteps it.Impact: Low — this is a "shape your key as a number, not a string" lesson more than a bug. Filing because
APPROX_COUNT_DISTINCTbeing ~equal to exact is a surprising result that suggests the string concat, not the cardinality estimation, dominates; worth a docs note or a look at whether string construction can be lazier under approximate aggregation.Notes / context (not issues — positive findings)
COUNT(*), ~28 ms vs ~8 ms — trivial fixed overhead, not throughput) and Q8 exact distinct (the string-key issue above).PERCENTILE_CONT ... WITHIN GROUP,FILTER (WHERE ...),GROUPING SETS,APPROX_COUNT_DISTINCT, and all window functions worked as documented.load_filesat concurrency 8 into distinct per-file tables, then oneCREATE TABLE bench AS ... UNION ALL ...(= 61 s), was stable. (The unsafe pattern is concurrent appends to a single table — avoided here.)VendorID,PULocationID,DOLocationID,RatecodeID) require double-quoting (42703otherwise) — expected for a PostgreSQL-family dialect, noted for completeness.