EXPLAIN / EXPLAIN ANALYZE / EXPLAIN (FORMAT JSON) #23
Merged
…t + DuckDB-shape JSON
Before this change, `EXPLAIN` over the pgwire interface was effectively
unusable: `format-explain-result` rendered five summary lines that were
all blank for any node the legacy formatter didn't special-case (most
notably PSplitAgg), and `EXPLAIN ANALYZE` was rejected by the regex.
The plan-tree text printer also dropped predicates on PDenseGroupBy /
PPercentileAgg, so `EXPLAIN ... WHERE p` produced byte-identical output
to the no-WHERE plan. The filter was still applied at runtime, but the
EXPLAIN output gave no way to confirm that it had been pushed down.
This rewrite replaces the ad-hoc string formatter with a structured
data model and two renderers, and threads per-node timing through the
executor for ANALYZE.
plan.clj
- `plan->data` walks the physical/logical tree and returns a
serializable map per node:
{:op :node-id :est-rows :sel :details :children :child-tags
(:actual-rows :time-ms when ANALYZE)}
- `node-details` produces uniform `[label string-value]` pairs:
Filter / Group Keys / Aggregates / Columns / Join Type / Hash
Cond / Match Cond / Max Key / Extract / etc. Adding a node-type
now means extending one cond, not 12 ad-hoc printers.
- `render-text` produces a Postgres-style indented tree with
`-> ` arrows, labeled sub-lines, and a
`(est-rows=N sel=S actual-rows=M time=T.TTms)` cost suffix.
Renders the `Execution Time:` footer when ANALYZE data is present.
- `render-json` produces a DuckDB-shape `[{name, children,
extra_info}]` data structure with `__estimated_cardinality__`
/ `__cardinality__` / `__timing__` / `__selectivity__` internal
keys. Numbers stay JSON numbers, not "N rows" strings.
- `format-pred` / `format-agg` / `format-col-or-expr` render the
normalized IR back to SQL-like text (`(a > 100)`, `sum(price)`,
`(a * b)`).
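
To make the data-model-plus-renderer split concrete, here is a minimal sketch of a text renderer over a node map of the shape described above. It is not the code in `plan.clj`: `sample-plan`, `cost-suffix` and `render-lines` are hypothetical names, and the indentation widths are an assumption.

```clojure
(require '[clojure.string :as str])

;; Hypothetical plan data in the {:op :est-rows :sel :details :children} shape.
(def sample-plan
  {:op :PSplitAgg
   :est-rows 1
   :details [["Aggregates" "[min(a), median(a), max(a)]"]]
   :children [{:op :PDenseGroupBy
               :est-rows 1000
               :sel 0.5
               :details [["Aggregates" "[min(a), max(a)]"]
                         ["Filter" "(a > 100)"]]
               :children []}]})

(defn cost-suffix
  "Builds the parenthesised suffix from whichever cost keys are present."
  [{:keys [est-rows sel actual-rows time-ms]}]
  (let [parts (cond-> []
                est-rows    (conj (str "est-rows=" est-rows))
                sel         (conj (str "sel=" sel))
                actual-rows (conj (str "actual-rows=" actual-rows))
                time-ms     (conj (format "time=%.2fms" (double time-ms))))]
    (if (seq parts)
      (str " (" (str/join " " parts) ")")
      "")))

(defn render-lines
  "Returns a vector of text lines for a node followed by its children."
  ([node] (render-lines node 0))
  ([node depth]
   (let [indent (apply str (repeat (* 2 depth) " "))
         arrow  (if (zero? depth) "" "-> ")
         pad    (apply str (repeat (count arrow) " "))
         head   (str indent arrow (name (:op node)) (cost-suffix node))
         subs   (for [[label value] (:details node)]
                  (str indent pad "  " label ": " value))]
     (into (vec (cons head subs))
           (mapcat #(render-lines % (inc depth)) (:children node))))))

(run! println (render-lines sample-plan))
;; PSplitAgg (est-rows=1)
;;   Aggregates: [min(a), median(a), max(a)]
;;   -> PDenseGroupBy (est-rows=1000 sel=0.5)
;;        Aggregates: [min(a), max(a)]
;;        Filter: (a > 100)
```

Because every node contributes uniform `[label value]` pairs, a predicate only has to appear in `:details` to show up in the output, which is what the PSplitAgg + WHERE regression relies on.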
executor.clj
- `*explain-collector*` dynamic var (atom). When bound,
`execute-node` records `(System/nanoTime)` deltas and output
row count per call, keyed by node identity-hash.
- `execute-node` is now split into a public wrapper (measures + records)
and `execute-node-impl` (the dispatch cond). Timing is inclusive:
every recursive child call also goes through the wrapper, so a
parent's recorded time includes its children's, matching
Postgres / DuckDB EXPLAIN ANALYZE semantics. Negligible overhead
when the collector is nil (one nil check per call).
- `count-output` handles the three return shapes
(column ctx, columnar result, row vec).
- `explain-analyze-query` runs the query under the collector,
merges per-node timings into the data tree, and adds a root
`:execution-time-ms`.
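
The wrapper/impl split with a dynamically bound collector can be sketched as follows. This is an illustration under assumptions, not the executor's code: `*collector*`, `run-node`, `run-node-impl` and `run-analyzed` are hypothetical names, the two toy operators stand in for the real dispatch, and stats are keyed by `:node-id` rather than by identity hash to keep the example deterministic.

```clojure
;; Hypothetical stand-in for the collector var; nil means "not analyzing".
(def ^:dynamic *collector* nil)

(declare run-node)

(defn- run-node-impl
  "Toy dispatch: :scan returns its rows, :filter keeps matching child rows.
  Children are executed through run-node, so they are timed as well."
  [node]
  (case (:op node)
    :scan   (:rows node)
    :filter (filterv (:pred node) (run-node (:child node)))))

(defn run-node
  "Public wrapper. When a collector atom is bound, records elapsed
  nanoseconds and output row count; otherwise calls straight through."
  [node]
  (if-let [c *collector*]
    (let [t0     (System/nanoTime)
          result (run-node-impl node)
          dt     (- (System/nanoTime) t0)]
      (swap! c assoc (:node-id node) {:time-ns dt :actual-rows (count result)})
      result)
    (run-node-impl node)))

(defn run-analyzed
  "Runs the plan with a fresh collector and returns result plus per-node stats."
  [plan]
  (let [c (atom {})]
    (binding [*collector* c]
      {:result (run-node plan) :stats @c})))

(def toy-plan
  {:op :filter :node-id 1 :pred even?
   :child {:op :scan :node-id 0 :rows [1 2 3 4]}})
```

Calling `(run-analyzed toy-plan)` yields the filtered rows together with a stats map in which the filter's `:time-ns` is at least the scan's, since the parent's interval encloses the child's. The toy operators are eager (`filterv`); a lazy result realized outside the `binding` would escape the measurement.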
sql.clj
- `parse-explain-prefix` accepts six DuckDB/Postgres-style forms:
EXPLAIN <sql>
EXPLAIN ANALYZE <sql>
EXPLAIN VERBOSE <sql>
EXPLAIN (ANALYZE) <sql>
EXPLAIN (FORMAT JSON) <sql>
EXPLAIN (ANALYZE, FORMAT JSON) <sql>
- Returns the new richer shape:
{:explain {:options {:analyze? :format} :inner {:query|:system ...}}}
so the wire layer can dispatch on format without re-parsing.
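
A rough sketch of how the six prefix forms could be recognized is shown below. It is not the parser in `sql.clj`: `parse-explain-prefix-sketch` is a hypothetical name, `:inner` holds the raw SQL string instead of a parsed `{:query|:system ...}` map, and treating VERBOSE as accepted-but-ignored is an assumption.

```clojure
(require '[clojure.string :as str])

(defn parse-explain-prefix-sketch
  "Returns {:explain {:options {:analyze? :format} :inner sql}} when sql
  starts with EXPLAIN, or nil otherwise."
  [sql]
  (when-let [[_ rest-sql] (re-matches #"(?is)\s*EXPLAIN\b\s*(.*)" sql)]
    (let [;; Parenthesised option list, e.g. (ANALYZE, FORMAT JSON)
          [_ opt-list after-parens] (re-matches #"(?s)\(([^)]*)\)\s*(.*)" rest-sql)
          ;; Bare keyword form, e.g. EXPLAIN ANALYZE <sql>
          [_ bare after-bare] (re-matches #"(?is)(ANALYZE|VERBOSE)\b\s*(.*)" rest-sql)
          words (->> (cond
                       opt-list (str/split (str/upper-case opt-list) #"[\s,]+")
                       bare     [(str/upper-case bare)]
                       :else    [])
                     (remove str/blank?))
          ;; The token following FORMAT, if any
          fmt   (second (drop-while #(not= "FORMAT" %) words))]
      {:explain {:options {:analyze? (boolean (some #{"ANALYZE"} words))
                           :format   (if (= "JSON" fmt) :json :text)}
                 :inner   (cond
                            opt-list after-parens
                            bare     after-bare
                            :else    rest-sql)}})))
```

For example, `(parse-explain-prefix-sketch "EXPLAIN (ANALYZE, FORMAT JSON) SELECT 1")` returns `{:explain {:options {:analyze? true, :format :json}, :inner "SELECT 1"}}`, and a statement without the prefix returns `nil`.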
server.clj
- `format-explain-result` rewritten to consume the new shape and
dispatch on `:format`. Text: one pgwire row per tree line under
column "QUERY PLAN" (the same result shape Postgres returns for EXPLAIN).
JSON: a single row with the pretty-printed JSON document.
- Inline `json-write-str` avoids a new dep on `clojure.data.json`.
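
For a sense of how small a dependency-free JSON writer can be, here is a minimal sketch. It is not the `json-write-str` from this change: `json-str` and `escape-json-string` are hypothetical names, the output is compact instead of pretty-printed, and control characters other than newline, carriage return and tab are not escaped.

```clojure
(require '[clojure.string :as str])

(defn- escape-json-string
  "Wraps s in double quotes, escaping quotes, backslashes and common
  whitespace control characters."
  [s]
  (str "\""
       (str/escape s {\" "\\\"" \\ "\\\\" \newline "\\n" \return "\\r" \tab "\\t"})
       "\""))

(defn json-str
  "Serialises nil, booleans, numbers, strings, keywords, maps and sequential
  collections to compact JSON. Ratios and other exotic numbers are not handled."
  [x]
  (cond
    (nil? x)        "null"
    (boolean? x)    (str x)
    (number? x)     (str x)
    (string? x)     (escape-json-string x)
    (keyword? x)    (escape-json-string (name x))
    (map? x)        (str "{"
                         (str/join ","
                                   (for [[k v] x]
                                     (str (json-str (if (keyword? k) (name k) (str k)))
                                          ":"
                                          (json-str v))))
                         "}")
    (sequential? x) (str "[" (str/join "," (map json-str x)) "]")
    :else           (throw (ex-info "Unsupported JSON value" {:value x}))))
```

Numbers are emitted as JSON numbers, so a DuckDB-shape node such as `{:name "PSplitAgg" :extra_info {:__estimated_cardinality__ 1}}` keeps its cardinality numeric in the output.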
api.clj / query.clj
- `explain` accepts `{:analyze? :format}` opts. Result map adds
`:plan-data` (structured tree), `:plan-tree` (text render),
`:plan-json` (DuckDB-shape data) when JSON requested,
`:execution-time-ms` when ANALYZE. Legacy `:strategy` /
`:n-rows` / `:columns` keys preserved so existing callers don't
break. SQL EXPLAIN options auto-propagate when calling
`api/explain` with an `EXPLAIN ...` prefix string.
Tests: 12 new tests / 51 assertions in `test/stratum/explain_test.clj`
covering text shape, JSON shape, ANALYZE timing presence, all six SQL
syntax variants, end-to-end SQL prefix flow, legacy key back-compat,
and a direct regression test that PSplitAgg + WHERE renders
differently from PSplitAgg alone (the original bug). One existing
test updated (`asof_join_test.clj`: `match=` → `Match Cond:`). Full
suite passes: 1024 tests, 4731 assertions, 0 failures.
Out of scope for this PR (deliberate):
- Logical-vs-optimized-vs-physical multi-plan output. We only have
one physical plan, so DuckDB's three-banner output is overkill.
- HTML / GRAPHVIZ / YAML / MERMAID formats. Tooling can consume the
JSON.
- Buffer / IO stats. No instrumentation today.
- Per-thread breakdown of morsel-parallel nodes. Single wall-clock
span per node is what users want for latency triage.
Signed-off-by: Christian Weilbach <christian@weilbach.name>
Summary
- `EXPLAIN ANALYZE` now runs the query under instrumentation and surfaces per-node `actual-rows` + `time-ms` plus an `Execution Time:` footer, with inclusive timing per Postgres/DuckDB convention.
- Accepted syntaxes: `EXPLAIN`, `EXPLAIN ANALYZE`, `EXPLAIN VERBOSE`, `EXPLAIN (ANALYZE)`, `EXPLAIN (FORMAT JSON)`, `EXPLAIN (ANALYZE, FORMAT JSON)`.
- `EXPLAIN ... WHERE p` previously produced byte-identical output to the no-WHERE plan on PSplitAgg sub-plans (the filter was applied at runtime, just dropped from the printed tree). Includes a direct regression test.

What changed
| File | Change |
| --- | --- |
| `plan.clj` | `plan->data` data model, `node-details` uniform `[label, value]` pairs, `render-text` (Postgres-style), `render-json` (DuckDB shape), `format-pred` / `format-agg` SQL-like printers |
| `executor.clj` | `*explain-collector*` dynamic var; `execute-node` split into wrap + impl for inclusive per-node timing; `explain-analyze-query` entry point |
| `sql.clj` | `parse-explain-prefix` handles all six modifier syntaxes; new richer return shape `{:options {...} :inner {...}}` |
| `server.clj` | `format-explain-result` rewritten to dispatch on `:format`; one pgwire row per tree line for text, single row for JSON; inline `json-write-str` (no new dep) |
| `api.clj` / `query.clj` | `:analyze?` / `:format` opts; legacy `:strategy` / `:n-rows` / `:columns` keys preserved |
| `test/stratum/explain_test.clj` | 12 new tests / 51 assertions |
| `test/stratum/asof_join_test.clj` | `match=` → `Match Cond:` to match the new labeled format |

Sample output
`EXPLAIN ANALYZE SELECT MIN(a), MEDIAN(a), MAX(a) FROM t WHERE a > 100`:

`EXPLAIN (FORMAT JSON) ...`:

    [{"name": "PSplitAgg",
      "children": [{"name": "PDenseGroupBy", ...,
                    "extra_info": {"Aggregates": "[min(a), max(a)]",
                                   "Filter": "(a > 100)",
                                   "__estimated_cardinality__": 1000}},
                   ...],
      "extra_info": {"Aggregates": "[min(a), median(a), max(a)]",
                     "Strategies": "2",
                     "__estimated_cardinality__": 1}}]

Out of scope (deliberate)
Test plan
- `clj -M:ffix` applied (cljfmt clean).
- `EXPLAIN`, `EXPLAIN ANALYZE`, `EXPLAIN (FORMAT JSON)` against a registered table.