columnar: serialize CREATE INDEX on columnar tables for PG19#8618
columnar: serialize CREATE INDEX on columnar tables for PG19#8618ihalatci wants to merge 1 commit into
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## pg19-support #8618 +/- ##
===============================================
Coverage ? 88.95%
===============================================
Files ? 288
Lines ? 64379
Branches ? 8093
===============================================
Hits ? 57268
Misses ? 4779
Partials ? 2332 🚀 New features to boost your workflow:
|
a31f940 to
83042a9
Compare
9d9144a to
3a0a90e
Compare
83042a9 to
59c5dc3
Compare
There was a problem hiding this comment.
Pull request overview
This PR addresses PostgreSQL 19’s default parallel CREATE INDEX behavior to ensure columnar TableAM index builds run safely in a serial mode and no longer fail due to missing/unsupported parallel-scan plumbing.
Changes:
- Implement parallel-scan callback support for columnar by delegating initialization/reinitialization to the block-table helper callbacks (to satisfy PG19 callers that size/init parallel scan state unconditionally).
- Ensure columnar index builds ignore any provided parallel scan descriptor and proceed via the serial scan path.
- Force
max_parallel_maintenance_workers=0forCREATE INDEXtargeting columnar relations by temporarily nesting/restoring the GUC aroundPrevProcessUtilityHook.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| static Size | ||
| columnar_parallelscan_estimate(Relation rel) | ||
| { | ||
| elog(ERROR, "columnar_parallelscan_estimate not implemented"); | ||
| return sizeof(ParallelBlockTableScanDescData); | ||
| } |
3a0a90e to
2ff1cfd
Compare
PG19 enables parallel CREATE INDEX by default (max_parallel_maintenance_workers defaults to 2). Columnar's TableAM is always serial: rs_parallel is stored but never read, and the build path flushes pending writes, which is disallowed inside a parallel operation. Three changes: 1. Provide working parallelscan_estimate/initialize/reinitialize that delegate to the table_block_* helpers, so callers that unconditionally size and initialize a ParallelTableScanDesc (e.g. PG19's parallel btree build) do not abort. Columnar ignores the descriptor, so the state is simply unused. 2. In columnar_index_build_range_scan, accept a parallel scan descriptor by discarding it (scan = NULL) and running the serial path. 3. In ColumnarProcessUtility, when the statement is CREATE INDEX on a columnar AM relation, force max_parallel_maintenance_workers=0 via NewGUCNestLevel/AtEOXact_GUC around PrevProcessUtilityHook so the build runs serially without triggering "cannot update tuples during a parallel operation". DESCRIPTION: Serialize CREATE INDEX on columnar tables on PG19 Closes #8611 Part of #8597
59c5dc3 to
c70d228
Compare
| saveNestLevel = NewGUCNestLevel(); | ||
| (void) set_config_option("max_parallel_maintenance_workers", "0", | ||
| PGC_USERSET, PGC_S_SESSION, | ||
| GUC_ACTION_SAVE, true, 0, false); |
There was a problem hiding this comment.
Should the removal of parallelism also be applied for REINDEX statements ? They have a separate node tag, T_ReindexStmt, which is not directly handled by this switch statement afaict, so falls to the default branch. As is, would a REINDEX of a columnar table with pending writes get parallel mode and hit the failure ?
| if (indexBuildOnColumnar) | ||
| { | ||
| saveNestLevel = NewGUCNestLevel(); | ||
| (void) set_config_option("max_parallel_maintenance_workers", "0", |
There was a problem hiding this comment.
nit: (void) goes against existing calls of set_config_option() (e.g. in PostprocessReassignOwnedStmt(), citus_internal_database_command())
| AtEOXact_GUC(true, saveNestLevel); | ||
| } | ||
| } | ||
| PG_END_TRY(); |
There was a problem hiding this comment.
Wondering if the current tests in columnar_index.sql are enough to validate the fixes ?
A couple of test suggestions:
- set max_parallel_maintenance_workers to e..g 4 before a
CREATE INDEXon a columnar table, and then show that it is still 4 after theCREATE INDEX. - Similar GUC setting for a REINDEX operation;
SET LOCAL debug_parallel_query = regress;
SET LOCAL max_parallel_workers = 4;
REINDEX TABLE parallel_scan_test;
REINDEX TABLE CONCURRENTLY parallel_scan_test;
SHOW LOCAL max_parallel_workers;
Stacked PR (PR3 of the PG19 stack). Base =
pg19-runtime-fixes(#8617) →pg19-ci-test-matrices(#8616) →pg19-ruleutils-port(#8602) →pg19-support. Review/merge in stack order.PG19 enables parallel
CREATE INDEXby default (max_parallel_maintenance_workersdefaults to 2). Columnar's TableAM is always serial —rs_parallelis stored but never read, and the build path flushes pending writes, which is disallowed inside a parallel operation. This causedCREATE INDEXon columnar tables to fail on PG19.Three changes (all in
columnar_tableam.c):parallelscan_estimate/initialize/reinitializenow delegate to thetable_block_*helpers, so callers that unconditionally size and initialize aParallelTableScanDesc(e.g. PG19's parallel btree build) no longer abort. Columnar ignores the descriptor, so the written state is simply unused.columnar_index_build_range_scanaccepts a parallel scan descriptor by discarding it (scan = NULL) and running the serial path.ColumnarProcessUtilityforcesmax_parallel_maintenance_workers=0(viaNewGUCNestLevel/AtEOXact_GUCaroundPrevProcessUtilityHook) forCREATE INDEXon a columnar AM relation, so the build runs serially.Closes #8611
Part of #8597