Shard move in block_writes mode fails with idle_in_transaction_sessio…#8495
Merged
codeforall merged 1 commit intorelease-14.0from Mar 6, 2026
Merged
Shard move in block_writes mode fails with idle_in_transaction_sessio…#8495codeforall merged 1 commit intorelease-14.0from
codeforall merged 1 commit intorelease-14.0from
Conversation
…n_timeout on metadata workers (#8484) ### Description When performing a shard move using block_writes transfer mode (either directly via citus_move_shard_placement or through the background rebalancer), the operation can fail with: ``` ERROR: terminating connection due to idle-in-transaction timeout CONTEXT: while executing command on <worker_host>:<worker_port> ``` The failing worker is a metadata worker that is neither the source nor the target of the shard move. ### Root Cause LockShardListMetadataOnWorkers() opens coordinated transactions on all metadata workers to acquire advisory shard metadata locks via SELECT lock_shard_metadata(...). These transactions remain open until the entire shard move completes and the coordinated transaction commits. In block_writes mode, the data copy phase (CopyShardsToNode) runs synchronously between the source and target workers. Metadata workers not involved in the copy have no commands to execute and their connections sit completely idle-in-transaction for the entire duration of the data copy. For large shards, the copy can take significantly longer than common idle_in_transaction_session_timeout values, When the timeout fires on an uninvolved worker, PostgreSQL terminates the connection, causing the shard move to fail. This also affects shard splits, since they follow the same code path through LockShardListMetadataOnWorkers. ### Fix LockShardListMetadataOnWorkers() should send SET LOCAL idle_in_transaction_session_timeout = 0 on each metadata worker connection before acquiring the locks. SET LOCAL scopes the change to the current transaction only, so normal sessions on the workers are unaffected.
ihalatci
approved these changes
Mar 6, 2026
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## release-14.0 #8495 +/- ##
================================================
- Coverage 88.73% 88.73% -0.01%
================================================
Files 287 287
Lines 63233 63235 +2
Branches 7922 7921 -1
================================================
+ Hits 56109 56110 +1
- Misses 4857 4858 +1
Partials 2267 2267 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
…n_timeout on metadata workers (#8484)
Description
When performing a shard move using block_writes transfer mode (either directly via citus_move_shard_placement or through the background rebalancer), the operation can fail with:
The failing worker is a metadata worker that is neither the source nor the target of the shard move.
Root Cause
LockShardListMetadataOnWorkers() opens coordinated transactions on all metadata workers to acquire advisory shard metadata locks via SELECT lock_shard_metadata(...). These transactions remain open until the entire shard move completes and the coordinated transaction commits.
In block_writes mode, the data copy phase (CopyShardsToNode) runs synchronously between the source and target workers. Metadata workers not involved in the copy have no commands to execute and their connections sit completely idle-in-transaction for the entire duration of the data copy.
For large shards, the copy can take significantly longer than common idle_in_transaction_session_timeout values, When the timeout fires on an uninvolved worker, PostgreSQL terminates the connection, causing the shard move to fail.
This also affects shard splits, since they follow the same code path through LockShardListMetadataOnWorkers.
Fix
LockShardListMetadataOnWorkers() should send SET LOCAL idle_in_transaction_session_timeout = 0 on each metadata worker connection before acquiring the locks. SET LOCAL scopes the change to the current transaction only, so normal sessions on the workers are unaffected.
DESCRIPTION: PR description that will go into the change log, up to 78 characters