⚡ Bolt: fast-path for single-batch build side in HashJoin #272
This optimization identifies and eliminates redundant work during the build phase of a hash join, specifically when dealing with a single input batch (a very common case for broadcast joins or small tables).

Key improvements:
- Avoids unnecessary `concat_batches` call for single-batch inputs.
- Eliminates redundant evaluations of join key expressions. Previously, keys could be evaluated up to 4 times (PHJ check, hash map construction, PHJ null check, and final batch preparation). Now they are evaluated once and reused.
- Refactored PHJ applicability logic into cleaner helper functions.

Measurable impact:
- Reduced CPU usage during the build phase of HashJoin for small/broadcast tables.
- Reduced peak memory usage by avoiding shallow copies and temporary concatenations.
- Faster query startup for joins with small build-side tables.

Co-authored-by: Dandandan <163737+Dandandan@users.noreply.github.com>
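The "evaluate once and reuse" idea from the description can be illustrated with a small std-only sketch. It is not the code in this PR: `KeyExpr`, `BuildInfo`, and `prepare_build_side` are hypothetical names, the key column is modeled as a plain `Vec<Option<i64>>`, and the min/max bounds stand in for whatever the PHJ applicability check actually needs.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical stand-in for a join key expression.
/// It counts how many times it has been evaluated.
struct KeyExpr {
    evaluations: Cell<usize>,
}

impl KeyExpr {
    fn new() -> Self {
        Self { evaluations: Cell::new(0) }
    }

    /// "Evaluates" the key column for a batch (None models a NULL key).
    fn evaluate(&self, batch: &[Option<i64>]) -> Vec<Option<i64>> {
        self.evaluations.set(self.evaluations.get() + 1);
        batch.to_vec()
    }
}

/// Everything the build phase derives from the key column in this sketch.
struct BuildInfo {
    bounds: Option<(i64, i64)>,
    has_nulls: bool,
    map: HashMap<i64, Vec<usize>>,
}

/// Evaluates the key expression once and derives bounds, the null check,
/// and the hash map from that single result.
fn prepare_build_side(expr: &KeyExpr, batch: &[Option<i64>]) -> BuildInfo {
    let keys = expr.evaluate(batch); // the only evaluation

    // Min/max over non-null keys (assumed input to a PHJ-style check).
    let mut non_null = keys.iter().flatten().copied();
    let bounds = non_null
        .next()
        .map(|first| non_null.fold((first, first), |(lo, hi), k| (lo.min(k), hi.max(k))));

    let has_nulls = keys.iter().any(Option::is_none);

    // Key -> row indices; NULL keys are skipped.
    let mut map: HashMap<i64, Vec<usize>> = HashMap::new();
    for (row, key) in keys.iter().enumerate() {
        if let Some(k) = key {
            map.entry(*k).or_default().push(row);
        }
    }

    BuildInfo { bounds, has_nulls, map }
}
```

Calling `prepare_build_side` on one batch leaves the evaluation counter at 1, whereas a version where each of the three consumers calls `evaluate` itself would leave it at 3.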
This PR implements a performance fast-path for `HashJoinExec` when the build-side input consists of only a single `RecordBatch`.

In the original implementation, the build side would always undergo a `concat_batches` operation and multiple redundant evaluations of the join key expressions (once during streaming to compute bounds, once during PHJ check, and once more for the final concatenated batch).

The optimized implementation:
- Avoids `concat_batches` by using the single batch directly.

This reduces both CPU overhead (expression evaluation and concatenation logic) and minor memory allocations during the hash join build phase. Tests have verified correctness across standard and perfect hash join paths.
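As a rough illustration of the single-batch fast path, the sketch below borrows the lone batch instead of copying it and falls back to concatenation otherwise. This is a simplified, std-only model and not the PR's code: `Batch`, `concat`, and `build_side` are hypothetical names, and a batch is reduced to a single `Vec<i64>` column.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for a record batch: one column of values.
type Batch = Vec<i64>;

/// Stand-in for a concatenation step: copies every row into a new batch.
fn concat(batches: &[Batch]) -> Batch {
    batches.iter().flatten().copied().collect()
}

/// Returns the build-side rows.
/// With exactly one input batch the data is borrowed as-is (no copy);
/// any other number of batches goes through `concat`.
fn build_side(batches: &[Batch]) -> Cow<'_, [i64]> {
    match batches {
        [single] => Cow::Borrowed(single.as_slice()),
        many => Cow::Owned(concat(many)),
    }
}
```

`Cow` makes the two paths visible in the return type: `Cow::Borrowed` for the fast path and `Cow::Owned` when a concatenated copy had to be built. Callers read the rows the same way in both cases.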
PR created automatically by Jules for task 16763121814263366103 started by @Dandandan