-
Notifications
You must be signed in to change notification settings - Fork 268
Add fast path for GpuNvl when lhs has no nulls #14050
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Skip isNotNull + ifElse when lhs.getNullCount == 0 - Directly return lhs.incRefCount() for better performance Signed-off-by: Hongbin Ma (Mahone) <mahongbin@apache.org>
|
build |
Greptile SummaryOptimization that adds a fast path to Key changes:
Confidence Score: 5/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller
participant GpuNvl
participant ColumnVector as lhs ColumnVector
participant RHS as rhs (ColumnVector/Scalar)
Caller->>GpuNvl: apply(lhs, rhs)
GpuNvl->>ColumnVector: getNullCount()
alt lhs has no nulls (nullCount == 0)
Note over GpuNvl: Fast path optimization
GpuNvl->>ColumnVector: incRefCount()
ColumnVector-->>GpuNvl: return lhs with incremented ref
GpuNvl-->>Caller: return lhs
else lhs has nulls (nullCount > 0)
Note over GpuNvl: Original path
GpuNvl->>ColumnVector: isNotNull()
ColumnVector-->>GpuNvl: boolean mask
GpuNvl->>GpuNvl: ifElse(lhs, rhs) using mask
Note over GpuNvl: withResource closes mask
GpuNvl-->>Caller: return result
end
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a performance optimization to the GpuNvl object by introducing a fast path when the left-hand side column has no null values. Instead of performing the isNotNull check and ifElse operation, it directly returns the lhs column with an incremented reference count.
- Added fast path optimization that skips unnecessary null checks when lhs.getNullCount == 0
- Applied the same optimization to both overloaded apply methods (ColumnVector and Scalar variants)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Fast path: if lhs has no nulls, just return lhs directly | ||
| if (lhs.getNullCount == 0) { | ||
| lhs.incRefCount() | ||
| } else { |
Copilot
AI
Dec 22, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fast path optimization when lhs has no nulls should have explicit test coverage. Consider adding a test case in conditionals_test.py that exercises the scenario where the first argument to nvl() has nullable=False, to ensure this optimization path is tested. For example, you could add a test like test_nvl with binary_op_df using a data_gen with nullable=False.
revans2
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good, but I don't see any performance numbers. Might be worth filing a follow on issue to look at all other null related expressions. In the past null count was something that was calculated lazily, but then CUDF changed it and we never went back and optimized things knowing that it is not cheap to use.
| if (lhs.getNullCount == 0) { | ||
| lhs.incRefCount() | ||
| } else { | ||
| withResource(lhs.isNotNull) { isLhsNotNull => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If we are doing this level of optimization, would it be worth testing to see the performance improvement for using the following APIs instead.
abellina
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same questions as @revans2. Do note we need to update copyright.
This PR is broke up from #14032, please visit that to see more details