Add withRetry to GpuGenerateExec and GpuTextBasedPartitionReader #13996
firestarman merged 26 commits into NVIDIA:main
Conversation
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Greptile Summary
This PR adds OOM retry protection to two GPU memory allocation hotspots.
Key changes:
These changes contribute to issue #13672's goal of covering all memory allocation points with retry protection.
Confidence Score: 4/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant GpuGenerateExec
    participant GpuGenerateUtils
    participant withRetry
    participant GpuGenerateIterator
    participant Generator
    Caller->>GpuGenerateExec: doGenerateAndClose(input)
    GpuGenerateExec->>GpuGenerateExec: projectAndCloseWithRetrySingleBatch()
    GpuGenerateExec->>GpuGenerateUtils: getSplitsWithRetryAndClose(projectedInput)
    GpuGenerateUtils->>GpuGenerateUtils: Create SpillableColumnarBatch
    GpuGenerateUtils->>withRetry: withRetry(batch, splitFunction)
    alt OOM occurs
        withRetry->>withRetry: Split batch in half
        withRetry->>GpuGenerateUtils: Retry with smaller batch
    end
    withRetry->>Generator: inputSplitIndices()
    Generator-->>withRetry: splitIndices
    withRetry->>GpuGenerateUtils: makeSplits(batch, indices)
    GpuGenerateUtils-->>GpuGenerateExec: Iterator[Array[SpillableColumnarBatch]]
    GpuGenerateExec->>GpuGenerateIterator: new(splits, generator)
    GpuGenerateIterator->>Caller: Iterator[ColumnarBatch]
    loop For each output batch
        Caller->>GpuGenerateIterator: hasNext/next()
        alt generateIter is empty
            GpuGenerateIterator->>GpuGenerateIterator: Get next bundle from inputs
            GpuGenerateIterator->>Generator: generate(safeIteratorFromSeq(bundle))
            Generator-->>GpuGenerateIterator: Iterator[ColumnarBatch]
        end
        GpuGenerateIterator->>Generator: Call generateIter.next()
        Generator-->>GpuGenerateIterator: ColumnarBatch
        GpuGenerateIterator-->>Caller: ColumnarBatch
    end
```
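The split-and-retry control flow in the diagram can be sketched in plain Scala. Everything below is a toy stand-in invented for illustration (`Batch`, `SimulatedOomException`, `withSplitRetry` are not real plugin names): the actual code uses `RmmRapidsRetryIterator.withRetry` over a `SpillableColumnarBatch` and reacts to real GPU OOM exceptions.

```scala
// Toy placeholders: `Batch` stands in for a columnar batch, and
// `SimulatedOomException` for a GPU out-of-memory failure.
case class Batch(rows: Int)
class SimulatedOomException extends RuntimeException("OOM")

// Split-and-retry: run `op` on the batch; on OOM, split the batch in half
// and recurse on each half, so progressively smaller batches are attempted.
def withSplitRetry[T](batch: Batch)(op: Batch => T): Seq[T] = {
  try {
    Seq(op(batch))
  } catch {
    case _: SimulatedOomException if batch.rows > 1 =>
      val left  = Batch(batch.rows / 2)
      val right = Batch(batch.rows - batch.rows / 2)
      withSplitRetry(left)(op) ++ withSplitRetry(right)(op)
  }
}

// An op that only "fits" in memory when the batch has at most 4 rows.
def limitedOp(b: Batch): Int =
  if (b.rows > 4) throw new SimulatedOomException else b.rows

// 16 rows fail, split to 8/8, then 4/4/4/4, which all succeed.
val results = withSplitRetry(Batch(16))(limitedOp)
```

No rows are lost across the splits: the results still account for all 16 input rows, just produced in smaller pieces.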
Pull request overview
This PR adds retry framework support to three uncovered cases of GPU device memory allocation to prevent potential GPU OOM errors: batched bounded window computation, CSV cast operations, and generate split calculations.
Key changes:
- Wrapped batched bounded window computation with `withRetryNoSplit` to handle OOM during window operations
- Added retry logic to CSV cast table operations with `withRetryNoSplit`
- Enhanced generate `getSplits` with `withRetry` using adaptive target size splitting on OOM
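For the first two bullets, a no-split retry simply re-runs the same operation after an OOM. The following is a hedged plain-Scala sketch — `withRetryNoSplitSketch` and `SimulatedOom` are made-up names; the real `withRetryNoSplit` also makes its input spillable and coordinates with the plugin's retry framework, none of which is modeled here.

```scala
// Stand-in for a GPU out-of-memory failure.
class SimulatedOom extends RuntimeException("OOM")

// Re-run `op` unchanged up to `maxTries` times when an OOM-like
// failure occurs; rethrow once the attempts are exhausted.
def withRetryNoSplitSketch[T](maxTries: Int)(op: () => T): T = {
  var attempt = 0
  while (true) {
    attempt += 1
    try {
      return op()
    } catch {
      case _: SimulatedOom if attempt < maxTries => // retry unchanged
    }
  }
  throw new IllegalStateException("unreachable")
}

// Fails twice, then succeeds, mimicking memory pressure that clears
// after other tasks spill or release memory.
var calls = 0
val value = withRetryNoSplitSketch(5) { () =>
  calls += 1
  if (calls < 3) throw new SimulatedOom
  42
}
```

The key contrast with the generate path is that nothing is split: the operation either eventually fits as-is or fails for good.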
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sql-plugin/src/main/scala/com/nvidia/spark/rapids/window/GpuBatchedBoundedWindowExec.scala | Wraps bounded window computation with withRetryNoSplit to handle OOM errors |
| sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuTextBasedPartitionReader.scala | Adds retry logic to castTableToDesiredTypes for CSV parsing |
| sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuGenerateExec.scala | Implements retry with adaptive target size splitting for getSplits operation |
| tests/src/test/scala/com/nvidia/spark/rapids/WindowRetrySuite.scala | Adds test cases for bounded window retry on GpuRetryOOM and GpuSplitAndRetryOOM |
| tests/src/test/scala/com/nvidia/spark/rapids/GpuGenerateSuite.scala | Adds test for generate split-and-retry OOM handling |
| tests/src/test/scala/com/nvidia/spark/rapids/CsvScanRetrySuite.scala | Adds test for CSV cast table retry on OOM |
build
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
@greptile full review
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
…xec.scala Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
build
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
…PartitionReader.scala Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
build
LGTM, just nits.
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
build
LGTM
```scala
 * A INT32 column in cuDF may be from either YearMonthIntervalType or IntegerType
 * in Spark.
 */
def infer(col: ColumnView): DataType = col.getType match {
```
I would very much prefer it if we had a SpillableTable or SpillableCudfColumnArray instead of a SpillableColumnarBatch for this. The only reason we do a lot of SpillableColumnarBatch is because we know the Spark types almost everywhere. But internally they are only ever kept in CPU memory so we can recreate the same object as before. There is no reason to make up bogus Spark types so we can cache them in memory just so that we can pick them apart and then put them back together again afterwards. Or worse, we end up using those fake types in places we should not.
This requires more work than I thought. It takes some time. I am going to split it into a separate PR.
I agree that this is a lot of work. I am fine if we keep this as is for now and do the work in a follow on issue, so long as you make it very clear from the comments that this should not be used the way it is today anywhere else. That or you make a wrapper class to have a SpillableTable, that hides this as private methods with very clear comments about why you are doing it and what is the issue that is filed to clean it up/fix it.
It is ok, I am adding in the SpillableTable with AI help.
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
This is now a problem. A kind of minor problem because we want to get rid of the old legacy op time, but the NVTX range does not actually cover all of the cases for the split, because it can be computed lazily when the iterator calls next.
Thanks, updated to calculate the op time at the two separate places where project and split are actually executed.
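The fix described here — measuring op time where each stage actually runs, instead of one enclosing range that misses lazily-executed work — can be sketched as follows. `opTimeNs`, `timed`, and the toy `project`/`split` functions are all invented for illustration; the real code accumulates into a `GpuMetric`, and the split may run later, when the downstream iterator is consumed.

```scala
// A simple nanosecond accumulator standing in for an op-time metric.
var opTimeNs = 0L

// Wrap a stage so its elapsed time is added to the shared accumulator
// at the moment the stage actually executes.
def timed[T](body: => T): T = {
  val start = System.nanoTime()
  try body finally { opTimeNs += System.nanoTime() - start }
}

// Stage 1: measured where the projection really happens.
def project(rows: Int): Int = timed { rows * 2 }

// Stage 2: measured independently, so lazy execution is still counted.
def split(rows: Int): Seq[Int] = timed { Seq(rows / 2, rows - rows / 2) }

val projected = project(100)
val splits = split(projected) // time is added only when this actually runs
```

Because each stage times itself, the metric stays correct even if the second call is deferred until the iterator's `next()`.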
Signed-off-by: Firestarman <firestarmanllc@gmail.com>
….scala Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
revans2 left a comment
Just a nit at this point
```scala
  getSplits(projectedInput, othersProjectList, new RapidsConf(conf).gpuTargetBatchSizeBytes)
}
val splits = GpuGenerateUtils.getSplitsWithRetryAndClose(projectedInput, generator,
  othersProjectList.length, outer, new RapidsConf(conf).gpuTargetBatchSizeBytes, opTime)
```
new RapidsConf(conf) is not a cheap operation. At a minimum can we cache the result so that we don't take the hit for all batches?
It is always good to make the code better even if it is not related to the main goal of the current PR.
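The reviewer's point — that `new RapidsConf(conf)` is expensive and should not run once per batch — can be illustrated with a toy counter. `buildConf` here is a hypothetical stand-in that merely records how often the expensive construction runs; a `lazy val` (or computing the value once up front) bounds it at one regardless of how many batches are processed.

```scala
// Counts how many times the "expensive" construction actually runs.
var constructions = 0

// Hypothetical stand-in for building a RapidsConf and reading
// gpuTargetBatchSizeBytes from it.
def buildConf(): Long = {
  constructions += 1   // expensive parsing/validation would happen here
  1024L * 1024L        // pretend target batch size in bytes
}

// Cached once and reused for every batch, instead of rebuilt per batch.
lazy val targetBatchSize: Long = buildConf()

val batches = Seq(1, 2, 3, 4)
val sizes = batches.map(_ => targetBatchSize)
```

With the per-batch version, `constructions` would equal the batch count; with the cached value it stays at one.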
NOTE: release/26.02 has been created from main. Please retarget your PR to release/26.02 if it should be included in the release.
build
This PR is filed to address some follow-ups for two recent PRs, including:
- Add in the `SpillableTable` to support spilling a cudf `Table` directly, avoiding converting it to a `ColumnarBatch`. This is designed for the text-based read when it does the schema casting, where we do not need a `ColumnarBatch`. `SpillableTableHandle` has almost the same logic as the `SpillableColumnarBatchHandle`; the only difference is that the internal `dev` is a `Table`, not a `ColumnarBatch`. (comment: #13996 (comment))
- Add comments to the `ceilDiv` in pre-split (comment: #14190 (comment))
- Change to get the target batch size once in `GpuGenerateExec`. (comment: #13996 (comment))

Other changes are in tests:
- Move all the common methods to a new trait named `SpillUnitTestBase` to share with its children.
- Add unit tests for the new `SpillableTable`.
- Update `SpillFrameworkSuite` to extend from the new `SpillUnitTestBase`.

---------

Signed-off-by: Firestarman <firestarmanllc@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
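The `SpillableTable` idea from this follow-up can be caricatured with a tiny host/device handle. This toy (`ToySpillableTable`, with `Vector[Int]` payloads) only mimics the state machine — `spill` releases the "device" copy to a "host" copy, and `materialize` rebuilds it on demand, with no `ColumnarBatch` or fake Spark types involved. None of the names or the storage model come from the real spill framework.

```scala
// Toy spillable handle: data lives either "on device" or "on host",
// and can move between the two without losing information.
final class ToySpillableTable(initial: Vector[Int]) {
  private var device: Option[Vector[Int]] = Some(initial)
  private var host: Option[Vector[Int]] = None

  // Release the device copy, keeping a host copy so it can be rebuilt.
  def spill(): Unit = synchronized {
    device.foreach { d => host = Some(d); device = None }
  }

  // Return the device copy, restoring it from host if it was spilled.
  def materialize(): Vector[Int] = synchronized {
    device.getOrElse {
      val d = host.getOrElse(throw new IllegalStateException("data lost"))
      device = Some(d)
      d
    }
  }

  def isSpilled: Boolean = synchronized(device.isEmpty && host.nonEmpty)
}

val t = new ToySpillableTable(Vector(1, 2, 3))
t.spill()
val spilled = t.isSpilled       // true: only the host copy remains
val data = t.materialize()      // restores and returns the device copy
```

The point of the real handle is the same shape: spill and restore a cudf `Table` directly, so no bogus Spark schema is ever attached just to round-trip the data.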
Contributes to #13672
Description
This PR covers 2 cases of device memory allocation with the retry framework, to prevent potential GPU OOM. They are the top cases found by running integration tests with #13995 enabled, covering 50% of the found cases.
Also added test cases for them.
They should be easy ones so I combined them into one PR, but I'm happy to split them.
Checklists
(Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)