Add cudf table packing utilities with memory type support #843
Open: nirandaperera wants to merge 10 commits into rapidsai:main from nirandaperera:better_cudf_pack_usage2
9e41618 init (nirandaperera)
bdf915a adding pack and chunked pack (nirandaperera)
d174514 adding chunked pack for host packing (nirandaperera)
334b5d3 minor (nirandaperera)
977c2cf Merge branch 'main' of github.com:rapidsai/rapidsmpf into better_cudf… (nirandaperera)
00fa4a8 fix test (nirandaperera)
8d4cda7 adding a special case for small tables (nirandaperera)
570d1ff addressing PR comments (nirandaperera)
6b3d0ec Merge branch 'main' of github.com:rapidsai/rapidsmpf into better_cudf… (nirandaperera)
0f474cd Merge branch 'main' into better_cudf_pack_usage2 (nirandaperera)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,150 @@ | ||
| /** | ||
| * SPDX-FileCopyrightText: Copyright (c) 2024-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| * SPDX-License-Identifier: Apache-2.0 | ||
| */ | ||
| #pragma once | ||
| | ||
| #include <memory> | ||
| | ||
| #include <cudf/table/table_view.hpp> | ||
| #include <rmm/cuda_stream_view.hpp> | ||
| | ||
| #include <rapidsmpf/memory/buffer.hpp> | ||
| #include <rapidsmpf/memory/memory_reservation.hpp> | ||
| #include <rapidsmpf/memory/memory_type.hpp> | ||
| #include <rapidsmpf/memory/packed_data.hpp> | ||
| | ||
| namespace rapidsmpf { | ||
| | ||
| /** | ||
| * @brief Pack a cudf table view into a contiguous buffer using chunked packing. | ||
| * | ||
| * This function serializes the given table view into a `PackedData` object | ||
| * using a bounce buffer for chunked transfer. This is useful when packing to | ||
| * host memory to avoid allocating temporary device memory for the entire table. | ||
| * | ||
| * @param table The table view to pack. | ||
| * @param stream CUDA stream used for device memory operations and kernel launches. | ||
| * @param bounce_buffer Device buffer used as intermediate storage during chunked packing. | ||
| * @param pack_temp_mr Temporary memory resource used for packing. | ||
| * @param reservation Memory reservation to use for allocating the packed data buffer. | ||
| * @return A unique pointer to the packed data containing the serialized table. | ||
| * | ||
| * @throws rapidsmpf::reservation_error If the allocation size exceeds the reservation. | ||
| * | ||
| * @see cudf::chunked_pack | ||
| */ | ||
| [[nodiscard]] std::unique_ptr<PackedData> chunked_pack( | ||
| cudf::table_view const& table, | ||
| rmm::cuda_stream_view stream, | ||
| rmm::device_buffer& bounce_buffer, | ||
| rmm::device_async_resource_ref pack_temp_mr, | ||
| MemoryReservation& reservation | ||
| ); | ||
| | ||
| namespace detail { | ||
| | ||
| /** | ||
| * @brief Pack a cudf table view into a contiguous device buffer. | ||
| * | ||
| * Uses cudf::pack(). Returns a `PackedData` with a `Buffer` backed by | ||
| * `rmm::device_buffer`. The memory is allocated using the provided reservation. | ||
| * | ||
| * @param table The table view to pack. | ||
| * @param stream CUDA stream used for device memory operations and kernel launches. | ||
| * @param reservation Device memory reservation. Must have memory type DEVICE. | ||
| * @return A unique pointer to the packed data containing the serialized table. | ||
| * | ||
| * @throws std::invalid_argument If the reservation's memory type is not DEVICE. | ||
| * @throws rapidsmpf::reservation_error If the allocation size exceeds the reservation. | ||
| * | ||
| * @see cudf::pack | ||
| */ | ||
| [[nodiscard]] std::unique_ptr<PackedData> pack_device( | ||
| cudf::table_view const& table, | ||
| rmm::cuda_stream_view stream, | ||
| MemoryReservation& reservation | ||
| ); | ||
| | ||
| /** | ||
| * @brief Pack a cudf table view into a contiguous pinned host buffer. | ||
| * | ||
| * Uses cudf::pack() with a pinned memory resource. Returns a `PackedData` with | ||
| * a `Buffer` backed by a pinned `HostBuffer`. The memory is allocated using | ||
| * the provided reservation. | ||
| * | ||
| * @param table The table view to pack. | ||
| * @param stream CUDA stream used for device memory operations and kernel launches. | ||
| * @param reservation Pinned host memory reservation. Must have memory type PINNED_HOST. | ||
| * @return A unique pointer to the packed data containing the serialized table. | ||
| * | ||
| * @throws std::invalid_argument If the reservation's memory type is not PINNED_HOST. | ||
| * @throws rapidsmpf::reservation_error If the allocation size exceeds the reservation. | ||
| * | ||
| * @see cudf::pack | ||
| */ | ||
| [[nodiscard]] std::unique_ptr<PackedData> pack_pinned_host( | ||
| cudf::table_view const& table, | ||
| rmm::cuda_stream_view stream, | ||
| MemoryReservation& reservation | ||
| ); | ||
| | ||
| /** | ||
| * @brief Pack a cudf table view into a contiguous host buffer. | ||
| * | ||
| * Uses cudf::chunked_pack() with a device bounce buffer when available, | ||
| * otherwise a pinned bounce buffer. Returns a `PackedData` with a `Buffer` | ||
| * backed by a `HostBuffer`. The memory is allocated using the provided reservation. | ||
| * | ||
| * Algorithm: | ||
| * 1. Special case: empty tables return immediately with empty packed data. | ||
| * 2. Fast path for small tables (< 1MB): pack directly on device and copy to host. | ||
| * 3. Estimate the table size (est_size), with a minimum of 1MB. | ||
| * 4. Try to reserve device memory for est_size with overbooking allowed. | ||
| * 5. If available device memory (reservation - overbooking) >= 1MB, | ||
| * use chunked packing with the device bounce buffer. | ||
| * 6. Otherwise, if pinned memory is available, retry with pinned memory (steps 4-5). | ||
| * 7. If all attempts fail, throw an error. | ||
| * | ||
| * @param table The table view to pack. | ||
| * @param stream CUDA stream used for device memory operations and kernel launches. | ||
| * @param reservation Host memory reservation. Must have memory type HOST. | ||
| * @return A unique pointer to the packed data containing the serialized table. | ||
| * | ||
| * @throws std::invalid_argument If the reservation's memory type is not HOST. | ||
| * @throws rapidsmpf::reservation_error If the allocation size exceeds the reservation. | ||
| * | ||
| * @see cudf::chunked_pack | ||
| */ | ||
| [[nodiscard]] std::unique_ptr<PackedData> pack_host( | ||
| cudf::table_view const& table, | ||
| rmm::cuda_stream_view stream, | ||
| MemoryReservation& reservation | ||
| ); | ||
| | ||
| } // namespace detail | ||
| | ||
| /** | ||
| * @brief Pack a cudf table view into a contiguous buffer. | ||
| * | ||
| * This function serializes the given table view into a `PackedData` object | ||
| * with the data buffer residing in the memory type of the provided reservation. | ||
| * The memory for the packed data is allocated using the provided reservation. | ||
| * | ||
| * @param table The table view to pack. | ||
| * @param stream CUDA stream used for device memory operations and kernel launches. | ||
| * @param reservation Memory reservation to use for allocating the packed data buffer. | ||
| * @return A unique pointer to the packed data containing the serialized table. | ||
| * | ||
| * @throws rapidsmpf::reservation_error If the allocation size exceeds the reservation. | ||
| * | ||
| * @see cudf::pack | ||
| */ | ||
| [[nodiscard]] std::unique_ptr<PackedData> pack( | ||
| cudf::table_view const& table, | ||
| rmm::cuda_stream_view stream, | ||
| MemoryReservation& reservation | ||
| ); | ||
| | ||
| } // namespace rapidsmpf | ||
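
The `detail::pack_host` doc comment in the header above lists a seven-step algorithm. The sketch below models only the decision logic of those steps so the branching is easier to review; it is not the PR's implementation. Everything in it is an assumption made for illustration: the names `HostPackStrategy`, `choose_strategy`, `available_after_reserve`, and `kOneMiB` are hypothetical, "1MB" is read as 2^20 bytes, an empty table is modeled as zero bytes, and a reservation with overbooking is modeled as "reserved = est_size, overbooking = whatever exceeds the free bytes".

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>

// Hypothetical model of the pack_host decision steps. None of these names
// exist in rapidsmpf; they only mirror the numbered steps in the doc comment.
enum class HostPackStrategy {
    Empty,               // step 1: empty table, return empty packed data
    SmallTableFastPath,  // step 2: pack on device, then copy to host
    DeviceBounceBuffer,  // step 5: chunked pack through a device bounce buffer
    PinnedBounceBuffer,  // step 6: chunked pack through a pinned bounce buffer
    Fail                 // step 7: no usable bounce buffer, throw
};

// Assumption: "1MB" in the doc comment means 2^20 bytes.
constexpr std::size_t kOneMiB = std::size_t{1} << 20;

// Assumed model of "reservation - overbooking": reserving est_size bytes when
// only free_bytes are free overbooks by the difference.
constexpr std::size_t available_after_reserve(
    std::size_t est_size, std::size_t free_bytes
) {
    std::size_t const overbooking =
        est_size > free_bytes ? est_size - free_bytes : 0;
    return est_size - overbooking;
}

// pinned_free is std::nullopt when pinned memory is not available at all.
constexpr HostPackStrategy choose_strategy(
    std::size_t table_bytes,
    std::size_t device_free,
    std::optional<std::size_t> pinned_free
) {
    if (table_bytes == 0) {
        return HostPackStrategy::Empty;  // step 1
    }
    if (table_bytes < kOneMiB) {
        return HostPackStrategy::SmallTableFastPath;  // step 2
    }
    // Step 3: estimated size, never below 1 MiB.
    std::size_t const est_size = std::max(table_bytes, kOneMiB);
    // Steps 4-5: try device memory first.
    if (available_after_reserve(est_size, device_free) >= kOneMiB) {
        return HostPackStrategy::DeviceBounceBuffer;
    }
    // Step 6: retry steps 4-5 with pinned memory, if there is any.
    if (pinned_free.has_value()
        && available_after_reserve(est_size, *pinned_free) >= kOneMiB) {
        return HostPackStrategy::PinnedBounceBuffer;
    }
    return HostPackStrategy::Fail;  // step 7
}
```

Under this model, an 8 MiB table with only 512 bytes of free device memory and 4 MiB of free pinned memory selects `PinnedBounceBuffer`, while the same table with no pinned memory at all ends in `Fail`.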
question: Why not just return a `std::array` of the two memory types?
I wanted to make sure that if we meddle with the `MEMORY_TYPES` array in the future, we fail statically if something's not right.
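
For readers without the rest of the diff, the exchange above can be illustrated roughly as follows. This is a sketch of the general idea (tie a helper to the layout of a global array with `static_assert`), not the code under review. The enum values follow the memory types named in the header's doc comments, but the contents and order of `MEMORY_TYPES` here, and the helper name `bounce_buffer_memory_types`, are assumptions.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-ins for the real rapidsmpf definitions.
enum class MemoryType { DEVICE, PINNED_HOST, HOST };

// Assumed contents and order; the real array may differ.
constexpr std::array<MemoryType, 3> MEMORY_TYPES{
    MemoryType::DEVICE, MemoryType::PINNED_HOST, MemoryType::HOST};

// Hypothetical helper returning the two memory types of interest. It reads
// them from MEMORY_TYPES instead of hard-coding a std::array, and the
// static_asserts stop compilation if the array is later reordered or resized.
constexpr std::array<MemoryType, 2> bounce_buffer_memory_types() {
    static_assert(
        MEMORY_TYPES.size() == 3, "MEMORY_TYPES changed; revisit this helper"
    );
    static_assert(
        MEMORY_TYPES[0] == MemoryType::DEVICE, "expected DEVICE at index 0"
    );
    static_assert(
        MEMORY_TYPES[1] == MemoryType::PINNED_HOST,
        "expected PINNED_HOST at index 1"
    );
    return {MEMORY_TYPES[0], MEMORY_TYPES[1]};
}
```

Returning a literal `std::array` of the two types would be shorter, as the question suggests; the trade-off in this sketch is that a later edit to `MEMORY_TYPES` becomes a compile error rather than a silent mismatch.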