Memory model changes preliminary to CPU-host remote solve #807
Closed
Changes from all commits (22 commits, all by tmckayus):
e3bd4f2  Add memory-model handling for LP/MIP solutions
710e2fa  LP/MIP host memory for remote solves and stubs
3d28ef7  python host solutions and lazy import in init
3ec5380  add memory model summary
8e9acc3  fix missing import in linear programming __init__.pyw
0eb63c7  resolve circular dependency likely exposted by lazy init
f4562b7  fix missing variable types for quadratic problems
7c11efd  add ci fixes for empty mip problems and circular python imports
fd178e6  check_style fix for copyright
a140c20  add fixes for CI tests (emmpty problems)
6bc8419  Add comprehensive unit tests for memory model changes
780ac29  copyright check_style issues
249d8e2  implement CodeRabbit suggestions on the PR
e602560  Address remaining CodeRabbit suggestions for memory model
0e70c61  Remove auto-generated MANIFEST.in build artifact
3475ced  Fix errors introduced in update to latest release/26.02
f4c73c3  additional updates to memory model changes suggested by CodeRabbit
649191b  Consolidate data model conversion functions and remove unused code
84c081f  Refactor solve helpers to use data_model_view_t and add validation gu…
a032ec6  fix check_style errors and address coderabbit comments
73c899a  Refactor C API to reuse cpu_problem_data_t, eliminating duplicate code
ab14fb0  Add missing include for cuopt_assert macro
New file (78 lines):
# Memory Model Summary

This document describes how memory is handled for **local** vs **remote** solves.

## Core idea

The solver now supports **two memory modes** for problem input and solution output:

- **Device (GPU) memory**: used for local solves.
- **Host (CPU) memory**: used for remote solves.

A non-owning `data_model_view_t` (host or device) is the entry point that drives the path:

- Device view → local GPU solve
- Host view + `CUOPT_REMOTE_*` → remote solve path
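To make the routing concrete, here is a minimal, self-contained C++ sketch of the decision matrix implied above. The `solve_path_t` enum and `choose_path()` helper are illustrative names introduced here, not part of the cuopt API; only the environment variables and the host/device distinction come from this document.

```cpp
// Minimal, self-contained sketch of the routing described above.
// solve_path_t and choose_path() are illustrative names, not cuopt API.
#include <cstdlib>

enum class solve_path_t {
  local_device,        // device view, no remote config  -> solve directly on the GPU
  local_from_host,     // host view,   no remote config  -> copy CPU -> GPU, solve locally
  remote_host,         // host view  + CUOPT_REMOTE_*    -> serialize and solve remotely
  remote_from_device   // device view + CUOPT_REMOTE_*   -> copy GPU -> CPU, then remote
};

inline solve_path_t choose_path(bool view_is_device_memory)
{
  const bool remote = std::getenv("CUOPT_REMOTE_HOST") != nullptr &&
                      std::getenv("CUOPT_REMOTE_PORT") != nullptr;
  if (remote) {
    return view_is_device_memory ? solve_path_t::remote_from_device
                                 : solve_path_t::remote_host;
  }
  return view_is_device_memory ? solve_path_t::local_device
                               : solve_path_t::local_from_host;
}
```

A device-backed view under a remote configuration is copied to the CPU first, as described in the remote-solve section below.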
## Local solve (GPU memory)

**C++ entry points** (`solve_lp`, `solve_mip` in `cpp/src/linear_programming/solve.cu` and
`cpp/src/mip/solve.cu`) behave as follows:

1. If the view is in device memory, it is converted into an `optimization_problem_t` and
   solved locally.
2. If the view is in **host memory**, the data is copied **CPU → GPU** and solved locally.
   This path requires a valid `raft::handle_t`.
3. Solutions are returned in **device memory**, and wrappers expose device buffers.
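As a hedged sketch of step 2 (host view, local solve), the helper below copies one host array to the GPU with raw CUDA runtime calls. The real implementation allocates through RMM on the `raft::handle_t` stream and builds an `optimization_problem_t`; `copy_to_device` is purely illustrative.

```cpp
// Illustrative CPU -> GPU copy for the host-view local-solve path.
// The real code uses RMM containers and the raft::handle_t stream,
// not raw cudaMalloc/cudaMemcpy.
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

template <typename T>
T* copy_to_device(const std::vector<T>& host)
{
  T* dev = nullptr;
  cudaMalloc(&dev, host.size() * sizeof(T));          // device allocation
  cudaMemcpy(dev, host.data(), host.size() * sizeof(T),
             cudaMemcpyHostToDevice);                  // CPU -> GPU copy
  return dev;  // caller owns; the real code uses RAII device containers
}
```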
**Python / Cython (`python/cuopt/.../solver_wrapper.pyx`)**:
- `DataModel` is **host-only**: the Cython wrapper accepts `np.ndarray` inputs and raises
  if GPU-backed objects are provided.
- For local solves, host data is **copied to GPU** when building the
  `optimization_problem_t` (requires a valid `raft::handle_t`).
- The solution is wrapped into `rmm::device_buffer` and converted to NumPy arrays via
  `series_from_buf`.

**CLI (`cpp/cuopt_cli.cpp`)**:
- Initializes CUDA/RMM for local solve paths.
- Uses `raft::handle_t` and GPU memory as usual.
## Remote solve (CPU memory)

Remote solve is enabled when **both** `CUOPT_REMOTE_HOST` and `CUOPT_REMOTE_PORT` are set.
This is detected early in the solve path.

**C++ entry points**:
- `solve_lp` / `solve_mip` check `get_remote_solve_config()` first.
- If input data is on **GPU** and remote is enabled, it is copied to CPU for serialization.
- If input data is already on **CPU**, it is passed directly to `solve_*_remote`.
- Remote solve returns **host vectors** and sets `is_device_memory = false`.
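For the "GPU input with remote enabled" case, the copy runs the other way: device arrays are brought back to host memory so they can be serialized. A minimal illustrative helper, again using raw CUDA runtime calls instead of the library's RMM utilities:

```cpp
// Illustrative GPU -> CPU copy before serialization for remote solve.
// The real code copies via the raft::handle_t stream / RMM, not raw cudaMemcpy.
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

template <typename T>
std::vector<T> copy_to_host(const T* device_ptr, std::size_t n)
{
  std::vector<T> host(n);
  cudaMemcpy(host.data(), device_ptr, n * sizeof(T), cudaMemcpyDeviceToHost);
  return host;  // host vector can now be serialized and sent to the remote service
}
```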
**Remote stub implementation** (`cpp/include/cuopt/linear_programming/utilities/remote_solve.hpp`):
- Returns **dummy host solutions** (all zeros).
- Sets termination stats to **finite values** (no NaNs) for predictable output.
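A sketch of the stub output shape described above: zero-filled host vectors plus finite termination statistics. The struct and field names are hypothetical stand-ins, not the actual cuopt solution types.

```cpp
// Illustrative stub output: all-zero host solutions with finite (non-NaN) stats.
#include <cstddef>
#include <vector>

struct stub_termination_stats_t {   // hypothetical stand-in for the real stats type
  double objective  = 0.0;
  double gap        = 0.0;          // finite, never NaN
  int    iterations = 0;
};

struct stub_lp_solution_t {         // hypothetical stand-in for the real solution type
  std::vector<double> primal;       // zero-filled, one entry per variable
  std::vector<double> dual;         // zero-filled, one entry per constraint
  std::vector<double> reduced_costs;
  stub_termination_stats_t stats;
  bool is_device_memory = false;    // remote path always returns host memory
};

inline stub_lp_solution_t make_stub_solution(std::size_t n_vars, std::size_t n_constraints)
{
  return stub_lp_solution_t{std::vector<double>(n_vars, 0.0),
                            std::vector<double>(n_constraints, 0.0),
                            std::vector<double>(n_vars, 0.0),
                            {},
                            false};
}
```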
**Python / Cython (`python/cuopt/.../solver_wrapper.pyx`)**:
- **Input handling**: builds a `data_model_view_t` from the Python `DataModel` before calling C++.
- **Solution handling**: for remote solves, the solution is in **host memory**, so NumPy arrays
  are built directly from host vectors, **avoiding `rmm::device_buffer`** (no CUDA).
**CLI (`cpp/cuopt_cli.cpp`)**:
- Detects remote solve **before** any CUDA initialization.
- Skips `raft::handle_t` creation and GPU setup when remote is enabled.
- Builds the problem in **host memory** for remote solves.
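The important detail for the CLI is ordering: remote detection happens before any CUDA work. A rough control-flow sketch, with placeholder helpers standing in for the real `cuopt_cli` functions:

```cpp
// Order-of-operations sketch for the CLI: remote detection precedes CUDA/RMM setup.
// run_remote_solve()/run_local_solve() are placeholders, not actual cuopt_cli helpers.
#include <cstdlib>

int cli_flow_sketch()
{
  const bool remote = std::getenv("CUOPT_REMOTE_HOST") != nullptr &&
                      std::getenv("CUOPT_REMOTE_PORT") != nullptr;
  if (remote) {
    // No raft::handle_t, no RMM pool: the problem is built in host memory
    // and handed to the remote path.
    return 0;  // return run_remote_solve();
  }
  // Local path only: initialize CUDA/RMM, create a raft::handle_t, solve on the GPU.
  return 0;  // return run_local_solve();
}
```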
## Batch solve

Batch solve uses the same memory model:

- **Local batch**: GPU memory, with CUDA resources and PDLP/dual simplex paths.
- **Remote batch**: each problem is routed through `solve_lp_remote` or `solve_mip_remote`
  and returns host data. If inputs are already on GPU, they are copied to host first.

## Expected outputs for remote stubs

- Termination status: `Optimal`
- Objective values: `0.0`
- Primal/dual/reduced-cost vectors: zero-filled host arrays

This is useful for verifying the **CPU-only data path** without a remote service.
New file (58 lines): data_model_view.hpp
/* clang-format off */
/*
 * SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 * SPDX-License-Identifier: Apache-2.0
 */
/* clang-format on */

#pragma once

/**
 * @file data_model_view.hpp
 * @brief Provides data_model_view_t in the cuopt::linear_programming namespace.
 *
 * This header provides access to the data_model_view_t class, a non-owning view
 * over LP/MIP problem data. The view uses span<T> to hold pointers that can
 * reference either host or device memory, making it suitable for both local
 * GPU-based solves and remote CPU-based solves.
 *
 * The canonical implementation lives in cuopt::mps_parser for historical reasons
 * and to maintain mps_parser as a standalone library. This header provides
 * convenient aliases in the cuopt::linear_programming namespace.
 */

#include <mps_parser/data_model_view.hpp>
#include <mps_parser/utilities/span.hpp>

namespace cuopt::linear_programming {

/**
 * @brief Non-owning span type that can point to either host or device memory.
 *
 * This is an alias to the span type defined in mps_parser. The span holds
 * a pointer and size, but does not own the underlying memory.
 *
 * @tparam T Element type
 */
template <typename T>
using span = cuopt::mps_parser::span<T>;

/**
 * @brief Non-owning view of LP/MIP problem data.
 *
 * This is an alias to the data_model_view_t defined in mps_parser.
 * The view stores problem data (constraint matrix, bounds, objective, etc.)
 * as span<T> members, which can point to either host or device memory.
 *
 * Key features for remote solve support:
 * - Non-owning: does not allocate or free memory
 * - Memory-agnostic: spans can point to host OR device memory
 * - Serializable: host data can be directly serialized for remote solve
 *
 * @tparam i_t Integer type for indices (typically int)
 * @tparam f_t Floating point type for values (typically float or double)
 */
template <typename i_t, typename f_t>
using data_model_view_t = cuopt::mps_parser::data_model_view_t<i_t, f_t>;

}  // namespace cuopt::linear_programming
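Because these are plain alias templates, the aliased types are identical to the `mps_parser` types, so solver code and the MPS parser share a single view type without conversion. A small compile-time check illustrating this; the include path is assumed from the namespace layout and may differ in the actual tree:

```cpp
// Compile-time check that the aliases resolve to the mps_parser types.
// The include path is an assumption; use the project's actual header location.
#include <cuopt/linear_programming/data_model_view.hpp>
#include <type_traits>

static_assert(std::is_same_v<cuopt::linear_programming::data_model_view_t<int, double>,
                             cuopt::mps_parser::data_model_view_t<int, double>>,
              "alias must resolve to the mps_parser view type");
static_assert(std::is_same_v<cuopt::linear_programming::span<double>,
                             cuopt::mps_parser::span<double>>,
              "alias must resolve to the mps_parser span type");
```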
Review comment:
Instead of this alias, the data_model_view_t could be moved to a small common library shared by mps parser and libcuopt. Currently mps parser is also statically linked into libcuopt to avoid runtime path issues. A small common library would eliminate that too, but that could be done as a second pass.