WIP: Add a new remote execution system that is integrated in the cuopt engine #769
tmckayus wants to merge 39 commits into NVIDIA:release/26.02 from
Conversation
This adds a header that provides convenient aliases for data_model_view_t and span<T> in the cuopt::linear_programming namespace. The canonical implementation remains in cuopt::mps_parser to maintain mps_parser as a standalone library.
- Refactored problem_and_stream_view_t to use unique_ptr for both CPU and GPU storage
- Check is_remote_solve_enabled() at problem creation time
- Local solve: store problem directly in GPU memory (optimization_problem_t)
- Remote solve: store problem in CPU memory (problem_cpu_data_t)
- cuOptSolve routes to the appropriate path based on the is_remote_configured flag
- Updated all cuOptGet* functions to read from the correct storage
- No memory overhead: only one storage type is allocated per problem
- Maintains backward compatibility with the existing C API
- Add is_device_memory field to data_model_view_t to track memory location
- Simplify solve_lp/solve_mip overloads: mps_data_model_t now creates a view and delegates
- C API checks remote config at creation time: GPU memory for local, CPU for remote
- Handle edge case: GPU data + remote config copies to CPU with a warning
- Add white-box tests for all local/remote + CPU/GPU combinations
- Modify cython_solve.cu to check for remote solve configuration early
- When remote is enabled, mark data_model as CPU memory and call solve_lp/solve_mip with data_model_view_t overloads
- Fix segfault by initializing warm start data to empty device buffers instead of nullptr when returning from the remote solve path
- Local solve path unchanged - still creates a GPU optimization_problem_t
…emote solve
- Add is_device_memory flag to optimization_problem_solution_t and mip_solution_t
- Use std::unique_ptr for both CPU (std::vector) and GPU (rmm::device_uvector) storage
- Add _host() getters for C++ users to access CPU-resident solution data
- Update cython_solve.cu to populate CPU or GPU storage based on solve type
- Update the Python wrapper to transparently handle both CPU and GPU solution data
- Backward compatible: existing GPU-based code unchanged, Python users need no changes
- Check is_remote_solve_enabled() before any CUDA initialization in the CLI
- Skip raft::handle creation and memory resource setup for remote solve
- Use CPU-only solution constructors when remote solve returns an error
- Handle a nullptr handle_ptr gracefully in solve_lp/solve_mip
- Enables the CLI to run on hosts without a GPU when remote solve is configured
- Make raft::handle_t in problem_and_stream_view_t lazy (unique_ptr)
- Make stream_view in solution_and_stream_view_t optional
- Check is_remote_solve_enabled() before creating a handle in cuOptSolve
- Update cuOptGetPrimalSolution/DualSolution/ReducedCosts to handle CPU data
- Enables the C API to run on hosts without a GPU when remote solve is configured
- cython_solve.cu: check remote solve before CUDA stream/handle creation
  - Remote solve path passes a nullptr handle to solve_lp/solve_mip
  - Local solve path creates CUDA resources only when needed
- cuopt/__init__.py: make linear_programming and routing imports lazy
  - Uses __getattr__ for deferred module loading
  - Allows 'import cuopt' on CPU-only hosts with remote solve
- solver_wrapper.pyx: lazy CUDA imports and remote solve detection
  - Remove eager imports of cupy, cudf, numba.cuda
  - Add is_remote_solve_enabled() Python function
  - Add _get_cuda_imports() for lazy module loading
  - Update create_solution() to check is_device_memory_ for warm start
- utilities/__init__.py: lazy imports for CUDA-dependent utilities
  - type_cast, series_from_buf, check_solution via __getattr__
- utilities/utils.py: lazy cudf/pylibcudf imports in series_from_buf

Enables Python programs to run on CPU-only hosts when remote solve is configured via CUOPT_REMOTE_HOST and CUOPT_REMOTE_PORT.
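The detection described above can be sketched in a few lines. This is a minimal illustration, not the actual cuOpt implementation; only the environment variable names CUOPT_REMOTE_HOST and CUOPT_REMOTE_PORT come from this PR, and the validation details are assumptions.

```python
import os

def is_remote_solve_enabled() -> bool:
    """Sketch: remote solve is considered configured when both
    CUOPT_REMOTE_HOST and CUOPT_REMOTE_PORT are set (port numeric)."""
    host = os.environ.get("CUOPT_REMOTE_HOST")
    port = os.environ.get("CUOPT_REMOTE_PORT")
    return bool(host) and bool(port) and port.isdigit()
```

Checking this flag before any CUDA import or handle creation is what lets `import cuopt` succeed on CPU-only hosts.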
Features:
- Remote solve server (cuopt_remote_server) with protobuf serialization
- Real-time log streaming from server to client via stdout capture
- Transparent remote solve via CUOPT_REMOTE_HOST/PORT environment variables
- Pluggable serialization interface (remote_serialization.hpp)
- Static linking of libmps_parser into libcuopt for simpler deployment

Components:
- cuopt_remote_server.cpp: server with streaming protocol
- cuopt_remote.proto: Protocol Buffers schema for LP/MIP problems
- protobuf_serializer.cu: serialization implementation
- remote_solve.cu: client-side remote solve with streaming support

Build:
- Added BUILD_REMOTE_SERVER cmake option
- Added cuopt_remote_server target to build.sh
- Added mps_parser_static library for static linking

The server captures stdout during solve and streams it to clients while also echoing to the server console (for docker logs, monitoring, etc.).
When users set constraints using set_row_types() + set_constraint_bounds() instead of set_constraint_lower/upper_bounds(), the serializer now computes the constraint lower/upper bounds from the row types:
- 'E' (equality): lb = ub = rhs
- 'L' (less-than): lb = -inf, ub = rhs
- 'G' (greater-than): lb = rhs, ub = inf
- 'N' (free): lb = -inf, ub = inf

Also fixed the server to always use the streaming protocol for solutions (when --no-stream is set, only log streaming is disabled).
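The mapping above is mechanical, so it can be expressed directly. This is an illustrative sketch (the function name is hypothetical; the real change lives in the C++ serializer), but the bound values follow exactly the table in the commit message:

```python
import math

def bounds_from_row_types(row_types, rhs):
    """Derive constraint lower/upper bounds from MPS row types.
    row_types: iterable of 'E'/'L'/'G'/'N'; rhs: right-hand-side values."""
    lb, ub = [], []
    inf = math.inf
    for t, b in zip(row_types, rhs):
        if t == "E":      # equality: lb = ub = rhs
            lb.append(b); ub.append(b)
        elif t == "L":    # less-than: lb = -inf, ub = rhs
            lb.append(-inf); ub.append(b)
        elif t == "G":    # greater-than: lb = rhs, ub = inf
            lb.append(b); ub.append(inf)
        elif t == "N":    # free row: unbounded on both sides
            lb.append(-inf); ub.append(inf)
        else:
            raise ValueError(f"unknown row type {t!r}")
    return lb, ub
```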
- Integrate solve_mip_remote() into the solve_mip() dispatch
- Serialize variable types (integer/binary) to protobuf is_integer/is_binary fields
- Deserialize variable types on the server side
- Fix is_mip_request() to check header.problem_type for reliable detection

MIP remote solve now correctly handles integer and binary variables.
- Add to_host() method to LP/MIP solution classes for GPU->CPU copy
- Server calls to_host() before serialization, simplifying the serializer implementation
- Add dynamic serializer loading via the CUOPT_SERIALIZER_LIB env var
- Add MsgPack serializer plugin as a proof of concept
- Add serialization plugin development guide
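The plugin-loading idea is: if an env var names a shared library, load it instead of the built-in serializer. A minimal sketch of that pattern (the actual loading happens in C++ via dlopen; only the CUOPT_SERIALIZER_LIB variable name comes from this PR, and everything else here is illustrative):

```python
import ctypes
import os

def load_serializer():
    """Return a ctypes handle to a custom serializer library if the
    CUOPT_SERIALIZER_LIB env var is set, else None (use the built-in
    protobuf serializer). Symbol lookup/registration is omitted."""
    path = os.environ.get("CUOPT_SERIALIZER_LIB")
    if not path:
        return None  # fall back to the default serializer
    return ctypes.CDLL(path)  # raises OSError if the library is missing
```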
Features:
- Async job management: SUBMIT_JOB, CHECK_STATUS, GET_RESULT, DELETE_RESULT, GET_LOGS
- Worker processes with shared memory job queues (fork-based, not threads)
- File-based log streaming for async mode (matches Python server architecture)
- Per-job log files: /tmp/cuopt_logs/log_{job_id}
- Incremental reads via byte offset (frombyte parameter)
- Automatic cleanup on job deletion
- Sync mode still uses real-time socket streaming
- CUOPT_REMOTE_USE_SYNC=1 env var to force sync mode
Protocol additions:
- RequestType.GET_LOGS for log retrieval
- AsyncRequest.frombyte for incremental log reads
- LogsResponse with log_lines, nbytes, job_exists
Client modes:
- Sync (blocking=true): Real-time log streaming over socket
- Async (blocking=false): Submit -> Poll (logs + status) -> Get -> Delete
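The incremental log retrieval above (per-job log files plus a `frombyte` offset) amounts to "seek, read to EOF, return the new offset". A minimal sketch, assuming the server-side helper looks roughly like this (function name is illustrative):

```python
def read_log_increment(path, frombyte):
    """Read log bytes starting at `frombyte`; return (text, next_offset).
    A missing file (job not started or already cleaned up) yields no data."""
    try:
        with open(path, "rb") as f:
            f.seek(frombyte)
            data = f.read()
    except FileNotFoundError:
        return "", frombyte
    return data.decode(errors="replace"), frombyte + len(data)
```

The client polls with the last returned offset, so each GET_LOGS call transfers only the new bytes.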
Server (cuopt_remote_server.cpp):
- Add worker process monitoring with waitpid
- Automatically restart dead workers
- Mark in-progress jobs as FAILED when a worker dies
- Add flush calls to monitoring log messages

Msgpack serializer:
- Implement full async interface support
- Add async message types (MSG_ASYNC_LP/MIP_REQUEST, etc.)
- Implement server-side methods: is_async_request, is_blocking_request, extract_problem_data, get_async_request_type, get_job_id, get_frombyte
- Implement response serialization: serialize_submit_response, serialize_status_response, serialize_result_response, serialize_delete_response, serialize_logs_response
- Fix is_mip_request to detect both sync and async MIP formats

Tested with LP (afiro.mps) and MIP (flugpl.mps) using msgpack async.
- docs/developer/REMOTE_SOLVE_GUIDE.md: comprehensive guide covering server setup, client configuration, async/sync modes, job lifecycle, custom serialization, worker monitoring, and troubleshooting
- docs/developer/SOLUTION_MEMORY_ARCHITECTURE.md: documents the hybrid GPU/CPU memory strategy for solutions, including the to_host() method, accessor patterns, and design rationale

Deleted 56 obsolete markdown files from the root directory that contained outdated information from previous development iterations.
Features:
- CANCEL_JOB: cancel queued jobs (marked cancelled, workers skip them) or running jobs (worker killed via SIGKILL, then auto-restarted)
- WAIT_FOR_RESULT: blocking wait using condition variables (no polling); the server thread sleeps until the job completes, fails, or is cancelled
- cuopt_cancel_job: standalone CLI utility to cancel jobs by ID

Protocol changes:
- Added CANCEL_JOB (type 5) and WAIT_FOR_RESULT (type 6) request types
- Added CANCELLED (5) to the JobStatus enum
- Added CancelResponse message to the protobuf schema

Server changes:
- Added cancelled flag to the shared memory JobQueueEntry
- Workers check the cancelled flag before processing claimed jobs
- Cancel handler kills the worker process for running jobs
- wait_for_result() uses JobWaiter condition variables
- Fixed check_job_status() to detect the PROCESSING state
- Thread-per-connection model (already present) enables blocking waits

Serialization:
- Added cancel methods to the remote_serialization.hpp base class
- Implemented in both the protobuf and msgpack serializers
- Added MSG_WAIT_REQUEST (17) to the msgpack message types

Documentation:
- Added docs/remote_solve_architecture.md for security review
- Covers protocol, process model, data flow, and security considerations
C++ API:
- Added cancel_job_remote() function to remote_solve.hpp/cu
- Added remote_job_status_t enum and cancel_job_result_t struct
- Simplified the cuopt_cancel_job utility to use the new API (194→105 lines)

Python API:
- Added cuopt.linear_programming.remote module
- cancel_job() function for cancelling remote jobs
- JobStatus enum and CancelResult dataclass
- Exported from the cuopt.linear_programming package

All three APIs tested and verified working:
- Raw protobuf cancel (baseline)
- C++ utility cancel (cuopt_cancel_job)
- Python API cancel (cancel_job)
Server-side:
- All jobs (blocking and non-blocking) now go through the shared memory job queue and are processed by worker processes
- For blocking=true requests, the server submits to the queue, waits on a condition variable until completion, then auto-deletes and returns
- This enables cancellation for 'sync' jobs and allows concurrent solves
- Single code path for all job processing

Client-side:
- Updated solve_lp_remote and solve_mip_remote to use the unified response format (result_response) for blocking mode
- Removed the streaming log dependency for blocking mode

Benefits:
- Cancellation works for all jobs (blocking and non-blocking)
- Multiple concurrent 'sync' solves are possible (one per worker)
- Simpler code with a single processing path
- Worker isolation (one job failure doesn't crash the server)
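The "wait on a condition variable until completion" pattern mentioned above is the core of WAIT_FOR_RESULT. A minimal sketch of a JobWaiter-style object (the real implementation is C++; this Python version only illustrates the synchronization shape, and the names mirror those in the commit messages):

```python
import threading

class JobWaiter:
    """One per job: a worker calls complete(); the server thread handling
    a blocking request sleeps in wait() instead of polling job status."""
    def __init__(self):
        self._cond = threading.Condition()
        self._done = False
        self.result = None

    def complete(self, result):
        with self._cond:
            self.result = result
            self._done = True
            self._cond.notify_all()   # wake every thread blocked in wait()

    def wait(self, timeout=None):
        with self._cond:
            # wait_for re-checks the predicate, so spurious wakeups are safe
            self._cond.wait_for(lambda: self._done, timeout=timeout)
            return self.result
```

Because the waiter wakes on completion, failure, or cancellation alike, the same mechanism serves all three exit paths.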
- Rename protobuf OptimizationProblem fields to match data_model_view_t:
  * constraint_matrix_values -> A_values (matches A_)
  * constraint_matrix_indices -> A_indices (matches A_indices_)
  * constraint_matrix_offsets -> A_offsets (matches A_offsets_)
  * objective_coefficients -> c (matches c_)
  * constraint_bounds -> b (matches b_)
  * is_integer/is_binary -> variable_types (bytes, matches variable_types_)
- Add missing fields to protobuf and msgpack serialization:
  * initial_primal_solution - warm start initial primal values
  * initial_dual_solution - warm start initial dual values
  * Q_values, Q_indices, Q_offsets - quadratic objective matrix for QPS
- Update the msgpack serializer to use identical field names as protobuf
- Update protobuf_serializer.cu serialization/deserialization
- Both serializers now handle all data_model_view_t fields
- Rename A_values -> A (matches data_model_view_t::A_)
- Remove the unused problem_category field and enum
- Rename PDLPSolverMode enum values to match pdlp_solver_mode_t: PDLP_MODE_STABLE1 -> Stable1, etc.
- Rename LPMethod enum values to match method_t: METHOD_CONCURRENT -> Concurrent, METHOD_BARRIER -> Barrier, etc.
- Update protobuf_serializer.cu for the new enum names
- Update msgpack_serializer.cpp for the A field rename
- Remove redundant n_rows, n_cols, nnz fields (sizes are inferred from arrays)
- Add variable_names and row_names fields (matching protobuf)
- Msgpack and protobuf now use identical field names for OptimizationProblem
Shared memory changes:
- Replace fixed 64MB/128MB embedded data arrays with per-job shared memory
- JobQueueEntry and ResultQueueEntry now store shm_data_name instead of data[]
- Create /cuopt_job_<id> and /cuopt_result_<id> segments on demand
- Clean up segments after job completion or on error
- Change data_size from uint32_t to uint64_t for >4GB support

Wire protocol changes:
- Change the message size from uint32_t (4GB max) to uint64_t
- Update both the server (cuopt_remote_server.cpp) and client (remote_solve.cu)
- Increase the sanity check limit from 100MB to 16GB

Benefits:
- No fixed size limits - problems are limited only by system RAM
- No wasted memory for small problems
- Queue entries are now ~256 bytes instead of 64MB+
- Supports very large LP/MIP problems
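The wire-protocol change is a classic length-prefixed framing tweak: widening the size prefix from 4 to 8 bytes lifts the 4 GB ceiling. A sketch of the framing (illustrative only; the real code is C++, and the little-endian byte order here is an assumption, not taken from the PR):

```python
import struct

def frame_message(payload: bytes) -> bytes:
    """Prefix payload with an 8-byte unsigned length (uint64).
    A uint32 prefix would cap messages at 4 GB."""
    return struct.pack("<Q", len(payload)) + payload

def unframe_message(buf: bytes) -> bytes:
    """Inverse of frame_message: read the uint64 length, return the payload."""
    (size,) = struct.unpack_from("<Q", buf, 0)
    return buf[8:8 + size]
```

Both sides must agree on the width; mixing a 4-byte writer with an 8-byte reader (or vice versa) corrupts the stream, which is why the server and client were updated together.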
- Add is_mip parameter to serialize_result_response to properly distinguish LP vs MIP solutions (protobuf parsing is lenient and was misinterpreting MIP results as LP)
- Add get_job_is_mip helper function to retrieve the job type from the tracker
- Change job_id fields from string to bytes in the protobuf schema to avoid spurious UTF-8 validation warnings when parsing wrong message types
- Update all serialize_result_response call sites to pass the is_mip flag

Both LP and MIP remote solve now work correctly with the Python Problem class.
Pipes are container-friendly (no /dev/shm size limits or volume mounts needed). Shared memory may be slightly faster for very large problems.

Changes:
- Add WorkerPipes struct with pipe file descriptors
- Add worker_index and data_sent fields to JobQueueEntry for pipe sync
- Add worker_index field to ResultQueueEntry for pipe mode
- Add pending_job_data map for storing data before a worker claims the job
- Add pipe I/O helpers: write_to_pipe, read_from_pipe, send_job_data_pipe, recv_job_data_pipe, send_result_pipe, recv_result_pipe
- Modify spawn_workers to create pipes before forking
- Modify spawn_single_worker to recreate pipes when restarting workers
- Modify worker_process to read job data from pipe or shm based on mode
- Modify result_retrieval_thread to send job data to workers (pipe mode) and read results from pipes
- Add --use-shm command line option
- Display the IPC mode in the server startup message

Tested: both pipe mode (default) and shm mode work correctly with LP and MIP remote solve via the Python Problem class.
When a running job is cancelled, the worker is killed with SIGKILL, which closes its pipe ends. If the server is in the middle of writing to the worker's input pipe, it receives SIGPIPE, which would terminate the server. By ignoring SIGPIPE, the write() call instead returns -1 with errno=EPIPE, which is handled gracefully (logged as 'Failed to send job data').

Tested: both pipe mode and shm mode work correctly with job cancellation and worker restart.
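The SIGPIPE-to-EPIPE conversion described above looks like this in miniature. This is an illustrative POSIX sketch, not the server's C++ code (CPython already ignores SIGPIPE by default; the explicit call mirrors what a C program must do):

```python
import errno
import os
import signal

# Without this, the default SIGPIPE action kills the whole process
# mid-write when the reader end has been closed (e.g. worker SIGKILLed).
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

def safe_write(fd, data):
    """Write to a pipe; return False instead of dying if the reader is gone."""
    try:
        os.write(fd, data)
        return True
    except OSError as e:
        if e.errno == errno.EPIPE:
            return False   # handled gracefully, e.g. "Failed to send job data"
        raise
```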
These scripts are for manual testing during development and won't be run by CI (pytest looks in specific directories like python/cuopt/).
- test_pipe_cancel_restart.py: tests job cancellation and worker restart
- test_python_problem_remote.py: tests the Python Problem class with remote solve
Move protobuf::libprotobuf from PUBLIC to PRIVATE linking for both cuopt and cuopt_remote_server libraries. This fixes Python wheel build failures where protobuf was incorrectly exposed as a public dependency.
Remove the unnecessary data_sent flag wait loop that caused a 30s timeout. The pipe read operation naturally blocks until data is available, so explicit polling of the data_sent flag is redundant and can cause the worker to time out before the server writes data. This fixes the 'Timeout waiting for job data' error when using pipe-based IPC with large problems.

Bug: worker would time out after 30 seconds waiting for the data_sent flag
Fix: worker now directly reads from the pipe (blocking read)
Result: data transfer happens as soon as the server writes to the pipe
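The fix rests on a property of pipes worth making concrete: a blocking read() already synchronizes reader and writer, so a separate "data ready" flag adds nothing. A small demonstration (illustrative, using threads rather than forked workers):

```python
import os
import threading
import time

def demo_blocking_pipe_read():
    """Reader blocks in os.read() until the writer sends; no flag polling."""
    r, w = os.pipe()
    received = {}

    def worker():
        # This blocks until the server writes -- the pipe itself is the
        # synchronization point, so no data_sent flag or timeout loop is needed.
        received["data"] = os.read(r, 1024)

    t = threading.Thread(target=worker)
    t.start()
    time.sleep(0.05)            # simulate server-side delay before sending
    os.write(w, b"job-data")    # reader wakes up as soon as this lands
    t.join()
    os.close(r)
    os.close(w)
    return received["data"]
```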
Document all three operating modes (Sync, Async, WAIT_FOR_RESULT) with:
- Detailed workflow descriptions and use cases
- Log retrieval methods for each mode
- Complete API endpoint reference
- Python usage examples and best practices
- Architecture diagrams for all modes
- Future API design recommendations

This consolidates the remote solve protocol documentation into a single reference for users and developers.
- Fix send_typed_message to write an 8-byte payload size (was incorrectly writing 4 bytes)
- Fix pipe mode deadlock: set result.ready=true BEFORE writing to the pipe (prevents deadlock when the result size exceeds the ~64KB pipe buffer)
- Add a 2-minute poll() timeout in read_from_pipe() for safety if a worker dies
- Add #include <poll.h> for the poll() function
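The poll()-before-read safety timeout keeps a dead worker from blocking the server forever, while a live-but-slow worker still works. A sketch of the pattern (illustrative; the 2-minute default mirrors the commit message, everything else is an assumption):

```python
import os
import select

def read_with_timeout(fd, nbytes, timeout_s=120.0):
    """Wait up to timeout_s for data on fd, then read.
    Returns None on timeout (e.g. the worker died before writing)."""
    p = select.poll()
    p.register(fd, select.POLLIN)
    events = p.poll(int(timeout_s * 1000))  # poll() takes milliseconds
    if not events:
        return None                          # caller treats this as a dead peer
    return os.read(fd, nbytes)
```

Unlike the removed data_sent polling loop, this timeout only fires in the failure case; in the normal case poll() returns as soon as data arrives.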
/ok to test afd25ed
- Add libprotobuf to conda recipe build and host requirements
- Add libprotobuf to the dependencies.yaml build_cpp section
- Regenerate conda environment files with libprotobuf

Fixes a CI build failure where the protobuf library was not found during CMake configuration for the remote solve server implementation.
/ok to test 28ef558
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @conda/environments/all_cuda-130_arch-x86_64.yaml.orig:
- Around line 1-82: The file all_cuda-130_arch-x86_64.yaml.orig should be
deleted from the commit because it represents an unsupported CUDA 13.0
environment and is missing the libprotobuf dependency; remove the file entirely
from the branch (git rm --cached/--force as appropriate) and commit that
deletion, then verify no other generated env files reference CUDA 13.0 or lack
libprotobuf so the supported CUDA 12.9 and 13.1 variants remain intact.
In @conda/recipes/libcuopt/recipe.yaml:
- Line 65: The run requirements for the libcuopt package are missing
libprotobuf, causing runtime link failures because cpp/CMakeLists.txt links
libcuopt.so against protobuf::libprotobuf; update the recipe.yaml run section
for the libcuopt output (the run list around the current entries pinned at the
run block) to include libprotobuf alongside libboost-devel and librmm (i.e.,
insert a run entry "libprotobuf" in the run dependencies list so consumers of
libcuopt get protobuf at runtime).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
conda/environments/all_cuda-129_arch-aarch64.yaml
conda/environments/all_cuda-129_arch-x86_64.yaml
conda/environments/all_cuda-130_arch-x86_64.yaml.orig
conda/environments/all_cuda-131_arch-aarch64.yaml
conda/environments/all_cuda-131_arch-x86_64.yaml
conda/recipes/libcuopt/recipe.yaml
dependencies.yaml
🔇 Additional comments (5)
conda/environments/all_cuda-131_arch-aarch64.yaml (1)
35-35: LGTM - libprotobuf dependency correctly added. The addition aligns with the dependencies.yaml source file change and supports the new protobuf-based remote solve functionality.
conda/environments/all_cuda-129_arch-x86_64.yaml (1)
35-35: LGTM - Consistent with other environment files.
conda/environments/all_cuda-131_arch-x86_64.yaml (1)
35-35: LGTM - Consistent with other environment files.
dependencies.yaml (1)
296-305: LGTM - libprotobuf correctly added to build_cpp dependencies. The placement under build_cpp.common with output_types: conda is appropriate for the C++ build dependency needed for protobuf-based remote serialization. This follows the existing pattern of other build dependencies in this section.
conda/environments/all_cuda-129_arch-aarch64.yaml (1)
35-35: LGTM! The libprotobuf dependency addition is consistent with the PR's protobuf-based remote solve serialization requirements. The alphabetical placement is correct.
# This file is generated by `rapids-dependency-file-generator`.
# To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.
channels:
- rapidsai-nightly
- rapidsai
- conda-forge
dependencies:
- boost
- breathe
- bzip2
- c-compiler
- ccache
- clang-tools=20.1.4
- clang==20.1.4
- cmake>=3.30.4
- cpp-argparse
- cuda-nvcc
- cuda-nvtx-dev
- cuda-python>=13.0.1,<14.0a0
- cuda-sanitizer-api
- cuda-version=13.0
- cudf==26.2.*,>=0.0.0a0
- cupy>=13.6.0
- cxx-compiler
- cython>=3.0.3
- doxygen=1.9.1
- exhale
- fastapi
- gcc_linux-64=14.*
- gmock
- gtest
- ipython
- jsonref==1.1.0
- libcudss-dev >=0.7
- libcurand-dev
- libcusolver-dev
- libcusparse-dev
- libraft-headers==26.2.*,>=0.0.0a0
- librmm==26.2.*,>=0.0.0a0
- make
- msgpack-numpy==0.4.8
- msgpack-python==1.1.0
- myst-nb
- myst-parser
- ninja
- notebook
- numba-cuda>=0.19.1,<0.20.0a0
- numba>=0.60.0
- numpy>=1.23.5,<3.0a0
- numpydoc
- pandas>=2.0
- pexpect
- pip
- pre-commit
- psutil>=6.0.0
- pylibraft==26.2.*,>=0.0.0a0
- pyrsistent
- pytest-cov
- pytest<8
- python>=3.10,<3.14
- rapids-build-backend>=0.4.0,<0.5.0.dev0
- rapids-dask-dependency==26.2.*,>=0.0.0a0
- rapids-logger==0.2.*,>=0.0.0a0
- requests
- rmm==26.2.*,>=0.0.0a0
- scikit-build-core>=0.10.0
- sphinx
- sphinx-copybutton
- sphinx-design
- sphinx-markdown-tables
- sphinx_rtd_theme
- sphinxcontrib-openapi
- sphinxcontrib-websupport
- sysroot_linux-64==2.28
- tbb-devel
- uvicorn==0.34.*
- zlib
- pip:
  - nvidia_sphinx_theme
  - swagger-plugin-for-sphinx
  - veroviz
name: all_cuda-130_arch-x86_64
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if the non-.orig version exists
echo "=== Checking for existing all_cuda-130_arch-x86_64.yaml ==="
fd -t f "all_cuda-130_arch-x86_64.yaml" conda/environments/
echo ""
echo "=== Checking libprotobuf presence in all environment files ==="
for f in conda/environments/all_cuda-*.yaml; do
if [ -f "$f" ]; then
echo -n "$f: "
grep -q "libprotobuf" "$f" && echo "HAS libprotobuf" || echo "MISSING libprotobuf"
fi
done
echo ""
echo "=== Listing any .orig files that may be artifacts ==="
fd -e orig . conda/environments/
echo ""
echo "=== Checking git status and history for .orig files ==="
cd conda/environments && git status --short | grep orig || echo "No .orig files in git status"
Repository: NVIDIA/cuopt
Length of output: 694
🏁 Script executed:
#!/bin/bash
# Check if .orig file is tracked in git
echo "=== Checking if .orig file is tracked in git ==="
git ls-files | grep "all_cuda-130_arch-x86_64.yaml.orig"
echo ""
echo "=== Checking git log for this file ==="
git log --oneline -- conda/environments/all_cuda-130_arch-x86_64.yaml.orig | head -5
echo ""
echo "=== List all all_cuda*.yaml* files to see naming pattern ==="
ls -la conda/environments/all_cuda*.yaml*
Repository: NVIDIA/cuopt
Length of output: 804
🏁 Script executed:
#!/bin/bash
echo "=== Check all environment files in the directory ==="
ls -1 conda/environments/all_cuda*.yaml* | sort
echo ""
echo "=== Check what's in the git commit that added libprotobuf ==="
git show --stat 28ef558 | head -50
echo ""
echo "=== Check the full diff for the .orig file in that commit ==="
git show 28ef558 -- conda/environments/all_cuda-130_arch-x86_64.yaml.orig | head -100
Repository: NVIDIA/cuopt
Length of output: 5691
🏁 Script executed:
#!/bin/bash
echo "=== Check if dependencies.yaml mentions CUDA 130 ==="
grep -A 10 "130" dependencies.yaml | head -30
echo ""
echo "=== Count how many env variants are defined for each CUDA version in dependencies.yaml ==="
grep -E "cuda-130|cuda-129|cuda-131" dependencies.yaml | wc -l
Repository: NVIDIA/cuopt
Length of output: 202
🏁 Script executed:
#!/bin/bash
echo "=== Search for CUDA version patterns in dependencies.yaml ==="
grep -i "cuda" dependencies.yaml | head -20
echo ""
echo "=== Check the structure of dependencies.yaml ==="
head -100 dependencies.yaml
echo ""
echo "=== Look for arch and cuda variants ==="
grep -E "arch|cuda" dependencies.yaml | grep -i "x86_64\|aarch64\|130\|129\|131" | head -20
Repository: NVIDIA/cuopt
Length of output: 3306
🏁 Script executed:
#!/bin/bash
echo "=== Check git history for the .orig file ==="
git log --all --oneline -- conda/environments/all_cuda-130_arch-x86_64.yaml.orig
echo ""
echo "=== Check if there are any references to CUDA 130 in the codebase ==="
rg "130" conda/ --type yaml | head -20
echo ""
echo "=== Verify that CUDA 129 and 131 both have both arch variants ==="
echo "CUDA 129:"
ls -1 conda/environments/all_cuda-129* | sort
echo ""
echo "CUDA 131:"
ls -1 conda/environments/all_cuda-131* | sort
Repository: NVIDIA/cuopt
Length of output: 504
Delete the all_cuda-130_arch-x86_64.yaml.orig file—it should not be committed.
This file represents an unsupported CUDA version. The dependencies.yaml only defines CUDA versions 12.9 and 13.1, which generate environment files for CUDA 129 and 131. CUDA 130 is not in the supported matrix and has no corresponding aarch64 variant (unlike the 129 and 131 versions which both have x86_64 and aarch64 pairs).
Additionally, this file is missing the libprotobuf dependency that was added across all other environment files in this PR. It appears to be a leftover from the file regeneration process and should be removed entirely rather than renamed.
🤖 Prompt for AI Agents
In @conda/environments/all_cuda-130_arch-x86_64.yaml.orig around lines 1 - 82,
The file all_cuda-130_arch-x86_64.yaml.orig should be deleted from the commit
because it represents an unsupported CUDA 13.0 environment and is missing the
libprotobuf dependency; remove the file entirely from the branch (git rm
--cached/--force as appropriate) and commit that deletion, then verify no other
generated env files reference CUDA 13.0 or lack libprotobuf so the supported
CUDA 12.9 and 13.1 variants remain intact.
- tbb-devel
- zlib
- bzip2
- libprotobuf
🧩 Analysis chain
🏁 Script executed:
# Find the recipe.yaml file and check its structure
find . -name "recipe.yaml" -path "*libcuopt*" 2>/dev/null
Repository: NVIDIA/cuopt
Length of output: 92
🏁 Script executed:
# Look for CMakeLists.txt files that might contain protobuf references
find . -name "CMakeLists.txt" 2>/dev/null | head -20
Repository: NVIDIA/cuopt
Length of output: 799
🏁 Script executed:
# Search for any protobuf references in the repository
rg -l "protobuf" --type yaml --type cmake 2>/dev/null | head -20
Repository: NVIDIA/cuopt
Length of output: 367
🏁 Script executed:
# Read the recipe.yaml file to see the structure and all requirements
cat -n ./conda/recipes/libcuopt/recipe.yaml
Repository: NVIDIA/cuopt
Length of output: 8706
🏁 Script executed:
# Check cpp/CMakeLists.txt for protobuf linking details
rg -n "protobuf" ./cpp/CMakeLists.txt -A 3 -B 1
Repository: NVIDIA/cuopt
Length of output: 380
🏁 Script executed:
# Also check the main source CMakeLists for protobuf
rg -n "protobuf" ./cpp/src/CMakeLists.txt -A 3 -B 1
Repository: NVIDIA/cuopt
Length of output: 38
🏁 Script executed:
# Search for how protobuf is found/configured in CMakeLists
rg -n "find_package.*protobuf" ./cpp/CMakeLists.txt -i -A 5
Repository: NVIDIA/cuopt
Length of output: 342
🏁 Script executed:
# Check if there's any mention of static protobuf linking
rg -n "protobuf.*STATIC\|Protobuf_USE_STATIC_LIBS\|Protobuf.*static" ./cpp/CMakeLists.txt -i
Repository: NVIDIA/cuopt
Length of output: 38
🏁 Script executed:
# Look for the full context of the protobuf link in libcuopt
rg -n "target_link_libraries.*cuopt" ./cpp/CMakeLists.txt -B 10 -A 10 | head -60
Repository: NVIDIA/cuopt
Length of output: 1934
libprotobuf missing from libcuopt run requirements — add at line 161.
While libprotobuf is correctly added to build and host requirements, it must also be in the run requirements for the libcuopt output package. The CMakeLists.txt dynamically links libcuopt.so against protobuf::libprotobuf (line 370 in cpp/CMakeLists.txt), which means consumers of libcuopt require protobuf at runtime. Without this dependency, binaries and libraries linking to libcuopt.so will fail with missing symbol errors.
Add libprotobuf to the run requirements at line 161:
run:
- ${{ pin_compatible("cuda-version", upper_bound="x", lower_bound="x") }}
- ${{ pin_subpackage("libmps-parser", exact=True) }}
- libboost-devel
- libprotobuf
- librmm =${{ minor_version }}
🤖 Prompt for AI Agents
In @conda/recipes/libcuopt/recipe.yaml at line 65, The run requirements for the
libcuopt package are missing libprotobuf, causing runtime link failures because
cpp/CMakeLists.txt links libcuopt.so against protobuf::libprotobuf; update the
recipe.yaml run section for the libcuopt output (the run list around the current
entries pinned at the run block) to include libprotobuf alongside libboost-devel
and librmm (i.e., insert a run entry "libprotobuf" in the run dependencies list
so consumers of libcuopt get protobuf at runtime).
- Add depends_on_protobuf section to dependencies.yaml
- Include protobuf in py_rapids_build_libcuopt requirements
- Regenerate python/libcuopt/pyproject.toml with protobuf

Fixes a Python wheel CI build failure where CMake could not find protobuf during the libcuopt wheel build. The previous commit added protobuf to conda builds; this commit adds it to pip wheel builds.
/ok to test 3b4e741
TODO: take another pass and replace client protocol with gRPC
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@ci/utils/install_protobuf.sh`:
- Around line 9-25: Pin Protobuf to a compatible version across all places:
update dependencies.yaml to require a specific Protobuf range (e.g. "protobuf:
>=3.20,<4.0"), update CMakeLists.txt's find_package invocation to include the
minimum version (e.g. change find_package(Protobuf REQUIRED) to
find_package(Protobuf 3.20 REQUIRED)), and change this install script
(install_protobuf.sh) to install that pinned version via the package manager
(adjust the dnf/apt package names or add version qualifiers / use apt-get
install protobuf-compiler=3.20.* or the equivalent distro-specific
package/version) and also ensure conda environment specs are updated to the same
version range; make sure error branches still instruct manual install if the
exact version is unavailable.
🧹 Nitpick comments (2)
ci/utils/install_protobuf.sh (2)
14-17: Consider cleaning up apt cache to reduce image size. In CI/container environments, cleaning up the apt cache after installation reduces the final image size.
Suggested improvement
  elif [[ "$ID" == "ubuntu" ]]; then
    echo "Detected Ubuntu. Installing Protobuf via apt..."
    apt-get update
-   apt-get install -y libprotobuf-dev protobuf-compiler
+   apt-get install -y --no-install-recommends libprotobuf-dev protobuf-compiler
+   rm -rf /var/lib/apt/lists/*
  else
11-21: Consider adding Debian support. Since Debian is closely related to Ubuntu and uses the same package manager, you might want to support it explicitly. This would help if the CI environment or developer machines use Debian.
Suggested improvement
  if [[ "$ID" == "rocky" ]]; then
    echo "Detected Rocky Linux. Installing Protobuf via dnf..."
    dnf install -y protobuf-devel protobuf-compiler
- elif [[ "$ID" == "ubuntu" ]]; then
-   echo "Detected Ubuntu. Installing Protobuf via apt..."
+ elif [[ "$ID" == "ubuntu" || "$ID" == "debian" ]]; then
+   echo "Detected $ID. Installing Protobuf via apt..."
    apt-get update
    apt-get install -y libprotobuf-dev protobuf-compiler
  else
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
ci/build_wheel_libcuopt.shci/utils/install_protobuf.sh
🔇 Additional comments (2)
ci/utils/install_protobuf.sh (1)
1-6: Good shell script setup. The strict mode (set -euo pipefail) ensures the script fails fast on errors, which is appropriate for CI scripts.
ci/build_wheel_libcuopt.sh (1)
20-22: LGTM! The protobuf installation step is correctly placed in the dependency installation sequence, following the established pattern used by other dependencies (Boost, TBB, cuDSS). The script's set -euo pipefail ensures any installation failure will abort the build.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
```shell
if [ -f /etc/os-release ]; then
  . /etc/os-release
  if [[ "$ID" == "rocky" ]]; then
    echo "Detected Rocky Linux. Installing Protobuf via dnf..."
    dnf install -y protobuf-devel protobuf-compiler
  elif [[ "$ID" == "ubuntu" ]]; then
    echo "Detected Ubuntu. Installing Protobuf via apt..."
    apt-get update
    apt-get install -y libprotobuf-dev protobuf-compiler
  else
    echo "Unknown OS: $ID. Please install Protobuf development libraries manually."
    exit 1
  fi
else
  echo "/etc/os-release not found. Cannot determine OS. Please install Protobuf development libraries manually."
  exit 1
fi
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if there are version constraints specified elsewhere in the project
echo "=== Checking dependency files for protobuf version constraints ==="
rg "protobuf" --type yaml --type txt -g '!*.pb.*' -n
echo ""
echo "=== Checking CMake files for protobuf version requirements ==="
rg "find_package.*[Pp]rotobuf" -A2 -B2 -n
echo ""
echo "=== Searching for CMakeLists.txt files ==="
fd "CMakeLists.txt" -x cat -n {}
echo ""
echo "=== Checking for .proto files ==="
fd "\.proto$" -x cat -n {}
echo ""
echo "=== Checking requirements.txt or similar Python dependency files ==="
fd -E node_modules "(requirements.*\.txt|setup\.py|pyproject\.toml|poetry\.lock|Pipfile)" -x cat -n {}
echo ""
echo "=== Checking for docker or container files ==="
fd "(Dockerfile|\.dockerignore)" -x cat -n {}

Repository: NVIDIA/cuopt
Length of output: 50369
🏁 Script executed:
#!/bin/bash
echo "=== Checking dependencies.yaml for protobuf version constraints ==="
rg -A5 "depends_on_protobuf|libprotobuf" dependencies.yaml
echo ""
echo "=== Checking conda environment files for libprotobuf version specs ==="
fd "all_cuda.*\.yaml" conda/environments/ -x grep -A1 -B1 "libprotobuf" {}
echo ""
echo "=== Checking conda recipe for libprotobuf ==="
grep -A2 -B2 "libprotobuf" conda/recipes/libcuopt/recipe.yaml
echo ""
echo "=== Checking if protobuf has version specifications anywhere ==="
rg "protobuf.*==" -n

Repository: NVIDIA/cuopt
Length of output: 1147
🏁 Script executed:
#!/bin/bash
echo "=== Looking for .proto file to understand protobuf usage ==="
find . -name "*.proto" -type f -exec cat -n {} +
echo ""
echo "=== Checking what's in linear_programming/utilities/protobuf_serializer.cu ==="
fd "protobuf_serializer" -x head -30 {}

Repository: NVIDIA/cuopt
Length of output: 14312
Add protobuf version constraints to ensure compatibility across distributions.
Protobuf version mismatches between build and runtime environments can cause serialization/deserialization issues. Currently, protobuf has no version pinning in dependencies.yaml, conda environments, or CMakeLists.txt (find_package(Protobuf REQUIRED) lacks version specification), while the install script installs without version constraints. Given that this PR introduces protobuf-based serialization for remote solving, add explicit version constraints to dependencies.yaml, CMakeLists.txt, and update the install script to match the pinned version.
🤖 Prompt for AI Agents
In `@ci/utils/install_protobuf.sh` around lines 9 - 25, Pin Protobuf to a
compatible version across all places: update dependencies.yaml to require a
specific Protobuf range (e.g. "protobuf: >=3.20,<4.0"), update CMakeLists.txt's
find_package invocation to include the minimum version (e.g. change
find_package(Protobuf REQUIRED) to find_package(Protobuf 3.20 REQUIRED)), and
change this install script (install_protobuf.sh) to install that pinned version
via the package manager (adjust the dnf/apt package names or add version
qualifiers / use apt-get install protobuf-compiler=3.20.* or the equivalent
distro-specific package/version) and also ensure conda environment specs are
updated to the same version range; make sure error branches still instruct
manual install if the exact version is unavailable.
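For illustration only, the distro-specific pinning the prompt asks for could be factored as below. The 3.20 range, the package-spec syntax, and the `pinned_protobuf_pkgs` helper are assumptions for this sketch, not values pinned anywhere in the repo:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: derive a pinned protobuf package spec per distro,
# so dnf and apt branches install the same version range.
pinned_protobuf_pkgs() {
  case "$1" in
    rocky)
      # dnf accepts name-version globs
      echo "protobuf-devel-3.20* protobuf-compiler-3.20*"
      ;;
    ubuntu|debian)
      # apt pins a version with pkg=version
      echo "libprotobuf-dev=3.20.* protobuf-compiler=3.20.*"
      ;;
    *)
      echo "unsupported"
      return 1
      ;;
  esac
}

pinned_protobuf_pkgs ubuntu
```

The install branches would then do e.g. `apt-get install -y $(pinned_protobuf_pkgs "$ID")`, falling back to the manual-install message when the spec is unavailable.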
/ok to test 0dcd54a

/ok to test de8daf7

/ok to test 7660217
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@cpp/CMakeLists.txt`:
- Around line 621-624: Wrap the
add_subdirectory(src/linear_programming/utilities/serializers) call with the
BUILD_REMOTE_SERVER conditional so the serializers plugin is only built when
BUILD_REMOTE_SERVER is enabled; locate the existing add_subdirectory invocation
in CMakeLists and move or surround it with the if(BUILD_REMOTE_SERVER) ...
endif() block (keeping it near other remote-server related entries and
references such as cuopt_remote_server.cpp) to prevent building the msgpack
serializer when remote server support is disabled.
🧹 Nitpick comments (2)
cpp/CMakeLists.txt (2)
234-238: Prefer target-specific includes over global `include_directories()`.

Using global `include_directories()` adds paths to all targets in this directory, which can cause unintended include pollution. Modern CMake best practice is to use `target_include_directories()` on specific targets that need the paths. Since multiple targets need the generated proto headers (cuopt, cuopt_remote_server), consider either:

- Adding the paths explicitly to each target that needs them, or
- Creating an interface library for the proto headers

♻️ Suggested approach using target-specific includes

```diff
 # Protocol Buffers for remote solve serialization
 find_package(Protobuf REQUIRED)
-include_directories(SYSTEM ${Protobuf_INCLUDE_DIRS})
-include_directories(${CMAKE_CURRENT_BINARY_DIR})
+# Protobuf includes added to targets via target_include_directories below
```

Then add to the cuopt target (around lines 305-319):

```cmake
target_include_directories(cuopt SYSTEM PRIVATE ${Protobuf_INCLUDE_DIRS})
target_include_directories(cuopt PRIVATE ${CMAKE_CURRENT_BINARY_DIR})
```

And similarly for cuopt_remote_server.
575-581: Consider using PRIVATE for executable link dependencies.

For executable targets, the `PUBLIC`/`PRIVATE` distinction on `target_link_libraries` has no practical effect on the executable itself. Using `PRIVATE` is the conventional choice and more clearly expresses intent.

♻️ Suggested change

```diff
 target_link_libraries(cuopt_remote_server
-  PUBLIC
-    cuopt
-    OpenMP::OpenMP_CXX
   PRIVATE
+    cuopt
+    OpenMP::OpenMP_CXX
     protobuf::libprotobuf
 )

 target_link_libraries(cuopt_cancel_job
-  PUBLIC
-    cuopt
+  PRIVATE
+    cuopt
 )
```

Also applies to: 609-612
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cpp/CMakeLists.txt
🧰 Additional context used
🧠 Learnings (11)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z

All of the following learnings were applied to cpp/CMakeLists.txt:

- Applies to **/*.{cu,cuh,cpp,hpp,h}: Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
- Applies to **/*.{cu,cuh}: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification
- Applies to **/*.{cu,cpp,hpp,h}: Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
- Applies to **/*.{cu,cuh}: Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
- Applies to **/*.cu: Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
- Applies to **/*.{cu,cuh,cpp,hpp,h}: For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
🔇 Additional comments (3)
cpp/CMakeLists.txt (3)
142-147: LGTM!

Reasonable workaround for GCC's known false-positive `-Warray-bounds` diagnostics from external headers like Protobuf, while preserving `-Werror` for other warnings.

239-253: Proto generation setup looks correct.

The custom command is properly configured with `DEPENDS` for rebuild tracking and `VERBATIM` for safe argument handling. Minor observation: `PROTO_HDRS` is defined but unused. Consider adding it to the cuopt target's sources or `target_sources()` to ensure proper dependency tracking for the header file, though in practice the `.cc` file inclusion typically suffices.

264-267: LGTM!

The protobuf sources are correctly added to the cuopt library, and the linking arrangement is appropriate:

- `mps_parser_static` embeds the parser symbols directly
- `protobuf::libprotobuf` as PRIVATE keeps it an implementation detail

Also applies to: 375-377
```cmake
# ##################################################################################################
# - Pluggable Serializers --------------------------------------------------------------------------
# Build optional serializer plugins (e.g., msgpack)
add_subdirectory(src/linear_programming/utilities/serializers)
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if serializers are used outside of remote solve code paths
# Find references to serializer in the codebase excluding the serializers directory itself
echo "=== References to serializer outside serializers directory ==="
rg -n "serializer" --type cpp --type cmake -g '!**/serializers/**' -g '!**/test**' | head -100
echo ""
echo "=== Check if serializers CMakeLists.txt has its own guards ==="
fd -t f "CMakeLists.txt" --path "*/serializers/*" -exec cat {} \;
echo ""
echo "=== Check the context around line 621-624 in CMakeLists.txt ==="
rg -B 30 -A 5 "add_subdirectory.*serializers" cpp/CMakeLists.txt

Repository: NVIDIA/cuopt
Length of output: 7539
🏁 Script executed:
#!/bin/bash
# Get serializers CMakeLists.txt
echo "=== Serializers CMakeLists.txt ==="
find . -path "*/serializers/CMakeLists.txt" -exec cat {} \;
echo ""
echo "=== Check BUILD_REMOTE_SERVER conditional structure in main CMakeLists.txt ==="
# Search for BUILD_REMOTE_SERVER conditions to see the structure
rg -n "BUILD_REMOTE_SERVER|add_subdirectory.*serializers" cpp/CMakeLists.txt | tail -50
echo ""
echo "=== Check protobuf_serializer.cu location and context ==="
fd -t f "protobuf_serializer.cu" -exec head -20 {} \;
echo ""
echo "=== Check if protobuf_serializer is conditioned in src/linear_programming/CMakeLists.txt ==="
rg -B 5 -A 5 "protobuf_serializer" cpp/src/linear_programming/CMakeLists.txt

Repository: NVIDIA/cuopt
Length of output: 2658
Move the serializers subdirectory inside the BUILD_REMOTE_SERVER conditional.
The add_subdirectory(src/linear_programming/utilities/serializers) call builds the msgpack serializer plugin, which is only used by the remote server. All serializer references in the codebase are confined to cuopt_remote_server.cpp. Building this optional plugin when remote server support is disabled is wasteful and should be gated behind BUILD_REMOTE_SERVER.
🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 621 - 624, Wrap the
add_subdirectory(src/linear_programming/utilities/serializers) call with the
BUILD_REMOTE_SERVER conditional so the serializers plugin is only built when
BUILD_REMOTE_SERVER is enabled; locate the existing add_subdirectory invocation
in CMakeLists and move or surround it with the if(BUILD_REMOTE_SERVER) ...
endif() block (keeping it near other remote-server related entries and
references such as cuopt_remote_server.cpp) to prevent building the msgpack
serializer when remote server support is disabled.
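One way the guard described in this prompt could look as a sketch; the option names come from the review comments, but the exact placement in the file is an assumption:

```cmake
if(BUILD_REMOTE_SERVER AND NOT BUILD_LP_ONLY)
  # Serializer plugins (e.g., msgpack) are only consumed by
  # cuopt_remote_server, so build them only alongside it.
  add_subdirectory(src/linear_programming/utilities/serializers)
endif()
```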
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@cpp/CMakeLists.txt`:
- Around line 266-269: The cuopt library is currently built with ${PROTO_SRCS}
always appended which breaks builds when Protobuf is disabled; ensure PROTO_SRCS
is defined to an empty value when the protobuf/remote-server branch is not taken
(e.g., set(PROTO_SRCS "" ) or guard with if(NOT DEFINED PROTO_SRCS)
set(PROTO_SRCS "")) so the add_library(cuopt SHARED ${CUOPT_SRC_FILES}
${PROTO_SRCS}) line works unconditionally; update the CMake logic that defines
PROTO_SRCS (the conditional that generates/sets it) to provide this default.
- Around line 236-255: Wrap the Protobuf dependency and proto generation in an
if(BUILD_REMOTE_SERVER) ... endif() guard so find_package(Protobuf REQUIRED),
the add_custom_command that produces PROTO_SRCS/PROTO_HDRS, and setting
PROTO_FILE/PROTO_SRCS/PROTO_HDRS only run when BUILD_REMOTE_SERVER is ON; remove
the unconditional include_directories(SYSTEM ${Protobuf_INCLUDE_DIRS}) and
include_directories(${CMAKE_CURRENT_BINARY_DIR}) and instead call
target_include_directories(<the target that consumes PROTO_SRCS/PROTO_HDRS>
PRIVATE ${CMAKE_CURRENT_BINARY_DIR} SYSTEM ${Protobuf_INCLUDE_DIRS}) so only the
target that uses the generated headers gets the includes.
♻️ Duplicate comments (1)
cpp/CMakeLists.txt (1)
623-626: Move the serializers subdirectory inside the `BUILD_REMOTE_SERVER` conditional.

This was flagged in a previous review. The `add_subdirectory(src/linear_programming/utilities/serializers)` call builds the msgpack serializer plugin, which is only used by the remote server.

♻️ Suggested fix

```diff
-# ##################################################################################################
-# - Pluggable Serializers --------------------------------------------------------------------------
-# Build optional serializer plugins (e.g., msgpack)
-add_subdirectory(src/linear_programming/utilities/serializers)
```

Move this inside the `if(BUILD_REMOTE_SERVER AND NOT BUILD_LP_ONLY)` block above (before line 621).
🧹 Nitpick comments (2)
cpp/CMakeLists.txt (2)
376-379: Protobuf linkage should be conditional.

The `protobuf::libprotobuf` is linked unconditionally. If the Protobuf dependency is made conditional on `BUILD_REMOTE_SERVER`, this linkage should also be guarded:

♻️ Suggested refactor

```diff
   PRIVATE
     ${CUOPT_PRIVATE_CUDA_LIBS}
     cuopt::mps_parser_static # Static link - symbols embedded in libcuopt.so
-    protobuf::libprotobuf
+    $<$<BOOL:${BUILD_REMOTE_SERVER}>:protobuf::libprotobuf>
 )
```
559-621: Remote server build section is well-structured.

The `BUILD_REMOTE_SERVER` guard and RUNPATH configuration are appropriate. Minor suggestions:

- Redundant include directories (lines 567-576): For executables, `PUBLIC` include directories are not meaningful since nothing links against them. The same paths appear in both `PRIVATE` and `PUBLIC`.
- Executable linkage (lines 577-583, 611-614): For executables, `PUBLIC` vs `PRIVATE` linkage is semantically equivalent. Consider using only `PRIVATE` for clarity.

♻️ Optional cleanup

```diff
 target_include_directories(cuopt_remote_server
   PRIVATE
     "${CMAKE_CURRENT_SOURCE_DIR}/src"
     "${CMAKE_CURRENT_SOURCE_DIR}/include"
     "${CMAKE_CURRENT_SOURCE_DIR}/libmps_parser/include"
     "${CMAKE_CURRENT_BINARY_DIR}"
-  PUBLIC
     "$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>"
     "$<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}/include>"
 )

 target_link_libraries(cuopt_remote_server
-  PUBLIC
-    cuopt
-    OpenMP::OpenMP_CXX
   PRIVATE
+    cuopt
+    OpenMP::OpenMP_CXX
     protobuf::libprotobuf
 )
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
cpp/CMakeLists.txt
🧰 Additional context used
🧠 Learnings (12)
📚 Learning: 2025-11-25T10:20:49.822Z
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z

All of the following learnings were applied to cpp/CMakeLists.txt:

- Applies to **/*.{cu,cuh,cpp,hpp,h}: Check that hard-coded GPU device IDs and resource limits are made configurable; abstract multi-backend support for different CUDA versions
- Applies to **/*.{cpp,hpp,h}: Prevent thread-unsafe use of global and static variables; use proper mutex/synchronization in server code accessing shared solver state
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Ensure race conditions are absent in multi-GPU code and multi-threaded server implementations; verify proper synchronization of shared state
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Verify error propagation from CUDA to user-facing APIs is complete; ensure CUDA errors are caught and mapped to meaningful user error codes
- Applies to **/*.{cu,cuh}: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification
- Applies to **/*.{cu,cpp,hpp,h}: Avoid inappropriate use of exceptions in performance-critical GPU operation paths; prefer error codes or CUDA error checking for latency-sensitive code
- Applies to **/*.{cu,cuh}: Avoid reinventing functionality already available in Thrust, CCCL, or RMM libraries; prefer standard library utilities over custom implementations
- Applies to **/*.cu: Detect inefficient GPU kernel launches with low occupancy or poor memory access patterns; optimize for coalesced memory access and minimize warp divergence in hot paths
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems
- Applies to **/*.{cu,cuh,cpp,hpp,h}: Track GPU device memory allocations and deallocations to prevent memory leaks; ensure cudaMalloc/cudaFree balance and cleanup of streams/events
- Applies to **/*.{cu,cuh,cpp,hpp,h}: For concurrent CUDA operations (barriers, async operations), explicitly create and manage dedicated streams instead of reusing the default stream; document stream lifecycle
🔇 Additional comments (1)
cpp/CMakeLists.txt (1)
142-149: LGTM: Pragmatic warning suppression for third-party headers.

The GCC-specific suppression of `-Warray-bounds` and `-Wstringop-overread` is a reasonable approach to avoid false positives from Protobuf headers while maintaining strict warnings for project code.
```cmake
# Protocol Buffers for remote solve serialization
find_package(Protobuf REQUIRED)
include_directories(SYSTEM ${Protobuf_INCLUDE_DIRS})
include_directories(${CMAKE_CURRENT_BINARY_DIR})

# Generate C++ code from .proto file
set(PROTO_FILE "${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities/cuopt_remote.proto")
set(PROTO_SRCS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.cc")
set(PROTO_HDRS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.h")

add_custom_command(
  OUTPUT "${PROTO_SRCS}" "${PROTO_HDRS}"
  COMMAND ${Protobuf_PROTOC_EXECUTABLE}
  ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
       --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
       ${PROTO_FILE}
  DEPENDS ${PROTO_FILE}
  COMMENT "Generating C++ code from cuopt_remote.proto"
  VERBATIM
)
```
Protobuf is unconditionally required even when remote server is disabled.
The find_package(Protobuf REQUIRED) and proto generation occur unconditionally, adding a mandatory build dependency for all users even when BUILD_REMOTE_SERVER=OFF. This increases build complexity and external dependencies for users who only need local solves.
Consider making Protobuf conditional on BUILD_REMOTE_SERVER:
♻️ Suggested refactor
```diff
-# Protocol Buffers for remote solve serialization
-find_package(Protobuf REQUIRED)
-include_directories(SYSTEM ${Protobuf_INCLUDE_DIRS})
-include_directories(${CMAKE_CURRENT_BINARY_DIR})
-
-# Generate C++ code from .proto file
-set(PROTO_FILE "${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities/cuopt_remote.proto")
-set(PROTO_SRCS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.cc")
-set(PROTO_HDRS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.h")
-
-add_custom_command(
-  OUTPUT "${PROTO_SRCS}" "${PROTO_HDRS}"
-  COMMAND ${Protobuf_PROTOC_EXECUTABLE}
-  ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
-       --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
-       ${PROTO_FILE}
-  DEPENDS ${PROTO_FILE}
-  COMMENT "Generating C++ code from cuopt_remote.proto"
-  VERBATIM
-)
+# Protocol Buffers for remote solve serialization (optional)
+set(PROTO_SRCS "")
+if(BUILD_REMOTE_SERVER)
+  find_package(Protobuf REQUIRED)
+
+  set(PROTO_FILE "${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities/cuopt_remote.proto")
+  set(PROTO_SRCS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.cc")
+  set(PROTO_HDRS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.h")
+
+  add_custom_command(
+    OUTPUT "${PROTO_SRCS}" "${PROTO_HDRS}"
+    COMMAND ${Protobuf_PROTOC_EXECUTABLE}
+    ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
+         --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
+         ${PROTO_FILE}
+    DEPENDS ${PROTO_FILE}
+    COMMENT "Generating C++ code from cuopt_remote.proto"
+    VERBATIM
+  )
+endif()
```
+endif()Additionally, prefer target_include_directories() over global include_directories() for better encapsulation.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```cmake
# Protocol Buffers for remote solve serialization (optional)
set(PROTO_SRCS "")
if(BUILD_REMOTE_SERVER)
  find_package(Protobuf REQUIRED)

  set(PROTO_FILE "${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities/cuopt_remote.proto")
  set(PROTO_SRCS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.cc")
  set(PROTO_HDRS "${CMAKE_CURRENT_BINARY_DIR}/cuopt_remote.pb.h")

  add_custom_command(
    OUTPUT "${PROTO_SRCS}" "${PROTO_HDRS}"
    COMMAND ${Protobuf_PROTOC_EXECUTABLE}
    ARGS --cpp_out ${CMAKE_CURRENT_BINARY_DIR}
         --proto_path ${CMAKE_CURRENT_SOURCE_DIR}/src/linear_programming/utilities
         ${PROTO_FILE}
    DEPENDS ${PROTO_FILE}
    COMMENT "Generating C++ code from cuopt_remote.proto"
    VERBATIM
  )
endif()
```
🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 236 - 255, Wrap the Protobuf dependency and
proto generation in an if(BUILD_REMOTE_SERVER) ... endif() guard so
find_package(Protobuf REQUIRED), the add_custom_command that produces
PROTO_SRCS/PROTO_HDRS, and setting PROTO_FILE/PROTO_SRCS/PROTO_HDRS only run
when BUILD_REMOTE_SERVER is ON; remove the unconditional
include_directories(SYSTEM ${Protobuf_INCLUDE_DIRS}) and
include_directories(${CMAKE_CURRENT_BINARY_DIR}) and instead call
target_include_directories(<the target that consumes PROTO_SRCS/PROTO_HDRS>
PRIVATE ${CMAKE_CURRENT_BINARY_DIR} SYSTEM ${Protobuf_INCLUDE_DIRS}) so only the
target that uses the generated headers gets the includes.
```cmake
add_library(cuopt SHARED
  ${CUOPT_SRC_FILES}
  ${PROTO_SRCS}
)
```
Proto sources compiled into main library unconditionally.
The ${PROTO_SRCS} is added to the cuopt library regardless of whether remote server support is enabled. If Protobuf setup is made conditional per the previous comment, ensure PROTO_SRCS defaults to empty so this line works correctly in both cases.
🤖 Prompt for AI Agents
In `@cpp/CMakeLists.txt` around lines 266 - 269, The cuopt library is currently
built with ${PROTO_SRCS} always appended which breaks builds when Protobuf is
disabled; ensure PROTO_SRCS is defined to an empty value when the
protobuf/remote-server branch is not taken (e.g., set(PROTO_SRCS "" ) or guard
with if(NOT DEFINED PROTO_SRCS) set(PROTO_SRCS "")) so the add_library(cuopt
SHARED ${CUOPT_SRC_FILES} ${PROTO_SRCS}) line works unconditionally; update the
CMake logic that defines PROTO_SRCS (the conditional that generates/sets it) to
provide this default.
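The defaulting this prompt describes could be as small as the following sketch (where exactly it sits relative to the protobuf conditional is an assumption):

```cmake
# Default PROTO_SRCS so the unconditional add_library(cuopt SHARED ...) call
# is safe when the protobuf/remote-server branch is not taken.
if(NOT DEFINED PROTO_SRCS)
  set(PROTO_SRCS "")
endif()
```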
closing, superseded by #799
Description
This change adds a new remote execution system which will ultimately replace the cuopt Python server. In this new implementation, remote solves are identical to local solves except that the env vars CUOPT_REMOTE_HOST and CUOPT_REMOTE_PORT are set to indicate a remote server.
The remote server code is included and can be built optionally with "build.sh cuopt_remote_server". It does not build by default.
Internal code paths have been modified to use data_model_view_t with non-owning pointers to store problems and solutions as GPU memory or CPU memory, the former for local solves and the latter for remote solves.
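As a rough sketch of the intended workflow, only the two environment variable names come from this PR; the host, port, and entry-point command are placeholders:

```shell
# Point the engine at a remote server; unsetting these restores local solves.
export CUOPT_REMOTE_HOST=solver.example.com   # placeholder host
export CUOPT_REMOTE_PORT=8765                 # placeholder port

# Any existing entry point then runs unchanged, e.g.:
#   cuopt_cli my_problem.mps
echo "remote solve enabled for ${CUOPT_REMOTE_HOST}:${CUOPT_REMOTE_PORT}"
```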
Some markdown files have been added that describe the changes:
Checklist
Need to:
- evaluate the need for new tests
- review the available markdowns and make sure the feature is well-documented
Summary by CodeRabbit
New Features
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.