Closed

22 commits
79d5157
Change solution to be on host from c++ side
rgsl888prabhu Nov 7, 2025
1ab34ab
fix style
rgsl888prabhu Nov 7, 2025
2160bef
fix
rgsl888prabhu Nov 7, 2025
9ca1325
fix cudf issue
rgsl888prabhu Nov 10, 2025
999a549
fix typo
rgsl888prabhu Nov 10, 2025
a72b785
enable few tests
rgsl888prabhu Nov 10, 2025
1ea0702
remove unsued file
rgsl888prabhu Nov 10, 2025
1a5edbf
Merge branch 'fix_cudf_break' of github.com:rgsl888prabhu/cuopt_publi…
rgsl888prabhu Nov 10, 2025
e735ee1
Merge branch 'main' of github.com:nvidia/cuopt into segfault_issue
rgsl888prabhu Nov 11, 2025
b9d5ea0
Update test_wheel_cuopt.sh
rgsl888prabhu Nov 12, 2025
4a43461
Merge branch 'main' into segfault_issue
rgsl888prabhu Nov 12, 2025
0f4b100
Update test_lp.py
rgsl888prabhu Nov 12, 2025
7511a06
Update utils_wrapper.pyx
rgsl888prabhu Nov 12, 2025
324b412
Update test_lp.py
rgsl888prabhu Nov 12, 2025
816cd21
Merge branch 'release/25.12' into segfault_issue
rgsl888prabhu Nov 17, 2025
582004d
Merge branch 'release/25.12' into segfault_issue
rgsl888prabhu Nov 18, 2025
5e62f50
Merge branch 'release/25.12' into segfault_issue
rgsl888prabhu Nov 19, 2025
d64f387
Merge branch 'release/25.12' into segfault_issue
rgsl888prabhu Nov 20, 2025
2880a7c
Merge branch 'main' into segfault_issue
rgsl888prabhu Jan 5, 2026
6e9f901
Adding best practices and details (#692)
rgsl888prabhu Jan 5, 2026
6632c56
Merge branch 'segfault_issue' of github.com:rgsl888prabhu/cuopt_publi…
rgsl888prabhu Jan 5, 2026
0e66899
fix copyright
rgsl888prabhu Jan 5, 2026
156 changes: 156 additions & 0 deletions .github/AGENTS.md
@@ -0,0 +1,156 @@
# AGENTS.md - AI Coding Agent Guidelines for cuOpt

> This file provides essential context for AI coding assistants (Codex, Cursor, GitHub Copilot, etc.) working with the NVIDIA cuOpt codebase.

> **For setup, building, testing, and contribution guidelines, see [CONTRIBUTING.md](../CONTRIBUTING.md).**

---

## Project Overview

**cuOpt** is NVIDIA's GPU-accelerated optimization engine for:
- **Mixed Integer Linear Programming (MILP)**
- **Linear Programming (LP)**
- **Quadratic Programming (QP)**
- **Vehicle Routing Problems (VRP)** including TSP and PDP

### Architecture

```
cuopt/
├── cpp/ # Core C++ engine (libcuopt, libmps_parser)
│ ├── include/cuopt/ # Public C/C++ headers
│ ├── src/ # Implementation (CUDA kernels, algorithms)
│ └── tests/ # C++ unit tests (gtest)
├── python/
│ ├── cuopt/ # Python bindings and routing API
│ ├── cuopt_server/ # REST API server
│ ├── cuopt_self_hosted/ # Self-hosted deployment utilities
│ └── libcuopt/ # Python wrapper for C library
├── ci/ # CI/CD scripts and Docker configurations
├── conda/ # Conda recipes and environment files
├── docs/ # Documentation source
├── datasets/ # Test datasets for LP, MIP, routing
└── notebooks/ # Example Jupyter notebooks
```

### Supported APIs

| API Type | LP | MILP | QP | Routing |
|----------|:--:|:----:|:--:|:-------:|
| C API | ✓ | ✓ | ✓ | ✗ |
| C++ API | ✓ | ✓ | ✓ | ✓ |
| Python | ✓ | ✓ | ✓ | ✓ |
| Server | ✓ | ✓ | ✗ | ✓ |

---

## Coding Style and Conventions

### C++ Naming Conventions

- **Base style**: `snake_case` for all names (except test cases: PascalCase)
- **Prefixes/Suffixes**:
- `d_` → device data variables (e.g., `d_locations_`)
- `h_` → host data variables (e.g., `h_data_`)
- `_t` → template type parameters (e.g., `i_t`, `value_t`)
- `_` → private member variables (e.g., `n_locations_`)

```cpp
// Example naming pattern
template <typename i_t>
class locations_t {
 private:
  i_t n_locations_{};
  i_t* d_locations_{};  // device pointer
  i_t* h_locations_{};  // host pointer
};
```

### File Extensions

| Extension | Usage |
|-----------|-------|
| `.hpp` | C++ headers |
| `.cpp` | C++ source |
| `.cu` | CUDA C++ source (nvcc required) |
| `.cuh` | CUDA headers with device code |

### Include Order

1. Local headers
2. RAPIDS headers
3. Related libraries
4. Dependencies
5. STL
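
For instance, a source file following this order might look like the sketch below (the specific headers are illustrative placeholders, not taken from a real file in this repo):

```cpp
// 1. Local headers
#include "linear_programming/solve.hpp"  // hypothetical local header

// 2. RAPIDS headers
#include <raft/core/handle.hpp>
#include <rmm/device_uvector.hpp>

// 3. Related libraries / 4. Dependencies
#include <cuda_runtime_api.h>

// 5. STL
#include <memory>
#include <vector>
```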

### Python Style

- Follow PEP 8
- Use type hints where applicable
- Tests use `pytest` framework

### Formatting

- **C++**: Enforced by `clang-format` (config: `cpp/.clang-format`)
- **Python**: Enforced via pre-commit hooks
- See [CONTRIBUTING.md](../CONTRIBUTING.md) for pre-commit setup

---

## Error Handling Patterns

### Runtime Assertions

```cpp
// Use CUOPT_EXPECTS for runtime checks
CUOPT_EXPECTS(lhs.type() == rhs.type(), "Column type mismatch");

// Use CUOPT_FAIL for unreachable code paths
CUOPT_FAIL("This code path should not be reached.");
```

### CUDA Error Checking

```cpp
// Always wrap CUDA calls
RAFT_CUDA_TRY(cudaMemcpy(&dst, &src, num_bytes, cudaMemcpyDefault));
```

---

## Memory Management Guidelines

- **Never use raw `new`/`delete`** - Use RMM allocators
- **Prefer `rmm::device_uvector<T>`** for device memory
- **All operations should be stream-ordered** - Accept `cuda_stream_view`
- **Views (`*_view` suffix) are non-owning** - Don't manage their lifetime
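
A minimal sketch tying these rules together (the helper below is illustrative, not code from this repository; `RAFT_CUDA_TRY`'s header location may differ across RAFT versions):

```cpp
#include <raft/util/cudart_utils.hpp>  // RAFT_CUDA_TRY (location may vary by RAFT version)
#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_uvector.hpp>
#include <thrust/system/cuda/execution_policy.h>
#include <thrust/transform.h>

#include <cuda_runtime_api.h>

#include <vector>

// Illustrative only: stream-ordered RMM allocation (no raw new/delete),
// an explicit cuda_stream_view parameter, and one host sync at the end.
// Must live in a .cu file compiled with extended lambdas enabled.
std::vector<double> scale_on_device(std::vector<double> const& h_in,
                                    double factor,
                                    rmm::cuda_stream_view stream)
{
  rmm::device_uvector<double> d_vals(h_in.size(), stream);  // stream-ordered alloc
  RAFT_CUDA_TRY(cudaMemcpyAsync(d_vals.data(),
                                h_in.data(),
                                h_in.size() * sizeof(double),
                                cudaMemcpyHostToDevice,
                                stream.value()));
  thrust::transform(thrust::cuda::par.on(stream.value()),
                    d_vals.begin(),
                    d_vals.end(),
                    d_vals.begin(),
                    [factor] __device__(double v) { return v * factor; });
  std::vector<double> h_out(h_in.size());
  RAFT_CUDA_TRY(cudaMemcpyAsync(h_out.data(),
                                d_vals.data(),
                                h_in.size() * sizeof(double),
                                cudaMemcpyDeviceToHost,
                                stream.value()));
  stream.synchronize();  // the result is valid on host only after this point
  return h_out;
}
```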

---

## Key Files Reference

| Purpose | Location |
|---------|----------|
| Main build script | `build.sh` |
| Dependencies | `dependencies.yaml` |
| C++ formatting | `cpp/.clang-format` |
| Conda environments | `conda/environments/` |
| Test data download | `datasets/get_test_data.sh` |
| CI configuration | `ci/` |
| Version info | `VERSION` |

---

## Common Pitfalls

| Problem | Solution |
|---------|----------|
| Cython changes not reflected | Rerun: `./build.sh cuopt` |
| Missing `nvcc` | Set `$CUDACXX` or add CUDA to `$PATH` |
| CUDA out of memory | Reduce problem size or use streaming |
| Slow debug library loading | Device symbols cause delay; use selectively |

---

*For detailed setup, build instructions, testing workflows, debugging, and contribution guidelines, see [CONTRIBUTING.md](../CONTRIBUTING.md).*
1 change: 1 addition & 0 deletions .github/CODE_OF_CONDUCT.md
@@ -0,0 +1 @@
This project has adopted the [Contributor Covenant Code of Conduct](https://docs.rapids.ai/resources/conduct/).
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/bug_report.md
@@ -1,5 +1,5 @@
---
-name: Bug report
+name: 🐛 Bug report
about: Create a bug report to help us improve cuOpt
title: "[BUG]"
labels: "? - Needs Triage, bug"
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/documentation-request.md
@@ -1,5 +1,5 @@
---
-name: Documentation request
+name: 📚 Documentation request
about: Report incorrect or needed documentation
title: "[DOC]"
labels: "? - Needs Triage, doc"
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/feature_request.md
@@ -1,5 +1,5 @@
---
-name: Feature request
+name: 🚀 Feature request
about: Suggest an idea for cuOpt
title: "[FEA]"
labels: "? - Needs Triage, feature request"
2 changes: 1 addition & 1 deletion .github/ISSUE_TEMPLATE/submit-question.md
@@ -1,5 +1,5 @@
---
name: Submit question
name: Submit question
about: Ask a general question about cuOpt
title: "[QST]"
labels: "? - Needs Triage, question"
15 changes: 15 additions & 0 deletions .github/SECURITY.md
@@ -0,0 +1,15 @@
Security
---------
NVIDIA is dedicated to the security and trust of our software products and services, including all source code repositories managed through our organization.

If you need to report a security issue, please use the appropriate contact points outlined below. Please do not report security vulnerabilities through GitHub/GitLab.

Reporting Potential Security Vulnerability in NVIDIA cuOpt
----------------------------------------------------------
To report a potential security vulnerability in NVIDIA cuOpt:

- Web: [Security Vulnerability Submission Form](https://www.nvidia.com/object/submit-security-vulnerability.html)
- E-Mail: [psirt@nvidia.com](mailto:psirt@nvidia.com)
- We encourage you to use the following PGP key for secure email communication: [NVIDIA public PGP Key for communication](https://www.nvidia.com/en-us/security/pgp-key)
- Please include the following information:
- Product/Driver name and version/branch that contains the vulnerability
Comment on lines +14 to +15

⚠️ Potential issue | 🟡 Minor

Fix markdown list indentation.

Line 15 has incorrect indentation for a nested list item. Markdown expects 2 spaces of indentation for sub-items, not 4.

🔎 Proposed fix for list indentation
 - Please include the following information:
-    - Product/Driver name and version/branch that contains the vulnerability
+  - Product/Driver name and version/branch that contains the vulnerability
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

15-15: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🤖 Prompt for AI Agents
In @.github/SECURITY.md around lines 14-15, the nested list item "-
Product/Driver name and version/branch that contains the vulnerability" is
over-indented; change its indentation from four spaces to two spaces so it is a
proper sub-item under "Please include the following information:" (i.e., align
the hyphen two spaces in from the parent list line) to fix the Markdown list
nesting.

19 changes: 9 additions & 10 deletions README.md
@@ -1,6 +1,15 @@
# cuOpt - GPU accelerated Optimization Engine

+[![Build Status](https://github.com/NVIDIA/cuopt/actions/workflows/build.yaml/badge.svg)](https://github.com/NVIDIA/cuopt/actions/workflows/build.yaml)
+[![Version](https://img.shields.io/badge/version-26.02.00-blue)](https://github.com/NVIDIA/cuopt/releases)
+[![Documentation](https://img.shields.io/badge/docs-latest-brightgreen)](https://docs.nvidia.com/cuopt/user-guide/latest/introduction.html)
+[![Docker Hub](https://img.shields.io/badge/docker-nvidia%2Fcuopt-blue?logo=docker)](https://hub.docker.com/r/nvidia/cuopt)
+[![Examples](https://img.shields.io/badge/examples-cuopt--examples-orange)](https://github.com/NVIDIA/cuopt-examples)
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NVIDIA/cuopt-examples/blob/cuopt_examples_launcher/cuopt_examples_launcher.ipynb)
+[![NVIDIA Launchable](https://img.shields.io/badge/NVIDIA-Launchable-76b900?logo=nvidia)](https://brev.nvidia.com/launchable/deploy?launchableID=env-2qIG6yjGKDtdMSjXHcuZX12mDNJ)
+[![Videos and Tutorials](https://img.shields.io/badge/Videos_and_Tutorials-red?logo=youtube)](https://docs.nvidia.com/cuopt/user-guide/latest/resources.html#cuopt-examples-and-tutorials-videos)



NVIDIA® cuOpt™ is a GPU-accelerated optimization engine that excels in mixed integer linear programming (MILP), linear programming (LP), and vehicle routing problems (VRP). It enables near real-time solutions for large-scale challenges with millions of variables and constraints, offering
easy integration into existing solvers and seamless deployment across hybrid and multi-cloud environments.
@@ -146,13 +155,3 @@ For current release timelines and dates, refer to the [RAPIDS Maintainers Docs](
## Contributing Guide

Review the [CONTRIBUTING.md](CONTRIBUTING.md) file for information on how to contribute code and issues to the project.

-## Resources
-
-- [libcuopt (C) documentation](https://docs.nvidia.com/cuopt/user-guide/latest/cuopt-c/index.html)
-- [cuopt (Python) documentation](https://docs.nvidia.com/cuopt/user-guide/latest/cuopt-python/index.html)
-- [cuopt (Server) documentation](https://docs.nvidia.com/cuopt/user-guide/latest/cuopt-server/index.html)
-- [Examples and Notebooks](https://github.com/NVIDIA/cuopt-examples)
-- [Test cuopt with NVIDIA Launchable](https://brev.nvidia.com/launchable/deploy?launchableID=env-2qIG6yjGKDtdMSjXHcuZX12mDNJ): Examples notebooks are pulled and hosted on [NVIDIA Launchable](https://docs.nvidia.com/brev/latest/).
-- [Test cuopt on Google Colab](https://colab.research.google.com/github/nvidia/cuopt-examples/): Examples notebooks can be opened in Google Colab. Please note that you need to choose a `Runtime` as `GPU` in order to run the notebooks.
-- [cuOpt Examples and Tutorial Videos](https://docs.nvidia.com/cuopt/user-guide/latest/resources.html#cuopt-examples-and-tutorials-videos)
5 changes: 4 additions & 1 deletion ci/release/update-version.sh
@@ -1,6 +1,6 @@
#!/bin/bash

-# SPDX-FileCopyrightText: Copyright (c) 2022-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2022-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0

## Usage
@@ -131,6 +131,9 @@ done
PROJECT_FILE="docs/cuopt/source/project.json"
sed_runner 's/\("version": "\)[0-9][0-9]\.[0-9][0-9]\.[0-9][0-9]"/\1'${NEXT_FULL_TAG}'"/g' "${PROJECT_FILE}"

+# Update README.md version badge
+sed_runner 's/badge\/version-[0-9]\+\.[0-9]\+\.[0-9]\+-blue/badge\/version-'${NEXT_FULL_TAG}'-blue/g' README.md

# Update nightly
sed_runner 's/'"cuopt_version: \"[0-9][0-9].[0-9][0-9]\""'/'"cuopt_version: \"${NEXT_SHORT_TAG}\""'/g' .github/workflows/nightly.yaml

28 changes: 14 additions & 14 deletions cpp/include/cuopt/linear_programming/utilities/cython_solve.hpp
@@ -1,6 +1,6 @@
/* clang-format off */
/*
- * SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2023-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: Apache-2.0
*/
/* clang-format on */
@@ -25,19 +25,19 @@ namespace cython {
// aggregate for call_solve() return type
// to be exposed to cython:
struct linear_programming_ret_t {
-  std::unique_ptr<rmm::device_buffer> primal_solution_;
-  std::unique_ptr<rmm::device_buffer> dual_solution_;
-  std::unique_ptr<rmm::device_buffer> reduced_cost_;
+  std::vector<double> primal_solution_;
+  std::vector<double> dual_solution_;
+  std::vector<double> reduced_cost_;
/* -- PDLP Warm Start Data -- */
-  std::unique_ptr<rmm::device_buffer> current_primal_solution_;
-  std::unique_ptr<rmm::device_buffer> current_dual_solution_;
-  std::unique_ptr<rmm::device_buffer> initial_primal_average_;
-  std::unique_ptr<rmm::device_buffer> initial_dual_average_;
-  std::unique_ptr<rmm::device_buffer> current_ATY_;
-  std::unique_ptr<rmm::device_buffer> sum_primal_solutions_;
-  std::unique_ptr<rmm::device_buffer> sum_dual_solutions_;
-  std::unique_ptr<rmm::device_buffer> last_restart_duality_gap_primal_solution_;
-  std::unique_ptr<rmm::device_buffer> last_restart_duality_gap_dual_solution_;
+  std::vector<double> current_primal_solution_;
+  std::vector<double> current_dual_solution_;
+  std::vector<double> initial_primal_average_;
+  std::vector<double> initial_dual_average_;
+  std::vector<double> current_ATY_;
+  std::vector<double> sum_primal_solutions_;
+  std::vector<double> sum_dual_solutions_;
+  std::vector<double> last_restart_duality_gap_primal_solution_;
+  std::vector<double> last_restart_duality_gap_dual_solution_;
double initial_primal_weight_;
double initial_step_size_;
int total_pdlp_iterations_;
@@ -64,7 +64,7 @@ struct linear_programming_ret_t {
};

struct mip_ret_t {
-  std::unique_ptr<rmm::device_buffer> solution_;
+  std::vector<double> solution_;

linear_programming::mip_termination_status_t termination_status_;
error_type_t error_status_;
41 changes: 18 additions & 23 deletions cpp/src/linear_programming/utilities/cython_solve.cu
@@ -1,6 +1,6 @@
/* clang-format off */
/*
- * SPDX-FileCopyrightText: Copyright (c) 2023-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ * SPDX-FileCopyrightText: Copyright (c) 2023-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
* SPDX-License-Identifier: Apache-2.0
*/
/* clang-format on */
@@ -142,28 +142,21 @@ linear_programming_ret_t call_solve_lp(
const bool use_pdlp_solver_mode = true;
auto solution = cuopt::linear_programming::solve_lp(
op_problem, solver_settings, problem_checking, use_pdlp_solver_mode, is_batch_mode);

+  // Convert device vectors to host vectors for LP solution
linear_programming_ret_t lp_ret{
-    std::make_unique<rmm::device_buffer>(solution.get_primal_solution().release()),
-    std::make_unique<rmm::device_buffer>(solution.get_dual_solution().release()),
-    std::make_unique<rmm::device_buffer>(solution.get_reduced_cost().release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().current_primal_solution_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().current_dual_solution_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().initial_primal_average_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().initial_dual_average_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().current_ATY_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().sum_primal_solutions_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().sum_dual_solutions_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().last_restart_duality_gap_primal_solution_.release()),
-    std::make_unique<rmm::device_buffer>(
-      solution.get_pdlp_warm_start_data().last_restart_duality_gap_dual_solution_.release()),
+    cuopt::host_copy(solution.get_primal_solution()),
+    cuopt::host_copy(solution.get_dual_solution()),
+    cuopt::host_copy(solution.get_reduced_cost()),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().current_primal_solution_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().current_dual_solution_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().initial_primal_average_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().initial_dual_average_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().current_ATY_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().sum_primal_solutions_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().sum_dual_solutions_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().last_restart_duality_gap_primal_solution_),
+    cuopt::host_copy(solution.get_pdlp_warm_start_data().last_restart_duality_gap_dual_solution_),
Comment on lines +148 to +159

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find host_copy implementation
find . -name "*.h" -o -name "*.hpp" -o -name "*.cu" -o -name "*.cpp" | head -20
echo "---"
# Search for host_copy definition
rg -nP 'inline.*host_copy|^[[:space:]]*host_copy.*{' --type=cpp --type=h -A 10 | head -50

Repository: NVIDIA/cuopt

Length of output: 2174


🏁 Script executed:

#!/bin/bash
# Get full context of the file around lines 148-159
wc -l cpp/src/linear_programming/utilities/cython_solve.cu

Repository: NVIDIA/cuopt

Length of output: 112


🏁 Script executed:

#!/bin/bash
# Show context around lines 148-159
sed -n '130,175p' cpp/src/linear_programming/utilities/cython_solve.cu

Repository: NVIDIA/cuopt

Length of output: 2788


🏁 Script executed:

#!/bin/bash
# Search for host_copy in include files
find cpp/include -name "*.hpp" -o -name "*.h" | xargs grep -l "host_copy"

Repository: NVIDIA/cuopt

Length of output: 40


🏁 Script executed:

#!/bin/bash
# Check for CUDA error checking patterns used in the file
rg -n 'CUDA_CHECK|RAFT_CUDA_TRY|cudaGetLastError|RAFT_CHECK_CUDA|host_copy' cpp/src/linear_programming/utilities/cython_solve.cu | head -30

Repository: NVIDIA/cuopt

Length of output: 1252


🏁 Script executed:

#!/bin/bash
# Get full host_copy implementation
cat -n cpp/src/utilities/copy_helpers.hpp | head -200

Repository: NVIDIA/cuopt

Length of output: 6185


🏁 Script executed:

#!/bin/bash
# Check if host_copy uses error checking internally or is wrapped with error checks
rg -n 'host_copy' cpp/src/utilities/copy_helpers.hpp -B 5 -A 15 | head -100

Repository: NVIDIA/cuopt

Length of output: 3206


🏁 Script executed:

#!/bin/bash
# Look for the default host_copy overload (non-bool version)
rg -n 'template.*host_copy|^[[:space:]]*host_copy.*\{' cpp/src/utilities/copy_helpers.hpp -A 15

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

#!/bin/bash
# Check if there's error handling around the solve call
sed -n '140,180p' cpp/src/linear_programming/utilities/cython_solve.cu

Repository: NVIDIA/cuopt

Length of output: 2594


🏁 Script executed:

#!/bin/bash
# Check for async/batched copy patterns in similar code
rg -n 'cudaMemcpyAsync|rmm::device_uvector|thrust::copy' cpp/src/linear_programming/utilities/ --type=cu --type=cpp -C 3 | head -60

Repository: NVIDIA/cuopt

Length of output: 82


🏁 Script executed:

#!/bin/bash
# Check if raft::copy has error checking
rg -n 'raft::copy' cpp/include/cuopt/ --type=cpp --type=h | head -5

Repository: NVIDIA/cuopt

Length of output: 38


🏁 Script executed:

#!/bin/bash
# Verify if this is end-of-solve (not hot path) - check what comes after line 180
sed -n '160,180p' cpp/src/linear_programming/utilities/cython_solve.cu

Repository: NVIDIA/cuopt

Length of output: 1281


🏁 Script executed:

#!/bin/bash
# Check if raft library provides CUDA error checking
find . -type f -name "*.hpp" -o -name "*.h" | xargs grep -l "raft::copy" | head -3

Repository: NVIDIA/cuopt

Length of output: 166


🏁 Script executed:

#!/bin/bash
# Check if there are any CUDA_CHECK or error handlers wrapping these calls
sed -n '130,185p' cpp/src/linear_programming/utilities/cython_solve.cu | grep -i 'cuda\|check\|try\|error'

Repository: NVIDIA/cuopt

Length of output: 311


🏁 Script executed:

#!/bin/bash
# Verify the exact context - is this function expected to return immediately after these copies?
sed -n '175,200p' cpp/src/linear_programming/utilities/cython_solve.cu

Repository: NVIDIA/cuopt

Length of output: 1177


Consider batching device-to-host transfers to reduce redundant synchronizations.

The host_copy function includes an implicit stream_view.synchronize() (line 123 of copy_helpers.hpp). With 12 sequential host_copy calls here, this creates 12 synchronization barriers—one per call—instead of a single barrier. While raft::copy provides built-in CUDA error checking, consolidating the copies into a batched operation with one synchronization at the end would be more efficient:

// Instead of 12 separate host_copy calls with 12 syncs each,
// batch the raft::copy calls first, then synchronize once
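
A hedged sketch of that batched pattern, using the `raft::copy(dst, src, len, stream)` overload located by the scripts above; `stream` stands in for whatever stream the solution's device buffers use:

```cpp
// Sketch only: enqueue every device-to-host copy, then synchronize once.
auto to_host = [&](auto const& d_vec) {
  std::vector<double> h_vec(d_vec.size());
  raft::copy(h_vec.data(), d_vec.data(), d_vec.size(), stream);  // async copy
  return h_vec;  // the moved-out vector keeps the same heap buffer
};
auto h_primal  = to_host(solution.get_primal_solution());
auto h_dual    = to_host(solution.get_dual_solution());
auto h_reduced = to_host(solution.get_reduced_cost());
// ...likewise for the nine warm-start vectors...
stream.synchronize();  // a single barrier covers all twelve transfers
```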
🤖 Prompt for AI Agents
In @cpp/src/linear_programming/utilities/cython_solve.cu around lines 148-159,
multiple sequential calls to cuopt::host_copy (which invokes
stream_view.synchronize() per call) cause repeated synchronization barriers;
instead, aggregate the device->host transfers for the PDLP warm start fields and
other solution arrays by issuing raft::copy (or equivalent device->host memcpy)
into host buffers without synchronizing each time, then call a single
stream/synchronization once after all copies; update the call sites around the
list of cuopt::host_copy(...) for solution.get_primal_solution(),
get_dual_solution(), get_reduced_cost(), and all
solution.get_pdlp_warm_start_data() members to use batched copies and one final
synchronize (referencing host_copy, cuopt::host_copy, raft::copy, and
get_pdlp_warm_start_data() to locate the code).

solution.get_pdlp_warm_start_data().initial_primal_weight_,
Comment on lines +151 to 160

⚠️ Potential issue | 🟠 Major

Skip warm-start host copies when we’re in batch mode.

With this change every PDLP warm-start vector is eagerly host_copy’d, even when is_batch_mode is true. Batch solves never consume those fields (see create_solution(..., is_batch=True)), so we now pay multiple device→host transfers per instance for no benefit. For large LPs that’s a significant regression compared to the old device_buffer path, where batch executions left the data on device.

Wrap these initializers so the copies only occur when !is_batch_mode, e.g.:

@@
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().current_primal_solution_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().current_dual_solution_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().initial_primal_average_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().initial_dual_average_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().current_ATY_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().sum_primal_solutions_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().sum_dual_solutions_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().last_restart_duality_gap_primal_solution_),
-    cuopt::host_copy(solution.get_pdlp_warm_start_data().last_restart_duality_gap_dual_solution_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().current_primal_solution_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().current_dual_solution_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().initial_primal_average_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().initial_dual_average_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().current_ATY_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().sum_primal_solutions_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().sum_dual_solutions_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().last_restart_duality_gap_primal_solution_),
+    is_batch_mode ? std::vector<double>{}
+                  : cuopt::host_copy(solution.get_pdlp_warm_start_data().last_restart_duality_gap_dual_solution_),

(or compute the vectors above the initializer and reuse them). That preserves existing semantics while avoiding unnecessary transfers in the batch path.
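
For example, the hoisted variant could look roughly like this (`warm` is a hypothetical helper and the field list is abbreviated):

```cpp
// Sketch only: compute the conditional copies once, outside the initializer.
auto const& ws = solution.get_pdlp_warm_start_data();
auto warm = [&](auto const& d_vec) {
  return is_batch_mode ? std::vector<double>{} : cuopt::host_copy(d_vec);
};
linear_programming_ret_t lp_ret{
  cuopt::host_copy(solution.get_primal_solution()),
  cuopt::host_copy(solution.get_dual_solution()),
  cuopt::host_copy(solution.get_reduced_cost()),
  warm(ws.current_primal_solution_),
  warm(ws.current_dual_solution_),
  // ...remaining warm-start vectors via warm(...)...
  ws.initial_primal_weight_,
  // ...remaining scalar fields unchanged...
};
```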

🤖 Prompt for AI Agents
In cpp/src/linear_programming/utilities/cython_solve.cu around lines 140 to 149,
the PDLP warm-start vectors are unconditionally copied to host causing
unnecessary device→host transfers in batch mode; change the initializer to only
perform cuopt::host_copy calls when !is_batch_mode (or compute the host_copy
results into local variables above the initializer and reuse them), so that when
is_batch_mode is true the original device_buffer path/data is preserved and no
host copies are performed; ensure all referenced warm-start fields are
conditionally set so semantics for non-batch solves remain unchanged.

solution.get_pdlp_warm_start_data().initial_step_size_,
solution.get_pdlp_warm_start_data().total_pdlp_iterations_,
@@ -205,7 +198,9 @@ mip_ret_t call_solve_mip(
error_type_t::ValidationError,
"MIP solve cannot be called on an LP problem!");
auto solution = cuopt::linear_programming::solve_mip(op_problem, solver_settings);
-  mip_ret_t mip_ret{std::make_unique<rmm::device_buffer>(solution.get_solution().release()),
+
+  // Convert device vector to host vector for MILP solution
+  mip_ret_t mip_ret{cuopt::host_copy(solution.get_solution()),
solution.get_termination_status(),
solution.get_error_status().get_error_type(),
solution.get_error_status().what(),
@@ -1,4 +1,4 @@
-# SPDX-FileCopyrightText: Copyright (c) 2022-2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. # noqa
+# SPDX-FileCopyrightText: Copyright (c) 2022-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. # noqa
# SPDX-License-Identifier: Apache-2.0

# cython: profile=False
@@ -26,6 +26,8 @@ from cuopt.utilities import series_from_buf

import pyarrow as pa

+import pyarrow as pa


cdef class WaypointMatrix:

Expand Down