Add support for FP32 and mixed precision in PDLP (#910)
Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
📝 Walkthrough
Adds FP32 PDLP support and an optional mixed-precision SpMV path. Core PDLP/PDHG and utility components are templated over a floating type f_t, new CLI flags control precision and mixed-precision SpMV, FP32 matrix copies and SpMV helpers are added, and many template instantiation guards were extended to enable float builds.
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 7
🧹 Nitpick comments (6)
docs/cuopt/source/lp-qp-milp-settings.rst (1)
201-207: Prefer an explicit label for the Mixed Precision SpMV cross-reference.
Using a named anchor avoids fragile section-title references and keeps links stable if headings change.
📚 Suggested doc tweak
```diff
-For an alternative that maintains FP64 accuracy while improving performance, see :ref:`Mixed Precision SpMV`.
+For an alternative that maintains FP64 accuracy while improving performance, see :ref:`mixed-precision-spmv`.

+.. _mixed-precision-spmv:
+
 Mixed Precision SpMV
 ^^^^^^^^^^^^^^^^^^^^
```

As per coding guidelines for `docs/**/*`, documentation should prioritize consistency and clarity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/cuopt/source/lp-qp-milp-settings.rst` around lines 201 - 207, Add a stable named anchor for the "Mixed Precision SpMV" section and use that label in the earlier cross-reference: insert a label line like ".. _mixed-precision-spmv:" immediately before the "Mixed Precision SpMV" heading, and replace the existing inline reference "see :ref:`Mixed Precision SpMV`" with "see :ref:`mixed-precision-spmv`" (or another consistent label name) so the link remains stable if the heading text changes.

cpp/src/mip_heuristics/presolve/third_party_presolve.cpp (1)
33-46: Avoid full extra copies in same-type vector conversion.
`convert_vector` takes `const&`, so when `f_t == double`, the `double -> double` path copies very large vectors (Ax, bounds, objective) unnecessarily before PSLP calls.

Proposed refactor
```diff
-template <typename To, typename From>
-std::vector<To> convert_vector(const std::vector<From>& src)
+template <typename To, typename From>
+std::vector<To> convert_vector(std::vector<From> src)
 {
   if constexpr (std::is_same_v<To, From>) {
-    return src;  // No conversion needed
+    return std::move(src);  // No conversion needed
   } else {
     std::vector<To> dst(src.size());
     for (size_t i = 0; i < src.size(); ++i) {
       dst[i] = static_cast<To>(src[i]);
     }
     return dst;
   }
 }
@@
-std::vector<double> h_coefficients = convert_vector<double>(h_coefficients_ft);
-std::vector<double> h_obj_coeffs   = convert_vector<double>(h_obj_coeffs_ft);
-std::vector<double> h_var_lb       = convert_vector<double>(h_var_lb_ft);
-std::vector<double> h_var_ub       = convert_vector<double>(h_var_ub_ft);
-std::vector<double> h_constr_lb    = convert_vector<double>(h_constr_lb_ft);
-std::vector<double> h_constr_ub    = convert_vector<double>(h_constr_ub_ft);
+std::vector<double> h_coefficients = convert_vector<double>(std::move(h_coefficients_ft));
+std::vector<double> h_obj_coeffs   = convert_vector<double>(std::move(h_obj_coeffs_ft));
+std::vector<double> h_var_lb       = convert_vector<double>(std::move(h_var_lb_ft));
+std::vector<double> h_var_ub       = convert_vector<double>(std::move(h_var_ub_ft));
+std::vector<double> h_constr_lb    = convert_vector<double>(std::move(h_constr_lb_ft));
+std::vector<double> h_constr_ub    = convert_vector<double>(std::move(h_constr_ub_ft));
```

As per coding guidelines "Verify correct problem size checks before expensive GPU/CPU operations; prevent resource exhaustion on oversized problems".
Also applies to: 295-300
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/presolve/third_party_presolve.cpp` around lines 33 - 46, The current template convert_vector<To,From>(const std::vector<From>&) unconditionally returns a std::vector<To>, causing a full copy even when To==From; fix this by adding an overload for the identical-type case: implement template<typename T> const std::vector<T>& convert_vector(const std::vector<T>& src) { return src; } and keep the existing two-type template for actual conversions (std::vector<To> convert_vector(const std::vector<From>&) with static_cast loop); update usages (e.g., where convert_vector is called for Ax, bounds, objective and the other occurrence mentioned) to accept/handle a const reference when available to avoid unnecessary large-vector copies.

benchmarks/linear_programming/cuopt/run_pdlp.cu (1)
193-198: Add fail-fast validation for unsupported flag combinations.
`--pdlp-fp32`/`--mixed-precision-spmv` constraints are documented, but incompatible combinations are only surfaced later by solver validation. A small upfront check in `main` would return clearer CLI errors earlier and avoid unnecessary setup/parse work.

💡 Suggested guard in `main`

```diff
 bool use_fp32 = program.get<bool>("--pdlp-fp32");
+const int method             = program.get<int>("--method");
+const bool crossover_enabled = program.get<int>("--crossover") != 0;
+const bool mixed_spmv        = program.get<bool>("--mixed-precision-spmv");
+
+if (use_fp32 && (method != static_cast<int>(cuopt::linear_programming::method_t::PDLP) ||
+                 crossover_enabled)) {
+  std::cerr << "--pdlp-fp32 is only supported for PDLP method without crossover.\n";
+  return 1;
+}
+if (use_fp32 && mixed_spmv) {
+  std::cerr << "--mixed-precision-spmv has no effect in FP32 mode.\n";
+}
 if (use_fp32) {
   return run_solver<float>(program, handle_);
 } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmarks/linear_programming/cuopt/run_pdlp.cu` around lines 193 - 198, The CLI currently defers reporting incompatible flag combinations until solver validation; add an early fail-fast check in main: read the boolean for "--pdlp-fp32" (use_fp32) and the boolean for "--mixed-precision-spmv" from program (same API used for program.get<bool>(...)) before calling run_solver<float>/run_solver<double>, and if the combination is unsupported (e.g. use_fp32 && mixed-precision-spmv true) print a clear error to stderr and return a non-zero exit code to abort setup immediately; place this guard just before the existing branch that calls run_solver to avoid unnecessary parsing/setup work.

cpp/tests/linear_programming/pdlp_test.cu (1)
1930-2055: Add one FP32 test with `mixed_precision_spmv=true` to verify no-op behavior.
The PR contract says mixed precision SpMV has no effect in FP32 mode. Adding that explicit case would lock in this behavior.
As per coding guidelines,
`**/*test*.{cpp,cu,py}`: Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 1930 - 2055, Add a new FP32 unit test mirroring the existing float32_papilo_presolve_works/run_float32 pattern that sets solver_settings.mixed_precision_spmv = true and verifies numerical correctness (expect CUOPT_TERIMINATION_STATUS_OPTIMAL and that solution.get_additional_termination_information().primal_objective matches afiro_primal_objective_f32); locate and modify the pdlp_test.cu tests around TEST(pdlp_class, run_float32) or TEST(pdlp_class, float32_papilo_presolve_works) to add TEST(pdlp_class, float32_mixed_precision_spmv_noop) which constructs op_problem via cuopt::mps_parser::parse_mps<int,float>, sets solver_settings.method = cuopt::linear_programming::method_t::PDLP and solver_settings.mixed_precision_spmv = true, calls solve_lp(&handle_, op_problem, solver_settings), and asserts both termination status is CUOPT_TERIMINATION_STATUS_OPTIMAL and the primal objective equals afiro_primal_objective_f32 to guarantee the mixed-precision flag is a no-op for FP32.

cpp/src/pdlp/cusparse_view.cu (1)
1091: Consider removing unnecessary synchronization in initialization setup.
Line 1091 synchronizes after mixed precision matrix transforms, but no CPU-side code depends on the transformed data before the solve loop. Downstream initialization functions (`compute_initial_step_size`, `compute_initial_primal_weight`) use scaled coefficients, not the float matrices. The mixed precision matrices are consumed only later in GPU kernels on the same stream, where stream ordering ensures correctness without explicit host-blocking synchronization.
Note: This is initialization code, not a hot solver iteration path, so the performance impact is minimal. However, the synchronization appears redundant, since stream ordering already guarantees correctness.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/cusparse_view.cu` at line 1091, Remove the blocking host synchronization call handle_ptr_->get_stream().synchronize() after creating the mixed-precision matrices; instead rely on CUDA stream ordering so GPU kernels that consume the mixed-precision data run on the same stream without host-side synchronize. Locate the call in cusparse_view.cu (the initialization path where mixed-precision transforms are performed) and delete the synchronize() invocation; keep the rest of the initialization (including compute_initial_step_size and compute_initial_primal_weight) unchanged since they only use scaled coefficients on the host and do not require the float matrices to be host-visible before the solve kernels execute.

cpp/src/pdlp/pdlp.cu (1)
2974-2996: Consider adding a clarifying comment for the asymmetric instantiation macros.
The `PDLP_INSTANTIATE_FLOAT` vs `MIP_INSTANTIATE_DOUBLE` asymmetry is intentional (FP32 is PDLP-specific, while MIP always uses FP64 and relies on the PDLP double instantiation). A brief comment would help future maintainers understand this design decision.

📝 Suggested documentation
```diff
+// PDLP supports both FP32 and FP64 precision modes.
+// Float instantiation is controlled by PDLP_INSTANTIATE_FLOAT (PDLP-specific).
+// Double instantiation uses MIP_INSTANTIATE_DOUBLE since MIP depends on PDLP<double>.
 #if PDLP_INSTANTIATE_FLOAT
 template class pdlp_solver_t<int, float>;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/pdlp.cu` around lines 2974 - 2996, Add a short clarifying comment above the asymmetric instantiation blocks explaining why PDLP_INSTANTIATE_FLOAT and MIP_INSTANTIATE_DOUBLE differ: FP32 instantiation is PDLP-specific while MIP always uses FP64 and depends on PDLP double instantiation; place the comment immediately above the two preprocessor blocks that instantiate pdlp_solver_t and the compute_weights_initial_primal_weight_from_squared_norms function templates so future maintainers see the rationale for the asymmetry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/math_optimization/solver_settings.cu`:
- Around line 74-75: The FP32 default tolerances
(CUOPT_PRIMAL_INFEASIBLE_TOLERANCE, CUOPT_DUAL_INFEASIBLE_TOLERANCE and
CUOPT_MIP_ABSOLUTE_GAP) use f_t(1e-10) which is below float epsilon and may
underflow to denorm/zero; update the default construction to clamp to the type's
machine epsilon (e.g., replace raw f_t(1e-10) with std::max(f_t(1e-10),
std::numeric_limits<f_t>::epsilon()) or choose separate defaults for float vs
double) so pdlp_settings.tolerances.primal_infeasible_tolerance and
dual_infeasible_tolerance (and the MIP absolute gap default) are representable
in FP32.
In `@cpp/src/mip_heuristics/local_search/rounding/simple_rounding.cu`:
- Around line 182-184: The float instantiation guard in simple_rounding.cu is
using PDLP_INSTANTIATE_FLOAT which is inconsistent with the double guard
(MIP_INSTANTIATE_DOUBLE) and other MIP files; update the preprocessor check
around the INSTANTIATE(float) call to use MIP_INSTANTIATE_FLOAT to match the
file's MIP-specific instantiation pattern (or, only if this code truly needs to
be shared with PDLP, use the combined guard MIP_INSTANTIATE_FLOAT ||
PDLP_INSTANTIATE_FLOAT) so that INSTANTIATE(float) follows the same macro
convention as INSTANTIATE(double) and other MIP files.
In `@cpp/src/pdlp/cusparse_view.cu`:
- Around line 599-667: The code unconditionally allocates mixed-precision
buffers and creates mixed cuSPARSE descriptors (A_float_, A_T_float_, A_mixed_,
A_T_mixed_, buffer_non_transpose_mixed_, buffer_transpose_mixed_, and calls to
mixed_precision_spmv_buffersize) whenever enable_mixed_precision_spmv is true,
which wastes memory for batch paths that use SpMM on the FP64 descriptors; wrap
the entire mixed-precision allocation block so it only runs when mixed precision
is enabled AND the current execution will use SpMV (e.g., !is_batch_mode or a
flag like use_spmv) — move the allocations, cub::DeviceTransform calls,
cusparseCreateCsr calls, and buffer_size computations inside that guard and skip
them in batch/SpMM execution paths to avoid unnecessary memory use.
- Around line 606-616: The two unwrapped calls to
cub::DeviceTransform::Transform that convert doubles to floats (the calls using
double_to_float_functor with sources op_problem_scaled.coefficients and A_T_.
and destinations A_float_ and A_T_float_ respectively, using
handle_ptr->get_stream().value()) must be wrapped with RAFT_CUDA_TRY to surface
CUDA errors; locate the Transform invocations and wrap each call with
RAFT_CUDA_TRY( ... ) so any returned cudaError_t is checked, and apply the same
change for the identical pair of Transform calls found later in the file.
In `@cpp/src/pdlp/cusparse_view.hpp`:
- Around line 201-207: Replace the raw cusparseSpMatDescr_t handles A_mixed_ and
A_T_mixed_ with the existing RAII wrapper type cusparse_sp_mat_descr_wrapper_t
(i.e., change their declarations to use cusparse_sp_mat_descr_wrapper_t instead
of cusparseSpMatDescr_t), update the code paths that call cusparseCreateCsr to
assign into these wrappers, and remove any manual calls to cusparseDestroySpMat
(relying on the wrapper destructor). Ensure all references to A_mixed_,
A_T_mixed_, cusparseCreateCsr and mixed_precision_enabled_ compile against the
wrapper API so the descriptors are automatically destroyed.
In `@cpp/src/pdlp/pdhg.hpp`:
- Around line 32-33: The constructor signature in pdhg.hpp currently defaults
enable_mixed_precision_spmv to true, which can silently change behavior; change
the default to false at the constructor boundary by updating the parameter
default from enable_mixed_precision_spmv = true to enable_mixed_precision_spmv =
false in the PDHG (pdhg) constructor declaration so callers that omit this
argument keep the documented default-off behavior.
In `@cpp/src/pdlp/solve.cu`:
- Around line 607-612: The FP32+crossover unsupported check currently inside the
template block (f_t) runs too late; move the validation into the beginning of
run_pdlp (or the earlier dispatch function that selects the f_t instantiation)
so invalid configurations fail fast. Specifically, add a guard that mirrors the
existing cuopt_expects(!settings.crossover, ...) check at the top of run_pdlp
(or immediately before selecting/instantiating the PDLP template) to throw a
ValidationError when std::is_same_v<f_t, float> (or when dispatch would pick
f_t=float) and settings.crossover==true; reference run_pdlp, the dispatch call
site for f_t, and the existing cuopt_expects check to locate and duplicate the
logic early in the call path.
---
Nitpick comments:
In `@benchmarks/linear_programming/cuopt/run_pdlp.cu`:
- Around line 193-198: The CLI currently defers reporting incompatible flag
combinations until solver validation; add an early fail-fast check in main: read
the boolean for "--pdlp-fp32" (use_fp32) and the boolean for
"--mixed-precision-spmv" from program (same API used for program.get<bool>(...))
before calling run_solver<float>/run_solver<double>, and if the combination is
unsupported (e.g. use_fp32 && mixed-precision-spmv true) print a clear error to
stderr and return a non-zero exit code to abort setup immediately; place this
guard just before the existing branch that calls run_solver to avoid unnecessary
parsing/setup work.
In `@cpp/src/mip_heuristics/presolve/third_party_presolve.cpp`:
- Around line 33-46: The current template convert_vector<To,From>(const
std::vector<From>&) unconditionally returns a std::vector<To>, causing a full
copy even when To==From; fix this by adding an overload for the identical-type
case: implement template<typename T> const std::vector<T>& convert_vector(const
std::vector<T>& src) { return src; } and keep the existing two-type template for
actual conversions (std::vector<To> convert_vector(const std::vector<From>&)
with static_cast loop); update usages (e.g., where convert_vector is called for
Ax, bounds, objective and the other occurrence mentioned) to accept/handle a
const reference when available to avoid unnecessary large-vector copies.
In `@cpp/src/pdlp/cusparse_view.cu`:
- Line 1091: Remove the blocking host synchronization call
handle_ptr_->get_stream().synchronize() after creating the mixed-precision
matrices; instead rely on CUDA stream ordering so GPU kernels that consume the
mixed-precision data run on the same stream without host-side synchronize.
Locate the call in cusparse_view.cu (the initialization path where
mixed-precision transforms are performed) and delete the synchronize()
invocation; keep the rest of the initialization (including
compute_initial_step_size and compute_initial_primal_weight) unchanged since
they only use scaled coefficients on the host and do not require the float
matrices to be host-visible before the solve kernels execute.
In `@cpp/src/pdlp/pdlp.cu`:
- Around line 2974-2996: Add a short clarifying comment above the asymmetric
instantiation blocks explaining why PDLP_INSTANTIATE_FLOAT and
MIP_INSTANTIATE_DOUBLE differ: FP32 instantiation is PDLP-specific while MIP
always uses FP64 and depends on PDLP double instantiation; place the comment
immediately above the two preprocessor blocks that instantiate pdlp_solver_t and
the compute_weights_initial_primal_weight_from_squared_norms function templates
so future maintainers see the rationale for the asymmetry.
In `@cpp/tests/linear_programming/pdlp_test.cu`:
- Around line 1930-2055: Add a new FP32 unit test mirroring the existing
float32_papilo_presolve_works/run_float32 pattern that sets
solver_settings.mixed_precision_spmv = true and verifies numerical correctness
(expect CUOPT_TERIMINATION_STATUS_OPTIMAL and that
solution.get_additional_termination_information().primal_objective matches
afiro_primal_objective_f32); locate and modify the pdlp_test.cu tests around
TEST(pdlp_class, run_float32) or TEST(pdlp_class, float32_papilo_presolve_works)
to add TEST(pdlp_class, float32_mixed_precision_spmv_noop) which constructs
op_problem via cuopt::mps_parser::parse_mps<int,float>, sets
solver_settings.method = cuopt::linear_programming::method_t::PDLP and
solver_settings.mixed_precision_spmv = true, calls solve_lp(&handle_,
op_problem, solver_settings), and asserts both termination status is
CUOPT_TERIMINATION_STATUS_OPTIMAL and the primal objective equals
afiro_primal_objective_f32 to guarantee the mixed-precision flag is a no-op for
FP32.
In `@docs/cuopt/source/lp-qp-milp-settings.rst`:
- Around line 201-207: Add a stable named anchor for the "Mixed Precision SpMV"
section and use that label in the earlier cross-reference: insert a label line
like ".. _mixed-precision-spmv:" immediately before the "Mixed Precision SpMV"
heading, and replace the existing inline reference "see :ref:`Mixed Precision
SpMV`" with "see :ref:`mixed-precision-spmv`" (or another consistent label name)
so the link remains stable if the heading text changes.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (42)
benchmarks/linear_programming/cuopt/run_pdlp.cu
cpp/include/cuopt/linear_programming/pdlp/solver_settings.hpp
cpp/src/dual_simplex/sparse_matrix.cpp
cpp/src/math_optimization/solution_writer.cu
cpp/src/math_optimization/solution_writer.hpp
cpp/src/math_optimization/solver_settings.cu
cpp/src/mip_heuristics/diversity/lns/rins.cu
cpp/src/mip_heuristics/local_search/rounding/simple_rounding.cu
cpp/src/mip_heuristics/mip_constants.hpp
cpp/src/mip_heuristics/presolve/gf2_presolve.cpp
cpp/src/mip_heuristics/presolve/third_party_presolve.cpp
cpp/src/mip_heuristics/problem/presolve_data.cu
cpp/src/mip_heuristics/problem/problem.cu
cpp/src/mip_heuristics/solution/solution.cu
cpp/src/mip_heuristics/solver_solution.cu
cpp/src/pdlp/cpu_pdlp_warm_start_data.cu
cpp/src/pdlp/cusparse_view.cu
cpp/src/pdlp/cusparse_view.hpp
cpp/src/pdlp/initial_scaling_strategy/initial_scaling.cu
cpp/src/pdlp/optimal_batch_size_handler/optimal_batch_size_handler.cu
cpp/src/pdlp/optimization_problem.cu
cpp/src/pdlp/pdhg.cu
cpp/src/pdlp/pdhg.hpp
cpp/src/pdlp/pdlp.cu
cpp/src/pdlp/pdlp_warm_start_data.cu
cpp/src/pdlp/restart_strategy/localized_duality_gap_container.cu
cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
cpp/src/pdlp/restart_strategy/weighted_average_solution.cu
cpp/src/pdlp/saddle_point.cu
cpp/src/pdlp/solution_conversion.cu
cpp/src/pdlp/solve.cu
cpp/src/pdlp/solver_settings.cu
cpp/src/pdlp/solver_solution.cu
cpp/src/pdlp/step_size_strategy/adaptive_step_size_strategy.cu
cpp/src/pdlp/termination_strategy/convergence_information.cu
cpp/src/pdlp/termination_strategy/infeasibility_information.cu
cpp/src/pdlp/termination_strategy/termination_strategy.cu
cpp/src/pdlp/translate.hpp
cpp/src/pdlp/utilities/problem_checking.cu
cpp/tests/linear_programming/pdlp_test.cu
docs/cuopt/source/lp-qp-features.rst
docs/cuopt/source/lp-qp-milp-settings.rst
cpp/src/mip_heuristics/local_search/rounding/simple_rounding.cu: outdated review comment (resolved)
/ok to test 366fd6a
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/tests/linear_programming/pdlp_test.cu (1)
54-59: ⚠️ Potential issue | 🟠 Major
Add NaN/Inf guard to `is_incorrect_objective` to prevent masked solver regressions.
The function accepts NaN values silently. When `objective` is NaN, `std::abs(objective)` returns NaN, and the comparisons evaluate to false, causing the helper to incorrectly report the objective as valid. This can mask solver regressions. The codebase already uses `std::isfinite` extensively in solver components (pdlp, feasibility_jump, barrier, etc.) for this exact purpose.

Suggested fix
```diff
 template <typename f_t>
 static bool is_incorrect_objective(f_t reference, f_t objective)
 {
+  if (!std::isfinite(reference) || !std::isfinite(objective)) { return true; }
   if (reference == 0) { return std::abs(objective) > 0.01; }
   if (objective == 0) { return std::abs(reference) > 0.01; }
   return std::abs((reference - objective) / reference) > 0.01;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 54 - 59, The helper is_incorrect_objective must treat NaN/Inf as incorrect: at the start of is_incorrect_objective(f_t reference, f_t objective) check std::isfinite(reference) and std::isfinite(objective) and return true if either is not finite, then proceed with the existing zero checks and relative-difference logic; update includes/usings if necessary to ensure std::isfinite is available and referenced.
🧹 Nitpick comments (4)
cpp/src/pdlp/cpu_pdlp_warm_start_data.cu (1)
16-39: Consider batching synchronization for better performance (optional).
Each helper function synchronizes after its copy operation. When `convert_to_cpu_warmstart` or `convert_to_gpu_warmstart` copies 9 vector fields, this results in 9 separate synchronization points.
For warm start data that isn't in a hot path, this is acceptable. However, if performance becomes a concern, consider refactoring to batch all copies and synchronize once at the end of the conversion functions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/cpu_pdlp_warm_start_data.cu` around lines 16 - 39, The helpers device_to_host_vector and host_to_device_vector currently call stream.synchronize() after each raft::copy, causing multiple sync points when convert_to_cpu_warmstart/convert_to_gpu_warmstart copy many fields; change the helpers to avoid per-copy synchronization by either (a) removing the stream.synchronize() calls and documenting that the caller must synchronize the stream once after batching copies, or (b) adding an optional parameter (e.g. bool do_sync = true) so callers can disable per-copy sync and perform a single stream.synchronize() at the end of convert_to_cpu_warmstart/convert_to_gpu_warmstart; update those convert_* functions to batch the copies and call stream.synchronize() once.

cpp/tests/linear_programming/pdlp_test.cu (1)
1931-2054: Add an explicit FP32 “mixed_precision_spmv is a no-op” test.
PR behavior states mixed-precision SpMV has no effect in FP32 mode, but this block doesn’t directly assert that contract yet.
Suggested test addition
```diff
+TEST(pdlp_class, float32_mixed_precision_spmv_no_effect)
+{
+  const raft::handle_t handle_{};
+  auto path = make_path_absolute("linear_programming/afiro_original.mps");
+  cuopt::mps_parser::mps_data_model_t<int, float> op_problem =
+    cuopt::mps_parser::parse_mps<int, float>(path, true);
+
+  auto settings_base                 = pdlp_solver_settings_t<int, float>{};
+  settings_base.method               = cuopt::linear_programming::method_t::PDLP;
+  settings_base.mixed_precision_spmv = false;
+
+  auto settings_mixed                 = settings_base;
+  settings_mixed.mixed_precision_spmv = true;
+
+  auto solution_base  = solve_lp(&handle_, op_problem, settings_base);
+  auto solution_mixed = solve_lp(&handle_, op_problem, settings_mixed);
+
+  EXPECT_EQ((int)solution_base.get_termination_status(), CUOPT_TERIMINATION_STATUS_OPTIMAL);
+  EXPECT_EQ((int)solution_mixed.get_termination_status(), CUOPT_TERIMINATION_STATUS_OPTIMAL);
+  EXPECT_NEAR(solution_base.get_additional_termination_information().primal_objective,
+              solution_mixed.get_additional_termination_information().primal_objective,
+              1e-2f);
+}
```

As per coding guidelines:
`**/*test*.{cpp,cu,py}`: Write tests validating numerical correctness of optimization results (not just 'runs without error').

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/linear_programming/pdlp_test.cu` around lines 1931 - 2054, Add a test that explicitly verifies mixed-precision SpMV is a no-op in FP32: create a pdlp_solver_settings_t<int,float> (same as in run_float32) set method = cuopt::linear_programming::method_t::PDLP and set solver_settings.mixed_precision_spmv = true, call solve_lp(&handle_, op_problem, solver_settings), then assert termination status equals CUOPT_TERIMINATION_STATUS_OPTIMAL and that solution.get_additional_termination_information().primal_objective matches afiro_primal_objective_f32 (use is_incorrect_objective to check numerical equality), mirroring the checks in the existing run_float32/papilo/pslp tests so the behavior is explicitly tested.

cpp/src/pdlp/pdhg.cu (1)
254-280: Consolidate repeated mixed-vs-standard SpMV dispatch into one helper.
The same precision-dispatch branch is repeated in three methods. A shared helper would reduce drift risk and keep future algorithm/descriptor changes in one place.
♻️ Suggested refactor sketch
```diff
+template <typename i_t, typename f_t>
+template <typename MatStd, typename MatMixed, typename VecX, typename VecY,
+          typename BufStd, typename BufMixed>
+inline void pdhg_solver_t<i_t, f_t>::run_spmv_dispatch(MatStd mat_std,
+                                                       MatMixed mat_mixed,
+                                                       VecX vec_x,
+                                                       VecY vec_y,
+                                                       BufStd buf_std,
+                                                       BufMixed buf_mixed)
+{
+  if constexpr (std::is_same_v<f_t, double>) {
+    if (cusparse_view_.mixed_precision_enabled_) {
+      mixed_precision_spmv(handle_ptr_->get_cusparse_handle(),
+                           CUSPARSE_OPERATION_NON_TRANSPOSE,
+                           reusable_device_scalar_value_1_.data(),
+                           mat_mixed,
+                           vec_x,
+                           reusable_device_scalar_value_0_.data(),
+                           vec_y,
+                           CUSPARSE_SPMV_CSR_ALG2,
+                           buf_mixed,
+                           stream_view_);
+      return;
+    }
+  }
+  RAFT_CUSPARSE_TRY(raft::sparse::detail::cusparsespmv(handle_ptr_->get_cusparse_handle(),
+                                                       CUSPARSE_OPERATION_NON_TRANSPOSE,
+                                                       reusable_device_scalar_value_1_.data(),
+                                                       mat_std,
+                                                       vec_x,
+                                                       reusable_device_scalar_value_0_.data(),
+                                                       vec_y,
+                                                       CUSPARSE_SPMV_CSR_ALG2,
+                                                       (f_t*)buf_std,
+                                                       stream_view_));
+}
```

As per coding guidelines,
`Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication.`

Also applies to: 308-334, 356-382
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/pdhg.cu` around lines 254 - 280, Duplicate precision-dispatch SpMV logic should be extracted into a single helper: create a templated utility (e.g., spmv_dispatch or dispatch_spmv) that accepts the cusparse handle (handle_ptr_->get_cusparse_handle()), operation flag, alpha/beta device scalars (reusable_device_scalar_value_1_.data(), reusable_device_scalar_value_0_.data()), both mixed and standard matrix pointers (cusparse_view_.A_mixed_, cusparse_view_.A), input/output buffers (cusparse_view_.tmp_primal, cusparse_view_.dual_gradient), algorithm/buffer args (CUSPARSE_SPMV_CSR_ALG2 and both buffer_non_transpose_mixed_.data()/buffer_non_transpose.data()), and the stream (stream_view_), then replace the repeated branches in pdhg.cu (and the other two occurrences) with calls to that helper which checks cusparse_view_.mixed_precision_enabled_ and calls mixed_precision_spmv or raft::sparse::detail::cusparsespmv accordingly.

cpp/src/pdlp/cusparse_view.cu (1)
1069-1082: Remove the blocking stream synchronization after mixed-precision matrix transforms.
Line 1081's `handle_ptr_->get_stream().synchronize()` blocks the host after submitting two matrix transforms (`A_` → `A_float_` and `A_T_` → `A_T_float_`) to the same GPU stream. Since subsequent mixed-precision SpMV operations are queued on the same stream, the hardware will ensure correct ordering without an explicit host stall. Removing this synchronization eliminates an unnecessary pipeline blockage in the hot path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/pdlp/cusparse_view.cu` around lines 1069 - 1082, The explicit host-side stall is caused by handle_ptr_->get_stream().synchronize() after launching two cub::DeviceTransform::Transform calls that convert A_→A_float_ and A_T_→A_T_float_; remove the synchronize call so the transforms run asynchronously on the same stream (GPU will preserve ordering) and let subsequent mixed-precision SpMV kernels queue on that stream without blocking the host; ensure no other code relies on the host wait and delete only the handle_ptr_->get_stream().synchronize() invocation referenced here.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/pdlp/solve.cu`:
- Around line 1151-1156: The current hard validation using cuopt_expects blocks
float-precision users; update solve_lp() to auto-map float precision to
method_t::PDLP when the user has not explicitly selected an incompatible method
(i.e., settings.method is still method_t::Concurrent/default) instead of
failing: detect float precision and if settings.method == method_t::Concurrent,
set settings.method = method_t::PDLP and emit a log/info message about the
override; keep the existing cuopt_expects check only for the case where the user
explicitly set an incompatible method (e.g., DualSimplex/Barrier) so an error is
still raised for explicit invalid combinations. Ensure you reference and modify
the code paths around solve_lp(), settings.method, and the float-precision
branch that currently calls run_pdlp().
In `@cpp/tests/linear_programming/pdlp_test.cu`:
- Line 23: Update the preprocessor guard that gates the PDLP float tests:
replace the overly-broad conditional that uses MIP_INSTANTIATE_FLOAT ||
PDLP_INSTANTIATE_FLOAT with a check that only uses PDLP_INSTANTIATE_FLOAT so
PDLP float tests compile only when PDLP is instantiated for float; locate the
conditional in pdlp_test.cu (the lines surrounding the existing `#if`
MIP_INSTANTIATE_FLOAT || PDLP_INSTANTIATE_FLOAT) and change it to `#if`
PDLP_INSTANTIATE_FLOAT, leaving the rest of the test code unchanged.
---
Outside diff comments:
In `@cpp/tests/linear_programming/pdlp_test.cu`:
- Around line 54-59: The helper is_incorrect_objective must treat NaN/Inf as
incorrect: at the start of is_incorrect_objective(f_t reference, f_t objective)
check std::isfinite(reference) and std::isfinite(objective) and return true if
either is not finite, then proceed with the existing zero checks and
relative-difference logic; update includes/usings if necessary to ensure
std::isfinite is available and referenced.
---
Nitpick comments:
In `@cpp/src/pdlp/cpu_pdlp_warm_start_data.cu`:
- Around line 16-39: The helpers device_to_host_vector and host_to_device_vector
currently call stream.synchronize() after each raft::copy, causing multiple sync
points when convert_to_cpu_warmstart/convert_to_gpu_warmstart copy many fields;
change the helpers to avoid per-copy synchronization by either (a) removing the
stream.synchronize() calls and documenting that the caller must synchronize the
stream once after batching copies, or (b) adding an optional parameter (e.g.
bool do_sync = true) so callers can disable per-copy sync and perform a single
stream.synchronize() at the end of
convert_to_cpu_warmstart/convert_to_gpu_warmstart; update those convert_*
functions to batch the copies and call stream.synchronize() once.
In `@cpp/src/pdlp/cusparse_view.cu`:
- Around line 1069-1082: The explicit host-side stall is caused by
handle_ptr_->get_stream().synchronize() after launching two
cub::DeviceTransform::Transform calls that convert A_→A_float_ and
A_T_→A_T_float_; remove the synchronize call so the transforms run
asynchronously on the same stream (GPU will preserve ordering) and let
subsequent mixed-precision SpMV kernels queue on that stream without blocking
the host; ensure no other code relies on the host wait and delete only the
handle_ptr_->get_stream().synchronize() invocation referenced here.
In `@cpp/src/pdlp/pdhg.cu`:
- Around line 254-280: Duplicate precision-dispatch SpMV logic should be
extracted into a single helper: create a templated utility (e.g., spmv_dispatch
or dispatch_spmv) that accepts the cusparse handle
(handle_ptr_->get_cusparse_handle()), operation flag, alpha/beta device scalars
(reusable_device_scalar_value_1_.data(),
reusable_device_scalar_value_0_.data()), both mixed and standard matrix pointers
(cusparse_view_.A_mixed_, cusparse_view_.A), input/output buffers
(cusparse_view_.tmp_primal, cusparse_view_.dual_gradient), algorithm/buffer args
(CUSPARSE_SPMV_CSR_ALG2 and both
buffer_non_transpose_mixed_.data()/buffer_non_transpose.data()), and the stream
(stream_view_), then replace the repeated branches in pdhg.cu (and the other two
occurrences) with calls to that helper which checks
cusparse_view_.mixed_precision_enabled_ and calls mixed_precision_spmv or
raft::sparse::detail::cusparsespmv accordingly.
In `@cpp/tests/linear_programming/pdlp_test.cu`:
- Around line 1931-2054: Add a test that explicitly verifies mixed-precision
SpMV is a no-op in FP32: create a pdlp_solver_settings_t<int,float> (same as in
run_float32) set method = cuopt::linear_programming::method_t::PDLP and set
solver_settings.mixed_precision_spmv = true, call solve_lp(&handle_, op_problem,
solver_settings), then assert termination status equals
CUOPT_TERIMINATION_STATUS_OPTIMAL and that
solution.get_additional_termination_information().primal_objective matches
afiro_primal_objective_f32 (use is_incorrect_objective to check numerical
equality), mirroring the checks in the existing run_float32/papilo/pslp tests so
the behavior is explicitly tested.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (31)
- cpp/src/dual_simplex/sparse_matrix.cpp
- cpp/src/math_optimization/solution_writer.cu
- cpp/src/math_optimization/solver_settings.cu
- cpp/src/mip_heuristics/local_search/rounding/simple_rounding.cu
- cpp/src/mip_heuristics/problem/problem.cu
- cpp/src/mip_heuristics/solution/solution.cu
- cpp/src/mip_heuristics/solver_solution.cu
- cpp/src/pdlp/cpu_pdlp_warm_start_data.cu
- cpp/src/pdlp/cusparse_view.cu
- cpp/src/pdlp/cusparse_view.hpp
- cpp/src/pdlp/initial_scaling_strategy/initial_scaling.cu
- cpp/src/pdlp/optimal_batch_size_handler/optimal_batch_size_handler.cu
- cpp/src/pdlp/optimization_problem.cu
- cpp/src/pdlp/pdhg.cu
- cpp/src/pdlp/pdhg.hpp
- cpp/src/pdlp/pdlp.cu
- cpp/src/pdlp/pdlp_warm_start_data.cu
- cpp/src/pdlp/restart_strategy/localized_duality_gap_container.cu
- cpp/src/pdlp/restart_strategy/pdlp_restart_strategy.cu
- cpp/src/pdlp/restart_strategy/weighted_average_solution.cu
- cpp/src/pdlp/saddle_point.cu
- cpp/src/pdlp/solution_conversion.cu
- cpp/src/pdlp/solve.cu
- cpp/src/pdlp/solver_settings.cu
- cpp/src/pdlp/solver_solution.cu
- cpp/src/pdlp/step_size_strategy/adaptive_step_size_strategy.cu
- cpp/src/pdlp/termination_strategy/convergence_information.cu
- cpp/src/pdlp/termination_strategy/infeasibility_information.cu
- cpp/src/pdlp/termination_strategy/termination_strategy.cu
- cpp/src/pdlp/utilities/problem_checking.cu
- cpp/tests/linear_programming/pdlp_test.cu
🚧 Files skipped from review as they are similar to previous changes (15)
- cpp/src/mip_heuristics/solution/solution.cu
- cpp/src/pdlp/optimal_batch_size_handler/optimal_batch_size_handler.cu
- cpp/src/pdlp/pdhg.hpp
- cpp/src/pdlp/solution_conversion.cu
- cpp/src/pdlp/pdlp_warm_start_data.cu
- cpp/src/pdlp/solver_solution.cu
- cpp/src/pdlp/saddle_point.cu
- cpp/src/pdlp/step_size_strategy/adaptive_step_size_strategy.cu
- cpp/src/pdlp/termination_strategy/termination_strategy.cu
- cpp/src/pdlp/termination_strategy/infeasibility_information.cu
- cpp/src/pdlp/solver_settings.cu
- cpp/src/pdlp/initial_scaling_strategy/initial_scaling.cu
- cpp/src/pdlp/restart_strategy/weighted_average_solution.cu
- cpp/src/mip_heuristics/local_search/rounding/simple_rounding.cu
- cpp/src/mip_heuristics/problem/problem.cu
// Float precision only supports PDLP without presolve/crossover
cuopt_expects(settings.method == method_t::PDLP,
              error_type_t::ValidationError,
              "Float precision only supports PDLP method. DualSimplex, Barrier, and Concurrent "
              "require double precision.");
return run_pdlp(problem, settings, timer, is_batch_mode);
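The check above boils down to a simple precondition on the method enum. A minimal stand-alone sketch of the same pattern (hypothetical types, not the cuOpt API):

```cpp
#include <cassert>
#include <stdexcept>

// Illustration only: mirrors the method enum used in solver settings.
enum class method_t { Concurrent, PDLP, DualSimplex, Barrier };

// Sketch of the precondition enforced above: when solving in float
// precision, any method other than PDLP is rejected up front.
void check_float_method(method_t method)
{
  if (method != method_t::PDLP) {
    throw std::invalid_argument(
      "Float precision only supports PDLP method. DualSimplex, Barrier, and "
      "Concurrent require double precision.");
  }
}
```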
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check default method initialization =="
fd 'solver_settings\.hpp$' -o 'solver_settings\.cu$' | head -20 | xargs -r rg -n "method_t|method\s*=" | head -40
echo ""
echo "== Check FP32 flag handling and method assignment in CLI / dispatch =="
rg -n "pdlp-fp32|PDLP_INSTANTIATE_FLOAT|method_t::PDLP|CUOPT_METHOD" | head -60
echo ""
echo "== Check the solve.cu file and method dispatch =="
fd 'solve\.cu$' | head -10
Repository: NVIDIA/cuopt
Length of output: 274
🏁 Script executed:
#!/bin/bash
# Let's specifically look at the solve.cu file around line 1151 and trace back
echo "== Context around line 1151-1156 in solve.cu =="
fd 'pdlp/solve\.cu$' | xargs -r sed -n '1140,1170p'
echo ""
echo "== Trace back to function entry point =="
fd 'pdlp/solve\.cu$' | xargs -r sed -n '1100,1160p' | head -80
Repository: NVIDIA/cuopt
Length of output: 145
🏁 Script executed:
#!/bin/bash
# Search for how float settings are constructed and method is set
echo "== Search for solver_settings_t<float> instantiation =="
rg -n "solver_settings_t\s*<\s*float|pdlp_solver_settings_t\s*<\s*float" -A 5
echo ""
echo "== Search for method assignment before solve =="
rg -n "settings\.method\s*=" | grep -v "==" | head -20
Repository: NVIDIA/cuopt
Length of output: 2752
🏁 Script executed:
#!/bin/bash
# Find the actual solve.cu file location
echo "== Find solve.cu files =="
find . -name "solve.cu" -type f 2>/dev/null | head -20
echo ""
echo "== Check which solve.cu contains the validation check =="
grep -r "Float precision only supports PDLP" --include="*.cu" --include="*.hpp" --include="*.h" 2>/dev/null | head -10
Repository: NVIDIA/cuopt
Length of output: 438
🏁 Script executed:
#!/bin/bash
# Once we find the file, let's look at the broader context
FILE=$(grep -r "Float precision only supports PDLP" --include="*.cu" -l 2>/dev/null | head -1)
echo "Found file: $FILE"
if [ -n "$FILE" ]; then
echo ""
echo "== Lines 1100-1200 (broader context) =="
sed -n '1100,1200p' "$FILE" | head -100
fi
Repository: NVIDIA/cuopt
Length of output: 4595
🏁 Script executed:
#!/bin/bash
# Search for float-related method handling and defaults
echo "== Search for float or FP32 in context of method selection =="
rg -n "float|FP32|fp32" -B 2 -A 2 --include="*.cu" --include="*.hpp" | grep -i "method\|pdlp\|float" | head -80
echo ""
echo "== Look for default method values in solver_settings =="
find . -name "*solver_settings*" -type f | xargs grep -n "method.*=" | grep -i "default\|=\s*method_t" | head -30
Repository: NVIDIA/cuopt
Length of output: 633
🏁 Script executed:
#!/bin/bash
# Search for callers of solve_lp
echo "== Find callers of solve_lp =="
rg -n "solve_lp\(" --type cu --type cpp --type hpp | head -30
echo ""
echo "== Check solver_settings initialization/defaults =="
find . -name "*solver_settings*" -type f | head -10
Repository: NVIDIA/cuopt
Length of output: 696
🏁 Script executed:
#!/bin/bash
# Look at solver_settings headers to find default method
fd "solver_settings" -t f | xargs -r cat 2>/dev/null | grep -n "method" | head -40
Repository: NVIDIA/cuopt
Length of output: 1585
🏁 Script executed:
#!/bin/bash
# Check how float precision settings are handled in CLI or main dispatch
echo "== Search for float/FP32 handling in solve dispatch =="
rg -n "is_same.*float|float.*instantiate|PDLP_INSTANTIATE" -B 3 -A 3 | head -100
echo ""
echo "== Check for method override based on precision =="
rg -n "settings\.method\s*=\s*method_t::PDLP" | head -20
Repository: NVIDIA/cuopt
Length of output: 7282
🏁 Script executed:
#!/bin/bash
# Find the public API that calls solve_lp
echo "== Search for solve function signatures in public headers =="
fd ".*\.hpp" -path "*/linear_programming/*" | head -10 | xargs grep -n "solve\(" | grep -v "run_\|solve_lp" | head -20
echo ""
echo "== Check the main solve entry point =="
rg -n "optimization_problem_solution_t.*solve\(" -A 10 | head -80
Repository: NVIDIA/cuopt
Length of output: 3448
🏁 Script executed:
#!/bin/bash
# Look for where float dispatch happens
echo "== Find main solve implementation that dispatches to solve_lp =="
fd "*.cpp" -o "*.cu" | head -20 | xargs grep -l "solve_lp\(" 2>/dev/null | head -5
echo ""
echo "== Check if method is set to PDLP for float in any dispatch =="
rg -n "if.*float\|if constexpr.*float" -B 3 -A 3 | grep -i "method\|pdlp" | head -40
Repository: NVIDIA/cuopt
Length of output: 329
🏁 Script executed:
#!/bin/bash
# Look for the solve function in the public API headers
echo "== Check linear_programming.hpp for solve entry point =="
fd "linear_programming.hpp" | xargs cat 2>/dev/null | grep -n "solve" | head -30
Repository: NVIDIA/cuopt
Length of output: 112
🏁 Script executed:
#!/bin/bash
echo "== Read solve.hpp header for main solve API =="
cat cpp/include/cuopt/linear_programming/solve.hpp | head -150
Repository: NVIDIA/cuopt
Length of output: 6828
🏁 Script executed:
#!/bin/bash
echo "== Search for main solve function in solve.hpp =="
grep -n "^optimization_problem_solution_t.*solve\|^template.*solve" cpp/include/cuopt/linear_programming/solve.hpp | head -20
echo ""
echo "== Search entire solve.cu for solve function definitions (not solve_lp) =="
grep -n "^optimization_problem_solution_t.*solve\|^template.*solve" cpp/src/pdlp/solve.cu | head -20
Repository: NVIDIA/cuopt
Length of output: 602
🏁 Script executed:
#!/bin/bash
echo "== Check if there's a wrapper solve function that dispatches based on type =="
rg -n "template.*optimization_problem_solution_t" cpp/src/pdlp/solve.cu -A 10 | head -100
Repository: NVIDIA/cuopt
Length of output: 3422
🏁 Script executed:
#!/bin/bash
echo "== Read solve_lp implementation starting at line 1161 =="
sed -n '1161,1330p' cpp/src/pdlp/solve.cu
Repository: NVIDIA/cuopt
Length of output: 8559
🏁 Script executed:
#!/bin/bash
echo "== Check for any settings.method assignment or override logic =="
sed -n '1161,1330p' cpp/src/pdlp/solve.cu | grep -n "settings.method\|method_t::"
Repository: NVIDIA/cuopt
Length of output: 169
🏁 Script executed:
#!/bin/bash
echo "== Check batch_pdlp_solve implementation for method override =="
sed -n '912,980p' cpp/src/pdlp/solve.cu | grep -n "method\|settings"
Repository: NVIDIA/cuopt
Length of output: 918
🏁 Script executed:
#!/bin/bash
echo "== Search for any CLI or user-facing handling of float/fp32 =="
find . -name "*.py" -o -name "*.cpp" | xargs grep -l "pdlp.*fp32\|float.*method\|PDLP_INSTANTIATE_FLOAT" 2>/dev/null | head -10
echo ""
echo "== Check if there's a wrapper that enforces PDLP for float before calling solve_lp =="
rg -n "is_same.*float" -B 5 -A 5 | head -80
Repository: NVIDIA/cuopt
Length of output: 6115
Auto-map float precision to PDLP instead of enforcing hard method validation.
The concern is valid. With default method=Concurrent in solver_settings and no automatic override for float precision in solve_lp(), users requesting float precision will hit the validation error at line 1151-1156 unless they explicitly set method=PDLP beforehand. This is an unnecessary UX friction since float support is specifically PDLP-only.
Rather than enforce a hard cuopt_expects validation, auto-map float mode to method_t::PDLP when the method is unspecified or left at default Concurrent. Only error on explicit, incompatible method requests (e.g., user explicitly selects DualSimplex with float precision).
Add logic in solve_lp() analogous to the quadratic objective override: if float precision and method is Concurrent (the default), silently reassign to PDLP with a log message.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/pdlp/solve.cu` around lines 1151 - 1156, The current hard validation
using cuopt_expects blocks float-precision users; update solve_lp() to auto-map
float precision to method_t::PDLP when the user has not explicitly selected an
incompatible method (i.e., settings.method is still
method_t::Concurrent/default) instead of failing: detect float precision and if
settings.method == method_t::Concurrent, set settings.method = method_t::PDLP
and emit a log/info message about the override; keep the existing cuopt_expects
check only for the case where the user explicitly set an incompatible method
(e.g., DualSimplex/Barrier) so an error is still raised for explicit invalid
combinations. Ensure you reference and modify the code paths around solve_lp(),
settings.method, and the float-precision branch that currently calls run_pdlp().
No I don't want users thinking they are using concurrent while in FP32. They should realize that they should, and will, use PDLP only
@Kh4ster, understood! The current validation approach is the right design choice. Requiring users to explicitly set method=PDLP when using FP32 precision ensures clarity and avoids confusion about which solver method is actually being used. The validation error at lines 1151-1156 correctly enforces this requirement and helps users make informed decisions about their solver configuration.
🧠 Learnings used
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.
Learnt from: tmckayus
Repo: NVIDIA/cuopt PR: 819
File: cpp/src/linear_programming/solve.cu:1338-1366
Timestamp: 2026-02-02T17:16:54.629Z
Learning: In `cpp/src/linear_programming/solve.cu`, the remote vs. local solve routing intentionally checks only `is_remote_execution_enabled()` without checking `force_gpu_memory()` or backend type. This design supports remote solving of GPU-resident problems, enabling workflows where users manipulate problems/solutions on a local GPU but execute the solve on a remote machine with a better GPU.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 899
File: cpp/src/mip_heuristics/solve.cu:169-193
Timestamp: 2026-02-24T14:37:15.024Z
Learning: In cuOpt's MIP solver (cpp/src/mip_heuristics/solve.cu and related files), the objective_scaling_factor sign encodes the user problem's objective sense: if >= 0, the user problem is minimization; if < 0, the user problem is maximization (which is internally converted to minimization by negating the objective). Therefore, checking the sign of objective_scaling_factor is a valid way to determine the problem direction and set appropriate "no bound" values for callbacks.
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 814
File: cpp/src/dual_simplex/branch_and_bound.hpp:154-159
Timestamp: 2026-01-29T04:25:05.434Z
Learning: In cpp/src/dual_simplex/branch_and_bound.hpp and branch_and_bound.cpp, the mutex_original_lp_ protects original_lp_ specifically during the root cut pass phase. The public API set_new_solution() can be called from external threads during this phase while the main thread is adding cuts/slacks to original_lp_. Functions like best_first_thread, exploration_ramp_up, and plunge_from execute after the cut passes complete (after the OpenMP parallel barrier) when original_lp_ is fixed, so they don't need mutex protection for their reads.
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 814
File: cpp/src/dual_simplex/cuts.cpp:1278-1278
Timestamp: 2026-01-30T23:34:26.318Z
Learning: In the cuOpt dual simplex solver (cpp/src/dual_simplex/cuts.cpp and related files), slack variable coefficients in the constraint matrix are always exactly 1.0 or -1.0 (not floating-point approximations), so exact equality checks like `assert(std::abs(lp.A.x[col_start]) == 1.0)` are appropriate and should not be flagged as requiring epsilon tolerances.
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 721
File: cpp/src/linear_programming/optimization_problem.cu:106-205
Timestamp: 2026-01-29T11:25:28.213Z
Learning: The function `csr_matrices_equivalent_with_permutation` in `cpp/src/linear_programming/optimization_problem.cu` is only used in CI testing with known inputs, so strict problem size validation is not required.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 840
File: python/cuopt/cuopt/linear_programming/solver_settings/solver_settings.py:178-180
Timestamp: 2026-02-09T17:57:51.918Z
Learning: In cuOpt's Python solver settings (python/cuopt/cuopt/linear_programming/solver_settings/solver_settings.py), the set_optimality_tolerance() method should only update the six optimality-related tolerances (absolute_dual_tolerance, relative_dual_tolerance, absolute_primal_tolerance, relative_primal_tolerance, absolute_gap_tolerance, relative_gap_tolerance) and must not modify infeasibility detection thresholds (primal_infeasible_tolerance, dual_infeasible_tolerance), which require stricter default values for correct infeasibility detection.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Refactor code duplication in solver components (3+ occurrences) into shared utilities; for GPU kernels, use templated device functions to avoid duplication
Learnt from: rg20
Repo: NVIDIA/cuopt PR: 816
File: cpp/src/math_optimization/solver_settings.cu:99-102
Timestamp: 2026-02-05T15:54:25.780Z
Learning: In the PSLP integration (PR `#816`), the CUOPT_PRESOLVE parameter was intentionally changed from a boolean to an integer enum (presolver_t) with values Default/None/Papilo/PSLP. This is a deliberate breaking change; the CLI expects integers, and boolean compatibility was intentionally removed.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Ensure variables and constraints are accessed from the correct problem context (original vs presolve vs folded vs postsolve); verify index mapping consistency across problem transformations
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 798
File: cpp/src/utilities/models/fj_predictor/quantize.cpp:1149-1161
Timestamp: 2026-01-26T14:17:25.105Z
Learning: The files in cpp/src/utilities/models/*/quantize.cpp (fj_predictor, cpufj_predictor, pdlp_predictor, dualsimplex_predictor, bounds_strengthening_predictor) are autogenerated model artifacts and should not have manual code changes suggested; any improvements should target the model generation tooling or calling code validation layer.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Check numerical stability: prevent overflow/underflow, precision loss, division by zero/near-zero, and use epsilon comparisons for floating-point equality checks
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Identify assertions with overly strict numerical tolerances that fail on legitimate degenerate/edge cases (near-zero pivots, singular matrices, empty problems)
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
#include <cuopt/linear_programming/pdlp/solver_settings.hpp>
#include <cuopt/linear_programming/pdlp/solver_solution.hpp>
#include <cuopt/linear_programming/solve.hpp>
#include <mip_heuristics/mip_constants.hpp>
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Locate macro definitions/usages
echo "=== Searching for macro definitions and usages ==="
rg -n -C3 '\b(MIP_INSTANTIATE_FLOAT|PDLP_INSTANTIATE_FLOAT)\b' --type-add 'cpp:*.cu' --type-add 'cpp:*.cuh' --type-add 'cpp:*.hpp' --type-add 'cpp:*.h'
echo ""
echo "=== Checking for float template instantiations ==="
rg -n -C3 'solve_lp<[^>]*float|pdlp_solver_settings_t<[^>]*float|mps_data_model_t<int,\s*float>' --type-add 'cpp:*.cu' --type-add 'cpp:*.cuh' --type-add 'cpp:*.hpp' --type-add 'cpp:*.h'
echo ""
echo "=== Examining guard usage in pdlp_test.cu ==="
rg -n -C5 '#if.*MIP_INSTANTIATE_FLOAT|#if.*PDLP_INSTANTIATE_FLOAT' cpp/tests/linear_programming/pdlp_test.cu
Repository: NVIDIA/cuopt
Length of output: 46827
🏁 Script executed:
# Check how PDLP_INSTANTIATE_FLOAT is defined and used globally
grep -r "PDLP_INSTANTIATE_FLOAT" cpp/src/mip_heuristics/mip_constants.hpp | head -1
Repository: NVIDIA/cuopt
Length of output: 88
Refactor the float test gate to check only PDLP instantiation.
The guard #if MIP_INSTANTIATE_FLOAT || PDLP_INSTANTIATE_FLOAT at lines 50 and 1930 is overly permissive. PDLP float tests will compile if either MIP or PDLP is instantiated for float. This can accidentally enable PDLP float tests in builds where only MIP float is instantiated. Change to #if PDLP_INSTANTIATE_FLOAT to ensure tests only compile when PDLP float instantiation is actually enabled.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/tests/linear_programming/pdlp_test.cu` at line 23, Update the
preprocessor guard that gates the PDLP float tests: replace the overly-broad
conditional that uses MIP_INSTANTIATE_FLOAT || PDLP_INSTANTIATE_FLOAT with a
check that only uses PDLP_INSTANTIATE_FLOAT so PDLP float tests compile only
when PDLP is instantiated for float; locate the conditional in pdlp_test.cu (the
lines surrounding the existing `#if` MIP_INSTANTIATE_FLOAT ||
PDLP_INSTANTIATE_FLOAT) and change it to `#if` PDLP_INSTANTIATE_FLOAT, leaving the
rest of the test code unchanged.
In both cases the pdlp float32 tests should compile and run
@Kh4ster, thank you for the clarification! You're absolutely right—the guard #if MIP_INSTANTIATE_FLOAT || PDLP_INSTANTIATE_FLOAT is correct and intentional. The PDLP float32 tests should indeed compile and run when either macro is defined, ensuring broader test coverage across different instantiation configurations. I've noted this for future reviews.
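The intended behavior — float tests compiling when either macro is defined — is just the OR'd guard. A minimal, self-contained illustration (dummy macro values; the real macros come from the build configuration):

```cpp
#include <cassert>

// Dummy values for illustration: a build where only MIP is instantiated
// for float. With the OR'd guard the float tests still compile.
#define MIP_INSTANTIATE_FLOAT 1
#define PDLP_INSTANTIATE_FLOAT 0

#if MIP_INSTANTIATE_FLOAT || PDLP_INSTANTIATE_FLOAT
constexpr bool float_tests_compiled = true;
#else
constexpr bool float_tests_compiled = false;
#endif
```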
✏️ Learnings added
Learnt from: Kh4ster
Repo: NVIDIA/cuopt PR: 910
File: cpp/tests/linear_programming/pdlp_test.cu:23-23
Timestamp: 2026-02-27T15:33:38.577Z
Learning: In cpp/tests/linear_programming/pdlp_test.cu and similar PDLP test files, the preprocessor guard `#if MIP_INSTANTIATE_FLOAT || PDLP_INSTANTIATE_FLOAT` for float tests is intentional. PDLP float32 tests should compile and run when either MIP_INSTANTIATE_FLOAT or PDLP_INSTANTIATE_FLOAT is defined, not just when PDLP_INSTANTIATE_FLOAT is defined.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 721
File: cpp/src/linear_programming/optimization_problem.cu:106-205
Timestamp: 2026-01-29T11:25:28.213Z
Learning: The function `csr_matrices_equivalent_with_permutation` in `cpp/src/linear_programming/optimization_problem.cu` is only used in CI testing with known inputs, so strict problem size validation is not required.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 899
File: cpp/src/mip_heuristics/solve.cu:169-193
Timestamp: 2026-02-24T14:37:15.024Z
Learning: In cuOpt's MIP solver (cpp/src/mip_heuristics/solve.cu and related files), the objective_scaling_factor sign encodes the user problem's objective sense: if >= 0, the user problem is minimization; if < 0, the user problem is maximization (which is internally converted to minimization by negating the objective). Therefore, checking the sign of objective_scaling_factor is a valid way to determine the problem direction and set appropriate "no bound" values for callbacks.
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 602
File: cpp/src/linear_programming/solve.cu:732-742
Timestamp: 2025-12-04T20:09:09.264Z
Learning: In cpp/src/linear_programming/solve.cu, the barrier solver does not currently return INFEASIBLE or UNBOUNDED status. It only returns OPTIMAL, TIME_LIMIT, NUMERICAL_ISSUES, or CONCURRENT_LIMIT.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Test with free variables, singleton problems, and extreme problem dimensions near resource limits to validate edge case handling
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Learnt from: tmckayus
Repo: NVIDIA/cuopt PR: 819
File: cpp/src/linear_programming/solve.cu:1338-1366
Timestamp: 2026-02-02T17:16:54.629Z
Learning: In `cpp/src/linear_programming/solve.cu`, the remote vs. local solve routing intentionally checks only `is_remote_execution_enabled()` without checking `force_gpu_memory()` or backend type. This design supports remote solving of GPU-resident problems, enabling workflows where users manipulate problems/solutions on a local GPU but execute the solve on a remote machine with a better GPU.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
|
/ok to test 2cb9ce5 |
This PR adds support for FP32 and mixed precision in PDLP (not MIP, Dual Simplex or Barrier).
These two new options are exposed through new solver settings and CLI flags. What each feature enables is detailed below:
FP32 Precision Mode
By default, PDLP operates in FP64 (double) precision. Users can switch to FP32 (float) precision for the entire solve. FP32 uses half the memory of FP64 and makes PDHG iterations on average twice as fast, but it may require more iterations to converge due to reduced numerical accuracy. FP32 mode is only supported with the PDLP method (not concurrent) and without crossover.
Note: The default precision is FP64 (double).
Mixed Precision SpMV
When running PDLP in FP64 mode, users can enable mixed precision sparse matrix-vector products (SpMV) during PDHG iterations. In this mode, the constraint matrix and its transpose are stored in FP32 while vectors and the compute type remain in FP64. This allows SpMV operations to be faster thanks to reduced memory bandwidth requirements, while maintaining FP64 accuracy in the accumulation. This will make PDHG iterations faster while limiting the potential negative impact on convergence (compared to running in FP32 mode). Convergence checking and restart logic always use the full FP64 matrix, so this mode does not reduce memory usage since both the FP32 and FP64 copies of the matrix are kept in memory. Mixed precision SpMV only applies in FP64 mode and has no effect when running in FP32.
Note: The default value is false.
Summary by CodeRabbit
New Features
Compatibility & Notes
Documentation