CuOPT 25.12+ backward compatibility break: Requires nvJitLink 12.9.79+ but Databricks ML Runtime provides 12.4.127 #739

@TavnerJC

🐛 Bug Report: CuOPT 25.12+ Fails to Load on Databricks Due to nvJitLink Version Requirement

Summary

CuOPT 25.12+ introduces a backward-incompatible change that prevents it from running on Databricks ML Runtime 16.4 (and likely other managed environments): a hard dependency on nvidia-nvjitlink-cu12 >= 12.9.79. This blocks all Databricks users from using GPU-accelerated routing optimization with CuOPT.

Environment

Platform:

  • Databricks ML Runtime 16.4 (GPU)
  • Databricks Serverless GPU Compute

Hardware:

  • GPU: NVIDIA A10G (23028 MiB, Compute Capability 8.6)
  • Architecture: x86_64

Software Versions:

  • CuOPT: 25.12.0 (latest)
  • CUDA Runtime: 12.6
  • CUDA Driver: 535.161.07
  • Python: 3.12.3
  • nvidia-nvjitlink-cu12: 12.4.127 (provided by Databricks, unfixable by users)
  • Required by CuOPT: >= 12.9.79

Databricks Runtime Components:

  • nvidia-cuda-runtime-cu12: 12.9.79
  • nvidia-cublas-cu12: 12.9.1.4
  • nvidia-cusolver-cu12: 11.7.5.82
  • nvidia-cudnn-cu12: 9.1.0.70

Issue Description

When attempting to install and use CuOPT 25.12+ on Databricks ML Runtime 16.4, the library fails to load with the following error:

RuntimeWarning: Failed to load libcuopt library: libcuopt.so.
Error: /local_disk0/.ephemeral_nfs/envs/pythonEnv-.../lib/python3.12/site-packages/libcuopt/lib64/../../nvidia/cusolver/lib/libcusolver.so.11:
undefined symbol: cublasSetEnvironmentMode, version libcublas.so.12.
Falling back to relying on system loader. cuOpt functionality may be unavailable.
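One way to narrow down which library lacks the symbol is to open it directly and probe for the export. The following is a minimal diagnostic sketch, assuming a Linux environment where the library can be opened through `ctypes`; the helper name `exports_symbol` is hypothetical, and the library and symbol names are copied from the error message above.

```python
import ctypes


def exports_symbol(library, symbol):
    """Return True if the shared library exports `symbol`.

    `library` is a path or soname handed to the dynamic loader; None
    means the current process, which is handy for a quick self-test.
    Raises OSError if the library itself cannot be loaded.
    """
    handle = ctypes.CDLL(library)
    # ctypes resolves symbols on attribute access, so a missing export
    # raises AttributeError, which hasattr() reports as False.
    return hasattr(handle, symbol)


if __name__ == "__main__":
    try:
        found = exports_symbol("libcublas.so.12", "cublasSetEnvironmentMode")
        print("cublasSetEnvironmentMode exported:", found)
    except OSError as exc:
        print("could not load libcublas.so.12:", exc)
```

Passing the full path of the libcublas copy that the loader actually picks up (rather than the bare soname) shows whether that specific copy is the one missing the symbol.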

Root Cause

Breaking Change in CuOPT 25.12+:

  • CuOPT 25.12+ now requires nvidia-nvjitlink-cu12 >= 12.9.79 (based on testing and error analysis)
  • Previous CuOPT versions worked with older nvJitLink versions

Databricks Environment Limitation:

  • Databricks ML Runtime 16.4 provides nvidia-nvjitlink-cu12 12.4.127
  • This is a managed runtime package that users cannot upgrade
  • Attempting to upgrade via pip fails due to runtime conflicts

Impact:

  • Complete blockage of CuOPT functionality on Databricks
  • Affects all Databricks ML Runtime 16.x users
  • Affects Databricks Serverless GPU Compute users

Reproduction Steps

On Databricks ML Runtime 16.4:

  1. Create a Databricks cluster with ML Runtime 16.4 (GPU)
  2. Install CuOPT:
    %pip install --extra-index-url=https://pypi.nvidia.com cuopt-server-cu12 cuopt-sh-client
    dbutils.library.restartPython()
  3. Attempt to import and use CuOPT:
    from cuopt import routing
  4. Result: RuntimeWarning about failed library load, CuOPT is non-functional

Verify nvJitLink Version:

import subprocess
result = subprocess.run(["pip", "show", "nvidia-nvjitlink-cu12"], capture_output=True, text=True)
print(result.stdout)

Output:

Name: nvidia-nvjitlink-cu12
Version: 12.4.127
Location: /databricks/python3/lib/python3.12/site-packages

Attempt to Upgrade (Fails):

%pip install --upgrade "nvidia-nvjitlink-cu12>=12.9.79"

Result: Version conflicts with Databricks runtime dependencies, installation fails or reverts
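The installed versions can also be read without spawning `pip`, using the standard-library `importlib.metadata` module. This is a small sketch rather than part of the original report; the helper name `installed_versions` is hypothetical, and the package list is taken from the environment details in this issue.

```python
from importlib import metadata

# Distribution names as listed in the environment section of this report.
CUDA_PACKAGES = [
    "nvidia-nvjitlink-cu12",
    "nvidia-cuda-runtime-cu12",
    "nvidia-cublas-cu12",
    "nvidia-cusolver-cu12",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(CUDA_PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Because this reads metadata from the interpreter that runs the notebook, it avoids the case where a `pip` executable on the PATH belongs to a different environment.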

Expected Behavior

Option 1: Backward Compatibility

  • CuOPT 25.12+ should maintain backward compatibility with nvJitLink 12.4.x
  • Or provide a compatibility layer / fallback mechanism

Option 2: Clear Version Requirements

  • Document minimum nvJitLink version requirement prominently in installation docs
  • Add runtime version check with clear error message
  • Provide alternative installation instructions for older environments

Option 3: Version-Specific Packages

  • Offer CuOPT builds for different nvJitLink versions
  • e.g., cuopt-server-cu12-nvjitlink124 and cuopt-server-cu12-nvjitlink129
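To illustrate how an installer or notebook helper might choose between such builds, here is a hedged sketch. The package names are the hypothetical ones from the example above (no such packages are known to exist), `pick_variant` is an invented helper, and the selection rule assumes that a build made for an older nvJitLink minor version also runs with a newer one.

```python
# Hypothetical build variants keyed by the minimum nvJitLink (major, minor)
# each one would support. These package names do not exist today.
VARIANTS = {
    (12, 4): "cuopt-server-cu12-nvjitlink124",
    (12, 9): "cuopt-server-cu12-nvjitlink129",
}


def pick_variant(nvjitlink_version):
    """Return the newest variant whose minimum nvJitLink is satisfied."""
    major, minor = (int(part) for part in nvjitlink_version.split(".")[:2])
    # Keep only variants whose minimum version is <= the installed version.
    candidates = [key for key in VARIANTS if key <= (major, minor)]
    if not candidates:
        raise LookupError(
            f"no build variant supports nvJitLink {nvjitlink_version}"
        )
    return VARIANTS[max(candidates)]


if __name__ == "__main__":
    print(pick_variant("12.4.127"))
```

With the version reported on Databricks ML Runtime 16.4 (12.4.127), this rule would select the build targeting nvJitLink 12.4.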

Actual Behavior

  • CuOPT 25.12+ fails to load and reports only a RuntimeWarning
  • Error message is cryptic (undefined symbol: cublasSetEnvironmentMode)
  • No clear indication that nvJitLink version is the issue
  • No workaround available for users on managed environments

Impact Assessment

Severity: 🔴 Critical - Complete functionality loss

Affected Users:

  • All Databricks ML Runtime 16.x users
  • Databricks Serverless GPU Compute users
  • Other managed GPU environments with nvJitLink 12.4.x

Business Impact:

  • Blocks adoption of CuOPT for Databricks routing optimization workloads
  • Forces users to choose between:
    • Using Databricks (but no CuOPT)
    • Using CuOPT (but not on Databricks)
  • Impacts Databricks Routing Accelerator integration

User Time Impact:

  • 2-4 hours wasted per user attempting installation and debugging
  • No clear error message makes root cause difficult to identify

Proposed Solutions

Short-Term (Immediate):

  1. Document the requirement prominently:

    • Add nvJitLink version requirement to installation docs
    • Update release notes for 25.12.0
    • Add compatibility matrix to README
  2. Improve error message:

    • Add runtime check for nvJitLink version during library load
    • Provide a clear error message that names the installed and required nvJitLink versions
    • Describe the migration path in release notes
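A load-time check along these lines could look like the sketch below. It is only an illustration of the idea, not cuOpt's actual loading code: the function names and the message wording are assumptions, and the minimum version is the one reported in this issue.

```python
import re

# Minimum nvJitLink version reported in this issue (an observation, not
# an officially documented requirement).
REQUIRED_NVJITLINK = "12.9.79"


def parse_version(text):
    """Extract the numeric components, e.g. '12.4.127' -> (12, 4, 127)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def check_nvjitlink(installed, required=REQUIRED_NVJITLINK):
    """Raise ImportError with an actionable message if `installed` is too old.

    `installed` is the version string of nvidia-nvjitlink-cu12, or None
    when the package is not installed at all.
    """
    if installed is None:
        raise ImportError(
            f"nvidia-nvjitlink-cu12 is not installed; version >= {required} "
            "is required."
        )
    if parse_version(installed) < parse_version(required):
        raise ImportError(
            f"nvidia-nvjitlink-cu12 >= {required} is required, but "
            f"{installed} is installed. On managed runtimes where this "
            "package cannot be upgraded, use a release that supports "
            f"nvJitLink {installed}."
        )
```

The installed version can be obtained with `importlib.metadata.version("nvidia-nvjitlink-cu12")` and passed to `check_nvjitlink` before the native library is loaded, so the user sees the version mismatch instead of an undefined-symbol error.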

Medium-Term:

  1. Restore backward compatibility:

    • Investigate if nvJitLink 12.4.x support can be maintained
    • Use runtime detection and conditional code paths if needed
  2. Version-specific builds:

    • Provide separate builds for different nvJitLink versions
    • Allow users to install appropriate version for their environment

Long-Term:

  1. Coordinate with platform providers:
    • Work with Databricks to upgrade nvJitLink in ML Runtime 17.0+
    • Establish minimum version requirements for supported platforms

Workarounds

Currently, users have three unsatisfactory options:

Option A: Use OR-Tools (CPU-based)
%pip install ortools
from ortools.constraint_solver import routing_enums_pb2
from ortools.constraint_solver import pywrapcp

Pros: Works on all Databricks runtimes
Cons: CPU-only, significantly slower for large problems

Option B: Wait for Databricks ML Runtime 17.0+
Expected to include nvJitLink >= 12.9.79, but release date unknown

Option C: Use older CuOPT version (if available)
CuOPT 25.10.x or earlier may work with nvJitLink 12.4.127 (needs verification)

Detection & Validation

We've developed an automatic detection tool that identifies this issue:

CUDA Healthcheck Tool for Databricks: https://github.com/TavnerJC/cuda-healthcheck-on-databricks

Usage:
%pip install git+https://github.com/TavnerJC/cuda-healthcheck-on-databricks.git
dbutils.library.restartPython()

from cuda_healthcheck import CUDADetector
detector = CUDADetector()
env = detector.detect_environment()

# Automatically detects CuOPT incompatibility and provides guidance

Validation Report: https://github.com/TavnerJC/cuda-healthcheck-on-databricks/blob/main/NOTEBOOK1_VALIDATION_SUCCESS.md

Additional Context

CuOPT Version Info:
import cuopt
print(cuopt.__version__)  # 25.12.0

nvJitLink Info:
pip show nvidia-nvjitlink-cu12
Name: nvidia-nvjitlink-cu12
Version: 12.4.127
Location: /databricks/python3/lib/python3.12/site-packages

Questions for NVIDIA CuOPT Team

  1. Was the nvJitLink >= 12.9.79 requirement intentional in 25.12.0?
  2. Can backward compatibility with nvJitLink 12.4.x be restored?
  3. Is there a CuOPT version that supports nvJitLink 12.4.x?
  4. Are there plans to provide version-specific builds?
  5. What is the recommended path forward for Databricks users?

System Information

Databricks ML Runtime 16.4

Python: 3.12.3
CUDA Runtime: 12.6
CUDA Driver: 535.161.07
GPU: NVIDIA A10G
Memory: 23028 MiB
Compute Capability: 8.6

CUDA Components

nvidia-cuda-runtime-cu12: 12.9.79
nvidia-cublas-cu12: 12.9.1.4
nvidia-cusolver-cu12: 11.7.5.82
nvidia-cudnn-cu12: 9.1.0.70
nvidia-nvjitlink-cu12: 12.4.127 ← THE PROBLEM

CuOPT

cuopt-server-cu12: 25.12.0
cuopt-sh-client: 25.12.0


🙏 Request

This issue blocks GPU-accelerated routing optimization for all Databricks users. We kindly request:

  1. Acknowledgment of this backward compatibility issue
  2. Guidance on the recommended path forward
  3. Timeline for a fix or workaround
  4. Documentation updates to prevent future users from encountering this

Thank you for your consideration and for developing CuOPT!


Submitted by: TavnerJC
Detection Tool: https://github.com/TavnerJC/cuda-healthcheck-on-databricks
