Add CUDA 13 support and remove CUDA 11 support #1302

ptheywood · 2025-08-07T12:47:23Z

Update the supported CUDA versions from CUDA 11.2-12.x to 12.0-13.x.

Python wheels are provided for CUDA 12.0+ and 13.0+ on Linux, and CUDA 12.4+ and 13.0+ on Windows.

Depends on #1150

ptheywood · 2025-08-07T17:03:58Z

Several MSVC issues to resolve:

The Windows CUDA installation script does not (correctly) error if invalid subpackages are requested
CUDA 13 requires additional subpackages, some or all of:
```
"crt";
"nvptxcompiler";
"nvvm";
"nsight_vse";
```

CUDA 13 on windows seems to hit several compiler errors in internal cuda headers (curand_poisson.h) due to invalid pragmas.

We might be able to suppress (some of) the bad pragma issues with a suppression, but should probably report this (I must assume this has already been encountered tbh)

2025-08-07T16:42:55.2327474Z      1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(642): error #20199-D: unrecognized #pragma in device code [C:\a\FLAMEGPU2\FLAMEGPU2\build\FLAMEGPU\flamegpu.vcxproj]
2025-08-07T16:42:55.2469229Z                __pragma(warning(push)) __pragma(warning(disable:4996)) __pragma(nv_diagnostic push) __pragma(nv_diag_suppress 1444)
2025-08-07T16:42:55.2766327Z                         ^
2025-08-07T16:42:55.2803114Z          
2025-08-07T16:42:55.3120202Z          Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

I'll pick this up once I return from annual leave

ptheywood · 2025-08-18T14:18:06Z

The curand warnings can be suppressed via CMake, and I've narrowed down the subpackages which are required.

CI updates still need expanding to the other workflows (I've only updated the minimal set for now, to avoid significant CI builds), and the C++ and Python test suites need running manually with CUDA 13 on Linux and Windows, but I will likely hold off until I've resolved #1150.

The readme etc. will also still need updating. I'll update the list in the original post.

ptheywood · 2025-09-02T10:00:44Z

Some new C++20 warnings to address (linux)

/home/ptheywood/code/flamegpu/FLAMEGPU2/src/flamegpu/model/ModelData.cpp: In member function ‘bool flamegpu::ModelData::operator==(const flamegpu::ModelData&) const’:
/home/ptheywood/code/flamegpu/FLAMEGPU2/src/flamegpu/model/ModelData.cpp:105:37: warning: C++20 says that these are ambiguous, even though the second is reversed:
  105 |         && *dependencyGraph == *rhs.dependencyGraph) {
      |                                     ^~~~~~~~~~~~~~~
In file included from /home/ptheywood/code/flamegpu/FLAMEGPU2/src/flamegpu/model/ModelData.cpp:17:
/home/ptheywood/code/flamegpu/FLAMEGPU2/include/flamegpu/model/DependencyGraph.h:40:10: note: candidate 1: ‘bool flamegpu::DependencyGraph::operator==(const flamegpu::DependencyGraph&)’
   40 |     bool operator==(const DependencyGraph& rhs);
      |          ^~~~~~~~
/home/ptheywood/code/flamegpu/FLAMEGPU2/include/flamegpu/model/DependencyGraph.h:40:10: note: candidate 2: ‘bool flamegpu::DependencyGraph::operator==(const flamegpu::DependencyGraph&)’ (reversed)
/home/ptheywood/code/flamegpu/FLAMEGPU2/include/flamegpu/model/DependencyGraph.h:40:10: note: try making the operator a ‘const’ member function
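For context, C++20 also considers the reversed form of a member `operator==` (`b == a` for `a == b`), and a non-const member operator treats its two operands differently, which is what GCC is flagging. A minimal sketch of the const-qualified form that GCC's note suggests, using a hypothetical `Graph` type as a stand-in (this is not the `DependencyGraph` code, and not necessarily how it will be resolved here):

```cpp
// Hypothetical stand-in for a type like DependencyGraph; not FLAMEGPU code.
struct Graph {
    int nodes = 0;

    // Declaring the operator const means the left-hand operand is handled the
    // same way as the right-hand one, so the normal candidate and the C++20
    // reversed candidate no longer compete ambiguously.
    constexpr bool operator==(const Graph& rhs) const { return nodes == rhs.nodes; }
};

// Compares two graphs through const references, similar in spirit to the
// comparison made inside ModelData::operator==.
constexpr bool same_graph(const Graph& a, const Graph& b) {
    return a == b;
}
```

With the `const` removed from `operator==`, GCC in C++20 mode would be expected to emit the same "ambiguous, even though the second is reversed" warning for a comparison of two non-const `Graph` objects.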

And under MSVC there are others, plus an error in Jitify 2 due to its use of std::result_of, which was removed in C++20, so I'll have to fix that too. Unsure why this error doesn't trigger under Linux, even in C++20 mode.
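For reference, `std::result_of` was deprecated in C++17 and removed in C++20; the usual replacement is `std::invoke_result`. A minimal sketch of the substitution, using a hypothetical `half` function and `call` helper (this is not the Jitify 2 code, just the general pattern):

```cpp
#include <type_traits>
#include <utility>

// Hypothetical callable used only for illustration.
constexpr double half(int x) { return x / 2.0; }

// Pre-C++20 spelling (deprecated in C++17, removed in C++20):
//   using HalfResult = typename std::result_of<decltype(&half)(int)>::type;
// C++17-and-later spelling: callable type and argument types are separate parameters.
using HalfResult = std::invoke_result_t<decltype(&half), int>;

static_assert(std::is_same_v<HalfResult, double>, "half(int) yields double");

// Generic helper: call f with args and return whatever f returns.
template <typename F, typename... Args>
constexpr std::invoke_result_t<F, Args...> call(F&& f, Args&&... args) {
    return std::forward<F>(f)(std::forward<Args>(args)...);
}
```

`std::invoke_result_t` needs C++17 or newer, so a codebase that must also build as C++14 would need a different approach.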

ptheywood · 2025-09-03T10:24:54Z

Windows VS2022 + CUDA 13.0 c++ tests passed on my 3060ti. Should probably re-run in the future when this PR is tidied up and merge-able

[==========] 1133 tests from 88 test suites ran. (166386 ms total)

ptheywood · 2025-09-03T12:15:33Z

Windows CUDA 13 C++20 Pyflamegpu (swig >= 4.1.0) pytests all pass

676 passed, 12 skipped in 422.62s (0:07:02)

ptheywood · 2025-09-16T09:28:59Z

C++20 is mostly working, just an outstanding issue with pyflamegpu on windows with CUDA 12.0 in c++20 mode.

I need to jump into windows and install CUDA 12.0 to debug this really.

2025-09-04T13:15:32.4774920Z ##[error]     3>C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\xutility(1838): error : variable type "std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>" in constexpr function is not a literal type [D:\a\FLAMEGPU2\FLAMEGPU2\build\swig\python\pyflamegpu_swig.vcxproj]
2025-09-04T13:15:32.4782627Z                    detected during:
2025-09-04T13:15:32.4784128Z                      instantiation of class "std::reverse_iterator<_BidIt> [with _BidIt=std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>]" 
2025-09-04T13:15:32.4784744Z          (857): here
2025-09-04T13:15:32.4785587Z                      instantiation of "decltype(auto) std::ranges::_Iter_move::_Cpo::operator()(_Ty &&) const [with _Ty=std::reverse_iterator<std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>> &]" 
2025-09-04T13:15:32.4786296Z          (1313): here
2025-09-04T13:15:32.4787099Z                      instantiation of "const __nv_bool std::_Is_ranges_random_iter_v [with _Iter=std::reverse_iterator<std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>>]" 
2025-09-04T13:15:32.4787728Z          (1633): here
2025-09-04T13:15:32.4788577Z                      instantiation of "void std::advance(_InIt &, _Diff) [with _InIt=std::reverse_iterator<std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>>, _Diff=unsigned long long]" 
2025-09-04T13:15:32.4789477Z          D:\a\FLAMEGPU2\FLAMEGPU2\build\swig\python\pyflamegpu\flamegpuPYTHON_wrap.cxx(4801): here
2025-09-04T13:15:32.4790567Z                      instantiation of "Sequence *swig::getslice(const Sequence *, Difference, Difference, Py_ssize_t) [with Sequence=std::list<flamegpu::StepLogFrame, std::allocator<flamegpu::StepLogFrame>>, Difference=ptrdiff_t]" 
2025-09-04T13:15:32.4791530Z          D:\a\FLAMEGPU2\FLAMEGPU2\build\swig\python\pyflamegpu\flamegpuPYTHON_wrap.cxx(8307): here
2025-09-04T13:15:32.4791880Z          
2025-09-04T13:15:32.4930147Z ##[error]     3>C:\Program Files\Microsoft Visual Studio\2022\Enterprise\VC\Tools\MSVC\14.44.35207\include\xutility(1840): error : no instance of function template "std::ranges::_Iter_move::_Cpo::operator()" matches the argument list [D:\a\FLAMEGPU2\FLAMEGPU2\build\swig\python\pyflamegpu_swig.vcxproj]
2025-09-04T13:15:32.4932630Z                      argument types are: (<error-type>)
2025-09-04T13:15:32.4933307Z                      object type is: const std::ranges::_Iter_move::_Cpo
2025-09-04T13:15:32.4933868Z                    detected during:
2025-09-04T13:15:32.4935177Z                      instantiation of class "std::reverse_iterator<_BidIt> [with _BidIt=std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>]" 
2025-09-04T13:15:32.4936167Z          (857): here
2025-09-04T13:15:32.4937276Z                      instantiation of "decltype(auto) std::ranges::_Iter_move::_Cpo::operator()(_Ty &&) const [with _Ty=std::reverse_iterator<std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>> &]" 
2025-09-04T13:15:32.4937974Z          (1313): here
2025-09-04T13:15:32.4938729Z                      instantiation of "const __nv_bool std::_Is_ranges_random_iter_v [with _Iter=std::reverse_iterator<std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>>]" 
2025-09-04T13:15:32.4939598Z          (1633): here
2025-09-04T13:15:32.4940415Z                      instantiation of "void std::advance(_InIt &, _Diff) [with _InIt=std::reverse_iterator<std::_List_const_iterator<std::_List_val<std::_List_simple_types<flamegpu::StepLogFrame>>>>, _Diff=unsigned long long]" 
2025-09-04T13:15:32.4941325Z          D:\a\FLAMEGPU2\FLAMEGPU2\build\swig\python\pyflamegpu\flamegpuPYTHON_wrap.cxx(4801): here
2025-09-04T13:15:32.4942381Z                      instantiation of "Sequence *swig::getslice(const Sequence *, Difference, Difference, Py_ssize_t) [with Sequence=std::list<flamegpu::StepLogFrame, std::allocator<flamegpu::StepLogFrame>>, Difference=ptrdiff_t]" 
2025-09-04T13:15:32.4943324Z          D:\a\FLAMEGPU2\FLAMEGPU2\build\swig\python\pyflamegpu\flamegpuPYTHON_wrap.cxx(8307): here
2025-09-04T13:15:32.4943676Z

Reproduced locally with CUDA 12.0 on windows.

Unsurprisingly (and unfortunately), bumping swig to 4.2.1 or 4.3.0 does not resolve this issue (seeing as it's fine in newer CUDA releases).

The issue occurs within msvc's reverse_iterator in xutility, on the return statement of iter_move.

#if _HAS_CXX20
    _NODISCARD friend constexpr iter_rvalue_reference_t<_BidIt> iter_move(const reverse_iterator& _It)
        noexcept(is_nothrow_copy_constructible_v<_BidIt> && noexcept(_RANGES iter_move(--_STD declval<_BidIt&>()))) {
        auto _Tmp = _It.current;
        --_Tmp;
        return _RANGES iter_move(_Tmp);
    }

When using Swig 4.3.0, this is hit during the std::distance call within SwigPyIterator::distance, called through a chain of methods/objects ultimately from wrapping RunLogMap (although this error would likely occur for other similar invocations, this is just the first one).

template<typename OutIterator>
  class SwigPyIterator_T :  public SwigPyIterator

...
    ptrdiff_t distance(const SwigPyIterator &iter) const
    {
      const self_type *iters = dynamic_cast<const self_type *>(&iter);
      if (iters) {
	return std::distance(current, iters->get_current());
      } else {
	throw std::invalid_argument("bad iterator type");
      }
    }    
...

SWIGINTERN PyObject *_wrap_RunLogMap_rbegin(PyObject *self, PyObject *args) {

...
resultobj = SWIG_NewPointerObj(swig::make_output_iterator(static_cast< const std::map< unsigned int,flamegpu::RunLog >::reverse_iterator & >(result)),
    swig::SwigPyIterator::descriptor(),SWIG_POINTER_OWN);
  return resultobj;
...
}
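To make the instantiation chain easier to follow outside of the generated wrapper: the log goes through `std::advance` on a `std::reverse_iterator` over a `std::list` iterator. A standalone sketch of that kind of call, using a hypothetical `last_n_reversed` helper (this is not SWIG's `getslice`, and I have not confirmed that it reproduces the error with CUDA 12.0 + MSVC):

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

// Hypothetical helper, loosely modelled on the std::advance call in the log.
// Returns the last `count` elements of `values`, in reverse order.
template <typename T>
std::list<T> last_n_reversed(const std::list<T>& values, std::size_t count) {
    // Precondition: cannot take more elements than the list holds.
    assert(count <= values.size());
    auto first = values.rbegin();  // std::reverse_iterator over a list const_iterator
    auto last = first;
    // Instantiates std::advance for a reverse_iterator of a bidirectional iterator.
    std::advance(last, static_cast<std::ptrdiff_t>(count));
    return std::list<T>(first, last);
}
```

For example, calling it with the list `{1, 2, 3, 4}` and a count of 2 yields `{4, 3}`.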

As using a newer CUDA compiler does not encounter this issue, it's most likely a compiler / stdlib mismatch issue that swig is just exposing to us. We could:

- Not switch to c++20 😢
- Bump the minimum CUDA version on windows to 12.x for pyflamegpu support (also not ideal, but compiler bugs are compiler bugs).
  - CUDA 13.0 on CI is OK
  - CUDA 12.9 on CI is OK
  - CUDA 12.4 locally is OK
  - CUDA 12.3 locally is unhappy.
- Try some (probably horrible) workarounds (i.e. build the swig wrapped version in c++17 on windows with older CUDA versions, which for now at least should probably work as we don't return any c++20 objects and the same compiler should be getting used for ABI compat.)

Manually forcing -std=c++17 for pyflamegpu on windows with CUDA 12.0 when flamegpu was built with -std=c++20 does currently build, link and produce a functional pyflamegpu.

However, this is pretty dirty: it would mean we can't use any c++20 features in our include/ header files (without guarding them out so that swig doesn't see them), and apparently it's non-trivial to get CMake to set the flag correctly for that target with how c++20 is being set elsewhere...
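To illustrate what "guarding it out" could look like in a header: a minimal sketch, assuming a hypothetical `sum_values` function and `DEMO_CXX_LEVEL` macro (neither exists in FLAMEGPU). The language-level check hides the C++20 overload from translation units built as C++17, and the `SWIG` macro (which SWIG defines while it parses headers) hides it from the interface parser:

```cpp
#include <cstddef>

// MSVC only reports the real language level through __cplusplus when
// /Zc:__cplusplus is set, so prefer _MSVC_LANG where it is available.
#if defined(_MSVC_LANG)
#define DEMO_CXX_LEVEL _MSVC_LANG
#else
#define DEMO_CXX_LEVEL __cplusplus
#endif

#if DEMO_CXX_LEVEL >= 202002L && !defined(SWIG)
#include <span>
#endif

// C++17-compatible interface: always visible, including to SWIG.
constexpr int sum_values(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// C++20-only convenience overload: hidden from SWIG and from C++17 builds.
#if DEMO_CXX_LEVEL >= 202002L && !defined(SWIG)
constexpr int sum_values(std::span<const int> values) {
    return sum_values(values.data(), values.size());
}
#endif
```

The cost is the one described above: every C++20 feature in a public header needs a guard like this, plus a C++17 fallback for anything the wrapper relies on.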

ptheywood · 2025-09-16T12:58:05Z

Due to the MSVC + CUDA < 12.4 + -std=c++20 std::ranges issues described above (and the resulting CI failures), we have 2 real options:

- Don't upgrade to c++20 (:disappointed: but not the end of the world I suppose)
- Upgrade to c++20, but bump our minimum pyflamegpu on Windows support to CUDA >= 12.4
  - C++ with CUDA 12.0 would still be supported, just not pyflamegpu, which would mean 12.4 would be the minimum for our Windows wheels too. Windows users should keep their drivers more up to date than (HPC) Linux users.
  - 12.4 is also officially supported by recent MSVC, so this is probably the correct thing to do anyway, even though it changes our wheel support once again and complicates it w.r.t. Linux.

I'm becoming more and more inclined to just bump our windows pyflamegpu builds to 12.4 + 13.0 (or even 13.0 only) as so much time has been spent recently fighting Windows CI that maintaining support for older versions is becoming more and more of a time sink, and will probably just break at some random point in the future for no good reason.

ptheywood · 2025-09-16T14:30:37Z

For now, I've bumped windows wheel support to CUDA >= 12.4, and updated CI accordingly including an extra windows job checking non-python CUDA 12.0 support on windows.
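Spelled out as code, the support policy as of this comment could be sketched roughly as below. All names here are hypothetical (this is not taken from the FLAMEGPU build system), and the Linux wheel minimum of 12.0 is taken from the PR description:

```cpp
// Hypothetical names; a sketch of the support policy, not FLAMEGPU code.
enum class Platform { Linux, Windows };

struct CudaVersion {
    int major_v;
    int minor_v;
};

// True if v is at least the given major.minor version.
constexpr bool at_least(CudaVersion v, int major_v, int minor_v) {
    return v.major_v > major_v || (v.major_v == major_v && v.minor_v >= minor_v);
}

// C++ builds: CUDA 12.0+ on either platform.
// pyflamegpu builds: CUDA 12.0+ on Linux, CUDA 12.4+ on Windows.
constexpr bool is_supported(Platform platform, CudaVersion v, bool python) {
    if (python && platform == Platform::Windows) {
        return at_least(v, 12, 4);
    }
    return at_least(v, 12, 0);
}
```

So CUDA 12.0 on Windows passes for a C++-only build but not for pyflamegpu, which is what the extra non-python Windows CI job checks.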

ptheywood · 2025-09-16T16:43:28Z

I believe this is now ready, subject to a tweak and rebase after the visualisation PR is merged, but it is still blocked by #1150.

Unfortunately this does complicate our python wheel / support story which I had been trying to avoid.

I'm open to us only supporting a single CUDA version for python binary wheels on Windows, which would slightly simplify things (colab still requires CUDA 12 for linux (12.4 drivers, but on tesla hardware with 12.5 installed currently)).

I've manually tested c++ and python test suites on both linux and windows, and checked visualisation still runs on linux (C++ and python).

ptheywood force-pushed the cuda-13 branch from 799e3b5 to d362071 Compare August 7, 2025 12:48

ptheywood force-pushed the jitify2 branch from 8104d7e to d0c45b7 Compare August 7, 2025 12:52

ptheywood force-pushed the cuda-13 branch from d362071 to 193a5e6 Compare August 7, 2025 12:55

ptheywood force-pushed the cuda-13 branch 2 times, most recently from 4a8f7ae to 7cb7825 Compare August 18, 2025 14:15

ptheywood force-pushed the jitify2 branch from d0c45b7 to 326165f Compare September 2, 2025 10:40

ptheywood force-pushed the cuda-13 branch from a1a7274 to 4c92a90 Compare September 2, 2025 10:40

ptheywood force-pushed the cuda-13 branch from a082167 to 7e02522 Compare September 3, 2025 11:01

ptheywood force-pushed the cuda-13 branch 2 times, most recently from 281f8d7 to a60806e Compare September 3, 2025 13:56

ptheywood force-pushed the jitify2 branch from 326165f to 21b69f2 Compare September 4, 2025 12:40

ptheywood force-pushed the cuda-13 branch from a60806e to cc3db1c Compare September 4, 2025 12:41

ptheywood force-pushed the cuda-13 branch 2 times, most recently from 9e03b6c to b17512b Compare September 16, 2025 14:08

ptheywood mentioned this pull request Sep 16, 2025

Make use of c++20 features #1309

Open

5 tasks

ptheywood mentioned this pull request Sep 16, 2025

CUDA 13 and C++20 - FLAMEGPU/FLAMEGPU2-docs#187

Merged

ptheywood mentioned this pull request Sep 16, 2025

Replace RapidJSON with nlohmann::json #1277

Merged

ptheywood mentioned this pull request Sep 24, 2025

CMake 4 support #1315

Merged

6 tasks

ptheywood force-pushed the jitify2 branch from 21b69f2 to 449c0a5 Compare September 29, 2025 12:02

ptheywood mentioned this pull request Sep 29, 2025

Migrate to Jitify2 #1150

Merged

12 tasks

ptheywood force-pushed the cuda-13 branch from b17512b to e9dbf3a Compare September 29, 2025 13:26

Base automatically changed from jitify2 to master October 1, 2025 11:57

ptheywood added 25 commits October 1, 2025 12:59

CMake: Bump required CCCL to v3. This will implicitly remove CUDA 11 support

b7320ab

CMake: Do not re-find CCCL if tagets exist, as find_package(CCCL) is not quiet. Caches the most recently found version to emit a warning in case the minimum version has been increased.

4becd03

CUDA 13: find the cccl include directory for jitify/nvrtc use

ef3c1fb

CI: Add CUDA 13 to list of known windows installers

11cc482

CI: Install additional required CUDA >= 13 subpackages

01f1e6d

CI: Add CUDA 13 to 'regular' ci workflows

6674264

Remove CUDA 11.x from 'regular' CI

b026f22

CUDA 13: Suppress unrecognised pragma in device code warnings with CUDA 13.0 on Windows

7cf4961

CI: Update cuda versions in draft-release workflow to 12.0+

5ba90c9

CMake: increase minimum required CMake to 3.25.2 for CUDA C++20 suport

c6b5053

Readme: specify that a c++20 compiler is required

0535446

c++20: Switch from c++17 to c++20

36936d7

CI: Bump minimum gcc to gcc 10 for c++20

7bc7e52

c++20: suppress 'C++20 says that these are ambiguous, even though the second is reversed'

c656e2c

c++20: Use static_pointer_cast to fix a msvc c++20 compilation error in JSONStateReader.cu

013f774

c++20: Suppress GCC 12 -Wrestrict false-positives for appending a single char to a std::string

58f4ea3

c++20: Update minimum swig to 4.1.0 for c++20 compatibility

3644851

CMake: Ensure --expt-relaxed-constexpr is a pulic property as it is required by c++20 pyflamegpu

af4c713

Readme: CUDA version update

14091bf

CMake: CUDA 13.0 Update 1 does not require unrecongised pragma suppressions on Windows

70285fb

CMake uses the nvcc version, which is the same for 13.0 and 13.0 update 1, so the suppression must be applied for all 13.0 releases

CI: Add CUDA 13.0.1 to known list of windows CUDA verisons

978db01

Windows: (soft) Require CUDA >= 12.4 on windows for pyflamegpu support due to c++20 errors

46bb70c

CI Windows: Use CUDA >= 12.4 for python builds, and add an extra 12.0 job without python building

78d36b7

CMake: Bump Visualisation dependency to ee91d5601e846f9f06bc44608e96424b51dd5dd2 (merged cuda 13 support)

945a44a

Update cuda requirement to 12.0+ on linux and (soft) 12.4+ on Windows. Some message tweaks too

8436359

ptheywood force-pushed the cuda-13 branch from e9dbf3a to 8436359 Compare October 1, 2025 12:23

CI: Drop window CUDA 12.0 from CI. May work but unsupported

ca002cc

ptheywood merged commit e902356 into master Oct 1, 2025
57 checks passed

ptheywood deleted the cuda-13 branch October 1, 2025 14:01
