
Conversation

@mgorny mgorny commented Aug 7, 2025

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Signed-off-by: Michał Górny <[email protected]>

conda-forge-admin commented Aug 7, 2025

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • ℹ️ The magma output has been superseded by libmagma-devel.
  • ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). This parser is not currently used by conda-forge, but may be in the future. We are collecting information to see which recipes are compatible with grayskull.
  • ℹ️ The recipe is not parsable by parser conda-recipe-manager. The recipe can only be automatically migrated to the new v1 format if it is parseable by conda-recipe-manager.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/17876818413. Examine the logs at this URL for more detail.

mgorny commented Aug 8, 2025

Ok, so issues so far:

  1. Missing pyyaml test dependency.
  2. A few tests are segfaulting in CUDA builds.
  3. Windows can't find Python:
    Could NOT find Python3: Found unsuitable major version ".=", but required
    major version is exact version "3"
    

@h-vetinari
Member

The release notes make it sound like we'll need to double-check nvtx support as well

A downstream project using -DUSE_SYSTEM_NVTX will not be able to find NVTX3 or torch::nvtx3 via PyTorch's cmake/public/cuda.cmake. The downstream project now needs to explicitly find NVTX3 and torch::nvtx3 by implementing the same logic in PyTorch's cmake/Dependences.cmake.

mgorny commented Aug 11, 2025

The release notes make it sound like we'll need to double-check nvtx support as well

A downstream project using -DUSE_SYSTEM_NVTX will not be able to find NVTX3 or torch::nvtx3 via PyTorch's cmake/public/cuda.cmake. The downstream project now needs to explicitly find NVTX3 and torch::nvtx3 by implementing the same logic in PyTorch's cmake/Dependences.cmake.

From what I understand, this means checking reverse dependencies.

mgorny commented Aug 11, 2025

Wait, I'm reading the output wrong. Investigating further.

mgorny commented Aug 11, 2025

Okay, I've learned more about the Windows shell than I wanted to know, and I suspect delayed expansion did not work as expected. I've tried replacing %PY_VERSION_FULL% with !PY_VERSION_FULL!, and apparently that works, at least in Wine's implementation of cmd.
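
For reference, here is a minimal sketch of the cmd behaviour at play (a hypothetical script, not the feedstock's actual bld.bat; the version string is illustrative):

    @echo off
    setlocal enabledelayedexpansion
    set "PY_VERSION_FULL="

    rem Inside a parenthesised block, %VAR% is substituted when the whole
    rem block is parsed, so a value assigned within the block is not seen;
    rem !VAR! is expanded when the line executes and picks up the new value.
    if 1==1 (
        set "PY_VERSION_FULL=3.13.5"
        echo percent expansion: [%PY_VERSION_FULL%]
        echo delayed expansion: [!PY_VERSION_FULL!]
    )

The percent form prints an empty value while the delayed form prints 3.13.5, which is consistent with the empty version components behind the "Found unsuitable major version" error quoted above.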

@h-vetinari
Member

Haha, the win+CUDA build manages to blow through the disk space

error: [Errno 28] No space left on device

despite using the largest runner already

- cirun-azure-windows-4xlarge # [win]

But perhaps the "size" of that runner is only measured in CPUs/RAM, not storage? Could we extend that @aktech @wolfv?

mgorny commented Aug 12, 2025

Well, at least the Python version issue was fixed. Also, looks like my idea of using pytest --forked won't work for CUDA tests.

mgorny commented Aug 13, 2025

Le sigh, I've fetched the artifact and couldn't reproduce the segfaults locally. But I've noticed that our openblas+openmp constraint didn't work anymore. Let's try again.

By the way, I'm wondering if we should perhaps skip CPU tests in CUDA builds. I see they're some of the longest tests in CI, and I suppose it's sufficient that we test them in CPU builds.

@h-vetinari
Member

But I've noticed that our openblas+openmp constraint didn't work anymore.

Probably related to #407, which unfortunately isn't green either.

@mgorny mgorny force-pushed the v2.8.0 branch 2 times, most recently from 8e57db5 to 9ce5214 on September 15, 2025 13:46

mgorny commented Sep 15, 2025

Uh, so I guess CMake 4 breaks AArch64 builds? I'll try debugging that locally.

mgorny commented Sep 15, 2025

BTW:

-------------------------------------------------------------------------------------------------
|                                                                                               |
|            WARNING: we strongly recommend enabling linker script optimization for ARM + CUDA. |
|            To do so please export USE_PRIORITIZED_TEXT_FOR_LD=1                               |
|                                                                                               |
-------------------------------------------------------------------------------------------------

Should we do that?

@hmaarrfk
Contributor

I thought we had a workaround with a comment in our script for this

@h-vetinari
Member

Windows+CUDA failing with

error: could not write to 'build\bdist.win-amd64\wheel\.\torch\lib\XNNPACK.lib': No space left on device

@aktech @wolfv, could we increase disk space on the windows agents?

@h-vetinari
Member

linux-64 has a single test failure that looks like a minor tolerance violation

2025-09-16T06:05:47.2890235Z =================================== FAILURES ===================================
2025-09-16T06:05:47.2891936Z _____________________ TestNN.test_layer_norm_backwards_eps _____________________
2025-09-16T06:05:47.2893254Z [gw0] linux -- Python 3.10.18 $PREFIX/bin/python3.10
2025-09-16T06:05:47.2893833Z 
2025-09-16T06:05:47.2894737Z self = <test_nn.TestNN testMethod=test_layer_norm_backwards_eps>
2025-09-16T06:05:47.2895403Z 
2025-09-16T06:05:47.2895755Z     @unittest.skipIf(not TEST_CUDA, "CUDA not available")
2025-09-16T06:05:47.2896595Z     def test_layer_norm_backwards_eps(self):
2025-09-16T06:05:47.2897281Z         dtype = torch.float
2025-09-16T06:05:47.2897894Z         m_x_n_list = [(3, 3), (5, 5), (11, 11), (55, 55),
2025-09-16T06:05:47.2898602Z                       (32, 32), (1024, 32), (1024, 1024),
2025-09-16T06:05:47.2899261Z                       (33, 33), (1025, 33), (1025, 1025),
2025-09-16T06:05:47.2899924Z                       (128 * 1024, 32), (32, 128 * 1024)]
2025-09-16T06:05:47.2900577Z         boolean = [True, False]
2025-09-16T06:05:47.2901311Z         combinations = itertools.product(boolean, repeat=2)
2025-09-16T06:05:47.2902177Z         for elementwise_affine, bias in combinations:
2025-09-16T06:05:47.2903096Z             for m, n in m_x_n_list:
2025-09-16T06:05:47.2903887Z                 x = torch.randn((m, n), dtype=dtype, requires_grad=True)
2025-09-16T06:05:47.2904716Z                 grad_output = torch.rand_like(x)
2025-09-16T06:05:47.2905526Z                 x_cuda = x.clone().detach().to("cuda").requires_grad_()
2025-09-16T06:05:47.2906461Z                 grad_output_cuda = grad_output.clone().detach().to("cuda")
2025-09-16T06:05:47.2907629Z                 ln = nn.LayerNorm(n, dtype=dtype, elementwise_affine=elementwise_affine, bias=bias)
2025-09-16T06:05:47.2909298Z                 ln_cuda = nn.LayerNorm(n, device="cuda", dtype=dtype, elementwise_affine=elementwise_affine, bias=bias)
2025-09-16T06:05:47.2910513Z                 ln_out = ln(x)
2025-09-16T06:05:47.2911121Z                 ln_out_cuda = ln_cuda(x_cuda)
2025-09-16T06:05:47.2911819Z                 ln_out.backward(grad_output)
2025-09-16T06:05:47.2912699Z                 ln_out_cuda.backward(grad_output_cuda)
2025-09-16T06:05:47.2913429Z                 if elementwise_affine:
2025-09-16T06:05:47.2914663Z >                   self.assertEqual(ln.weight.grad, ln_cuda.weight.grad, f"weight grad failed: {m=} {n=}", rtol=1e-4, atol=1e-4)
2025-09-16T06:05:47.2915985Z E                   AssertionError: Tensor-likes are not close!
2025-09-16T06:05:47.2916713Z E                   
2025-09-16T06:05:47.2917262Z E                   Mismatched elements: 1 / 32 (3.1%)
2025-09-16T06:05:47.2918340Z E                   Greatest absolute difference: 0.00084686279296875 at index (0,) (up to 0.0001 allowed)
2025-09-16T06:05:47.2919781Z E                   Greatest relative difference: 0.0012425975874066353 at index (0,) (up to 0.0001 allowed)
2025-09-16T06:05:47.2920868Z E                   weight grad failed: m=131072 n=32
2025-09-16T06:05:47.2921525Z E                   
2025-09-16T06:05:47.2922239Z E                   To execute this test, run the following from the base repo dir:
2025-09-16T06:05:47.2923340Z E                       python test/test_nn.py TestNN.test_layer_norm_backwards_eps
2025-09-16T06:05:47.2924306Z E                   
2025-09-16T06:05:47.2925119Z E                   This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2025-09-16T06:05:47.2925877Z 
2025-09-16T06:05:47.2926123Z test/test_nn.py:7238: AssertionError
2025-09-16T06:05:47.5812833Z ============================= slowest 50 durations =============================

mgorny commented Sep 16, 2025

I thought we had a workaround with a comment in our script for this

Ah, sorry, indeed, I was doing a non-CUDA build and didn't notice it's there for CUDA.

aktech commented Sep 16, 2025

@aktech @wolfv, could we increase disk space on the windows agents?

https://docs.cirun.io/reference/yaml#custom-disk-size-for-azure

The Windows runners' configuration needs to be updated with:

    extra_config:
      storageProfile:
        osDisk:
          diskSizeGB: 512

…5.09.18.12.55.27

Other tools:
- conda-build 25.7.0
- rattler-build 0.47.0
- rattler-build-conda-compat 1.4.6
Signed-off-by: Michał Górny <[email protected]>
Thanks to @aktech for the suggestion.

Signed-off-by: Michał Górny <[email protected]>

mgorny commented Sep 20, 2025

Uh, I accidentally rerendered over the Windows fix 🤦.

Let me read up on #413, in case I should change something before restarting.

@h-vetinari h-vetinari left a comment

Thanks a lot for the persistence on this one! ❤️

Discussion about the pybind situation in #413 is still ongoing with the pybind maintainers, so a rebuild for v3 (or removing the dependence on pybind-abi completely) can be done in a follow-up.

mgorny commented Sep 21, 2025

Thanks a lot for the persistence on this one! ❤️

No problem. I'm sorry it took this long; I made more mistakes than I should have, notably failing to pin run dependencies early on, which would have saved me a lot of subsequent testing.

There's also the open question on how to deal with cudnn. I'm not even sure if this is something to report to PyTorch or to NVIDIA.

@RoyiAvital

@mgorny, appreciate your effort here on behalf of us PyTorch on Windows users.
I hope the next ones (PyTorch 2.9 is one month away) are much easier to build.

@h-vetinari h-vetinari merged commit 034ea64 into conda-forge:main Sep 21, 2025
31 of 32 checks passed

@h-vetinari h-vetinari left a comment

There's also the open question on how to deal with cudnn. I'm not even sure if this is something to report to PyTorch or to NVIDIA.

I think we could start with an issue on the cudnn feedstock, to at least write down the things you remember from that debugging session somewhere before it becomes just a haze. 😅

@h-vetinari
Member

Windows CUDA builds are still failing to upload, so this was done manually from the artefacts:

$ gh run download 17899219922 --repo conda-forge/pytorch-cpu-feedstock --name conda_artifacts_17899219922_win_64_channel_targetsconda-forge_maincu_hca575dce
$ unzip pytorch-cpu-feedstock_conda_artifacts_.zip
$ cd bld/win-64
$ rm current_repodata.json index.html repodata*
$ ls
libtorch-2.8.0-cuda128_mkl_ha34d6f4_300.conda       pytorch-2.8.0-cuda128_mkl_py312_h0850830_300.conda
pytorch-2.8.0-cuda128_mkl_py310_h0b8c608_300.conda  pytorch-2.8.0-cuda128_mkl_py313_hf206996_300.conda
pytorch-2.8.0-cuda128_mkl_py311_hd9a8a8a_300.conda  pytorch-gpu-2.8.0-cuda128_mkl_h2fd0c33_300.conda
$ ls | xargs anaconda upload
$ DELEGATE=h-vetinari
PACKAGE_VERSION=2.8.0
for package in libtorch pytorch pytorch-gpu; do
  anaconda copy --from-label main --to-label main --to-owner conda-forge ${DELEGATE}/${package}/${PACKAGE_VERSION}
done

mgorny commented Sep 22, 2025

Presumably we'll need to restart that one failed AArch64 build — but I don't see the rerun button right now, so I guess it'll only appear when the other job is finished. Not sure if I can rerun it without rerunning the Windows build though.

@h-vetinari
Member

Not sure if I can rerun it without rerunning the Windows build though.

Just don't use the run-wide restart. It's possible to restart a single job.

Zalnd commented Sep 23, 2025

Hi all, I'm unsure if this is the right place to report this, but there's an issue with the file pytorch-2.8.0-cpu_mkl_py311_h98f00f5_100.conda in this release.

InvalidArchiveError("Error with archive C:\Anaconda\pkgs\pytorch-2.8.0-cpu_mkl_py311_h98f00f5_100.conda. You probably need to delete and re-download or re-create this file. Message was:\n\nfailed with erro)
