
Conversation

@mgorny mgorny commented Oct 6, 2025

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Fixes #424

I've also rebased the patches, since `git am` wasn't happy with them.

mgorny added 6 commits October 6, 2025 17:19
Signed-off-by: Michał Górny <[email protected]>
The patches did not apply cleanly with `git am`, so let's rebase them
to ease future updates.

Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
…5.10.06.11.31.54

Other tools:
- conda-build 25.9.0
- rattler-build 0.47.1
- rattler-build-conda-compat 1.4.6

Signed-off-by: Michał Górny <[email protected]>
conda-forge-admin commented Oct 6, 2025

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • ℹ️ The magma output has been superseded by libmagma-devel.
  • ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). This parser is not currently used by conda-forge, but may be in the future. We are collecting information to see which recipes are compatible with grayskull.
  • ℹ️ The recipe is not parsable by parser conda-recipe-manager. The recipe can only be automatically migrated to the new v1 format if it is parseable by conda-recipe-manager.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/18410815052. Examine the logs at this URL for more detail.

mgorny added 3 commits October 7, 2025 14:15
Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
mgorny commented Oct 7, 2025

Generic builds failed due to a typo in my patch (mea culpa!).

CUDA builds failed over the cudnn pin conflict:

Could not solve for environment specs
The following package could not be installed
└─ cudnn <9.11,(>=9.13.1.26,<10.0a0) * does not exist (perhaps a typo or a missing channel).

My understanding is that building against cudnn 9.13 creates a run-export pin that conflicts with the <9.11 constraint at test time. I see that we're listing cudnn as a "GPU requirement without run_exports", which no longer seems to be true.

So I guess we just need cudnn <9.11 in host dependencies and nothing in run dependencies, is that correct?

mgorny added 2 commits October 7, 2025 14:27
Signed-off-by: Michał Górny <[email protected]>
…5.10.07.10.53.53

Other tools:
- conda-build 25.9.0
- rattler-build 0.47.1
- rattler-build-conda-compat 1.4.6
@h-vetinari (Member)

So I guess we just need cudnn <9.11 in host dependencies and nothing in run dependencies, is that correct?

Yeah, that sounds right. Host will run-export >=9.10,<10, which is correct (at least for newer cards, until the 9.11+ builds are broken), but we still need the <9.11 for the test requirements too.
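
Roughly, that arrangement could look like this in meta.yaml (a minimal sketch, not the feedstock's actual recipe; the selector comment is an assumption):

    requirements:
      host:
        # cap at <9.11 while the 9.11+ builds are broken; the cudnn package's own
        # run_exports then adds the matching run constraint automatically
        - cudnn <9.11    # [cuda_compiler_version != "None"]
      run:
        # nothing explicit for cudnn here -- the run_exports from host covers it

    test:
      requires:
        # the run_exports lower bound alone would still allow 9.11+ in the test
        # environment, so repeat the cap here
        - cudnn <9.11    # [cuda_compiler_version != "None"]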

mgorny commented Oct 7, 2025

Thanks. I'm also trying to address #418 and waiting for a local build to finish before pushing here.

mgorny added 2 commits October 7, 2025 17:11
…5.10.07.10.53.53

Other tools:
- conda-build 25.9.0
- rattler-build 0.47.1
- rattler-build-conda-compat 1.4.6

Signed-off-by: Michał Górny <[email protected]>
mgorny commented Oct 8, 2025

Okay, so:

  1. Most of the builds failed over the pytorch-tests output not being allowed yet. Filed admin-requests#1697 ("Add pytorch-tests output for pytorch-cpu-feedstock") for that.

  2. Three builds failed over:

    + test '!' -f /home/conda/feedstock_root/build_artifacts/libtorch_1759859942001/_test_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placeho/lib/python3.13/site-packages/functorch/__pycache__/__init__.cpython-313.pyc
    

    I guess it's because of:

        python:                        3.13.7-h23354eb_100_cp313          conda-forge
        pytorch:                       2.8.0-cpu_generic_py313_haa53840_1 local      
        pytorch-tests:                 2.8.0-cpu_generic_py312_hbab034f_1 local      
    
  3. Windows CUDA failed over:

      - package libmagma-devel-2.9.0-h3ad809b_3 requires cuda-version >=12.9,<13, but none of the providers can be installed
    

    So I guess I should go back to plain magma here?

  4. The other Windows build failed over:

    %PREFIX%\Library\include\torch/extension.h(5): fatal error C1083: Cannot open include file: 'torch/all.h': No such file or directory
    

    I wonder if it's also related to the Python mismatch between pytorch and pytorch-tests.

  5. OSX failed over:

    The following package could not be installed
    └─ pytorch =2.8.0 cpu_generic_py313_h1234567_1 does not exist (perhaps a typo or a missing channel).
    

    I guess it doesn't like the exact=True pin (see the sketch below).
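
Since points 2 and 5 are related, here's roughly what the pytorch-tests run requirements could look like -- a sketch only, the actual recipe may differ:

      - name: pytorch-tests
        requirements:
          run:
            - python
            # exact=True renders to "pytorch <version> <build string>", and the
            # build string (e.g. the cpu_generic_py313_h1234567_1 from point 5)
            # embeds the Python version, so it would prevent the py312/py313
            # mix-up from point 2 -- but the solver then has to find that exact
            # build, which is what the OSX error is complaining about.
            - {{ pin_subpackage('pytorch', exact=True) }}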

mgorny added 3 commits October 8, 2025 11:59
This reverts commit 547a218.

Signed-off-by: Michał Górny <[email protected]>
…5.10.07.10.53.53

Other tools:
- conda-build 25.9.0
- rattler-build 0.47.1
- rattler-build-conda-compat 1.4.6

Signed-off-by: Michał Górny <[email protected]>
Signed-off-by: Michał Górny <[email protected]>
mgorny commented Oct 8, 2025

OK, so the Windows build is failing because it can't find torch/all.h, which is part of libtorch; apparently it'd need -I$PREFIX/Library/include/torch/csrc/api/include, while it gets:

-I%PREFIX%\Library\include -I%PREFIX%\lib\site-packages\torch\include -I%PREFIX%\lib\site-packages\torch\include\torch\csrc\api\include -I%PREFIX%\Include 

This works on Linux because we install a torch symlink into that site-packages directory, but we don't on Windows.

I see now that, while changing lib_include to conda_include in my patch, I didn't duplicate the other lib_include uses.
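
A rough illustration of the idea (not the actual patch; the prefix layout below is an assumption for the sketch):

    # torch/utils/cpp_extension.py builds the extension include path list from the
    # headers bundled under site-packages/torch/include; the conda patch points it
    # at the headers installed into the prefix by libtorch instead.
    import os
    import sys

    def conda_include_paths(prefix):
        # $PREFIX/Library/include on Windows, $PREFIX/include elsewhere
        conda_include = (os.path.join(prefix, "Library", "include")
                         if sys.platform == "win32"
                         else os.path.join(prefix, "include"))
        return [
            conda_include,
            # torch/extension.h does `#include <torch/all.h>`, and all.h lives under
            # torch/csrc/api/include -- this is the entry that was missing on Windows
            os.path.join(conda_include, "torch", "csrc", "api", "include"),
        ]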

Signed-off-by: Michał Górny <[email protected]>
mgorny commented Oct 9, 2025

Uh, I think we need to increase the timeouts for OSX pipelines, but I don't want to restart all jobs over that.

mgorny commented Oct 10, 2025

conda-forge.yml (outdated)
settings_linux:
timeoutInMinutes: 1
settings_osx:
timeoutInMinutes: 480
Member

There's a hard limit of 360 minutes on Azure. The only alternative to restarting (some runners are faster than others, but there's no way to select them) is to switch to the cirun setup.
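
Concretely, capping at that limit would look something like this in conda-forge.yml (a sketch; assuming the snippet above sits under the azure: key, as conda-smithy expects):

    azure:
      settings_osx:
        # Azure-hosted agents are hard-capped at 360 minutes; larger values won't buy more time
        timeoutInMinutes: 360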

Contributor Author

Uh, that's bad. Azure runners have been failing very often recently, and last time restarting didn't even help; the PR was stuck with the last two runners, which just kept timing out.

I see that we have a cirun-macos-m4-large, but I guess that's only for arm64 builds?

Member

Yes, it's bad. AFAIK there's no feedstock using cirun-macos-m4-large yet, not least due to conda-forge/conda-smithy#2324.

It's on the list of things I want to tackle, but that's unlikely to happen before November.

Contributor Author

And FWICS it's the x86 builds that are failing. I guess we could technically cross-compile from arm64 (and perhaps even use emulation for testing), but that sounds a little backwards.

mgorny commented Oct 10, 2025

Okay, I've reverted to the commit before the Azure changes. I see GH reset the action results after all, but the status was green everywhere except for the two macOS builds. I've cancelled the cirun jobs so as not to waste resources. Should we merge it as-is, or try to get Azure to pass 100% first?

@h-vetinari (Member)

Is this ready to merge? I don't think we've ever had a full run on the open gpu server, but I'm willing to take your word for it if you think it's ready. I think the osx builds can be solved with restarts (at least for now)

mgorny commented Oct 12, 2025

Is this ready to merge? I don't think we've ever had a full run on the open gpu server, but I'm willing to take your word for it if you think it's ready. I think the osx builds can be solved with restarts (at least for now)

I think so. I'm pretty sure this commit was all green on Linux and Windows, but unfortunately force-pushing erased the status.

@h-vetinari h-vetinari left a comment

Thank you!

@h-vetinari h-vetinari merged commit 9f6e138 into conda-forge:main Oct 12, 2025
29 of 40 checks passed
@mgorny mgorny deleted the prefix-fix branch October 13, 2025 09:02
mgorny commented Oct 13, 2025

Uh, looks like Windows package upload failed again.

mgorny commented Oct 13, 2025

I'm going to try the instructions from #409 (comment). Hopefully I have the permissions to do that.

mgorny commented Oct 13, 2025

Apparently not:

[ERROR] ('you do not have write privileges for package conda-forge/libtorch', 401)

h-vetinari commented Oct 13, 2025

Hopefully I have the permissions to do that. [...] Apparently not

If you've gotten as far as uploading the packages to your own channel, you can get to the finish line using something like conda-forge/admin-requests#1705 (only owners of the conda-forge channel can move packages there, which is cf/core, plus the bots).

mgorny commented Oct 13, 2025

Thanks, I'll do that. I was hoping there was some way of getting per-package access, but that works too.

mgorny commented Oct 13, 2025

Uh-oh, I just noticed that I've messed up the tests and they're now running for a single Python version only. Unlikely to break anything, given we've tested 2.8.0 before, but I need to fix this before the next PR. I guess a PR with a fix, then replacing it with a "skip ci" commit to avoid building new packages over that.

Development

Successfully merging this pull request may close these issues.

TypeError: unsupported operand type(s) for +: 'NoneType' and 'str' due to CONDA_PREFIX patches
