Conversation

@isuruf (Member) commented Jul 7, 2025

Not sure why MKL_INTERFACE_LAYER, used by libmkl_rt.so (linked to by numpy), would affect libmkl_intel_gf.so (linked to by pytorch). It didn't use to. Might be a bug or a feature in newer MKL.

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

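For context, MKL_INTERFACE_LAYER is the documented way to select the interface (integer width and compiler convention) that the single dynamic runtime libmkl_rt.so dispatches to; the surprise here is that it apparently also influences libmkl_intel_gf.so. A hedged sketch of the setup being described (the value shown is one of the documented settings, not necessarily the one involved in this bug):

```shell
# MKL_INTERFACE_LAYER is only documented to affect the single dynamic
# runtime (libmkl_rt.so). "GNU,LP64" selects the GNU Fortran calling
# convention with 32-bit integers -- illustrative only.
export MKL_INTERFACE_LAYER="GNU,LP64"

# numpy (linked against libmkl_rt.so) and pytorch (linked against
# libmkl_intel_gf.so) then end up in the same process, where the
# variable would be visible to both libraries:
python -c "import numpy, torch" 2>/dev/null || echo "numpy/torch not installed here"
```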
@conda-forge-admin (Contributor) commented Jul 7, 2025

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • ℹ️ The recipe is not parsable by parser conda-souschef (grayskull). This parser is not currently used by conda-forge, but may be in the future. We are collecting information to see which recipes are compatible with grayskull.
  • ℹ️ The recipe is not parsable by parser conda-recipe-manager. The recipe can only be automatically migrated to the new v1 format if it is parseable by conda-recipe-manager.

This message was generated by GitHub Actions workflow run https://github.com/conda-forge/conda-forge-webservices/actions/runs/16190507878. Examine the logs at this URL for more detail.

@h-vetinari (Member)

Thanks a lot for the fix! I presume this closes #398. I think we should prioritize this over #397, so that people who can't yet use the newer abseil/etc. still get the fix ASAP.

@h-vetinari (Member)

> Might be a bug/feature in newer MKL.

Something is very strange about this, though: our current pytorch builds have an mkl >=2024.2.2,<2025.0a0 constraint, so the change must have come from somewhere else, because mkl 2024 hasn't had any new build recently.
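To spell out what that pin admits, here is a quick sketch using Python's packaging module (which applies PEP 440 semantics rather than conda's own match specs, but agrees on this simple range):

```python
# Illustration of the version window in the pin mkl >=2024.2.2,<2025.0a0,
# using packaging's PEP 440 semantics as a stand-in for conda's matcher.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

pin = SpecifierSet(">=2024.2.2,<2025.0a0")

print(Version("2024.2.2") in pin)  # inside the window
print(Version("2025.0") in pin)    # excluded: 2025.0a0 sorts before 2025.0
```

The `<2025.0a0` upper bound is the usual conda-forge idiom for "anything in the 2024 series, including point releases, but no 2025 builds".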

@mgorny (Contributor) left a comment

Thanks, makes sense. The upstream code seems terribly fragile. Apparently it checks whether cblas_sdot() is precise enough, decides that it isn't, and instead uses a fallback that's even less precise?!
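For what it's worth, the kind of check being described is plausible in principle: a single-precision dot product loses accuracy through accumulation. A small numpy sketch (not the upstream code) showing naive float32 accumulation drifting from a float64 reference:

```python
# Not the upstream check -- just a demonstration that naive float32
# accumulation of a dot product drifts from a float64 reference, which
# is the sort of thing a cblas_sdot precision probe would look for.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.random(n, dtype=np.float32)
y = rng.random(n, dtype=np.float32)

# Double-precision reference result
ref = np.dot(x.astype(np.float64), y.astype(np.float64))

# Naive sequential float32 accumulation (roughly what a plain sdot does;
# np.dot itself may use pairwise summation or extended precision)
acc = np.float32(0.0)
for xi, yi in zip(x, y):
    acc = np.float32(acc + np.float32(xi) * np.float32(yi))

print(abs(float(acc) - ref) / abs(ref))  # relative error, well above float64 rounding
```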

@h-vetinari (Member)

Like #396, this is still running into pytorch/pytorch#153737.

Background: In the context of this failure happening in #393, I had found the unmerged pytorch/pytorch#127702, which didn't change anything, and we eventually did conda-forge/triton-feedstock#52; apparently that wasn't a complete fix.

@h-vetinari mentioned this pull request Jul 9, 2025
@h-vetinari (Member)

Interestingly, this is still not enough to fix the linalg issues:

+ python -c 'import numpy as np;import torch;x = torch.tensor([2], dtype=torch.complex128);assert torch.dot(x, x).real == 4.0'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
    import numpy as np;import torch;x = torch.tensor([2], dtype=torch.complex128);assert torch.dot(x, x).real == 4.0
                                                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

@Tobias-Fischer (Contributor)

Should we use np.isclose() instead of == maybe?
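For reference, the tolerance-based comparison being suggested could look like the following sketch (numpy shown so it runs without torch; torch.isclose would be the direct analogue for tensors):

```python
# Sketch of the suggested tolerance-based check. torch.isclose is the
# direct analogue of np.isclose for torch tensors.
import numpy as np

x = np.array([2], dtype=np.complex128)
result = np.dot(x, x)  # 4+0j for this input

# Exact equality (==) is brittle for floating point in general;
# isclose compares within a relative/absolute tolerance instead.
assert np.isclose(result.real, 4.0)
```

That said, for this particular input the exact result is representable, so the failing assertion above points at a genuine numerical bug rather than mere rounding noise.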

@h-vetinari (Member)

> Interestingly, this is still not enough to fix the linalg issues:

Brown-paper-bag moment: when regenerating the patches, I unintentionally reverted Isuru's changes 🤦 Apologies!

@h-vetinari merged commit d0db8fd into conda-forge:main Jul 11, 2025
26 of 27 checks passed
@h-vetinari (Member)

Pushed a missing build number bump: 2a435a7

@h-vetinari (Member) commented Jul 13, 2025

Documenting the Windows copy operation (after ensuring that old uploads in my channel are not under the main label). Filename & ID from here.

$ gh run download 16209749249 --repo conda-forge/pytorch-cpu-feedstock --name conda_artifacts_16209749249_win_64_channel_targetsconda-forge_maincu_hca575dce
$ unzip pytorch-cpu-feedstock_conda_artifacts_.zip
$ cd bld/win-64
$ rm current_repodata.json index.html repodata*
$ ls
libtorch-2.7.1-cuda128_mkl_hca3f899_302.conda       pytorch-2.7.1-cuda128_mkl_py313_h46e6c8c_302.conda
pytorch-2.7.1-cuda128_mkl_py310_h124cda0_302.conda  pytorch-2.7.1-cuda128_mkl_py39_hecd1aea_302.conda
pytorch-2.7.1-cuda128_mkl_py311_h4f3c550_302.conda  pytorch-gpu-2.7.1-cuda128_mkl_h2fd0c33_302.conda
pytorch-2.7.1-cuda128_mkl_py312_hc8193e8_302.conda
$ ls | xargs anaconda upload
$ DELEGATE=h-vetinari
PACKAGE_VERSION=2.7.1
for package in libtorch pytorch pytorch-gpu; do
  anaconda copy --from-label main --to-label main --to-owner conda-forge ${DELEGATE}/${package}/${PACKAGE_VERSION}
done
