@zachlewis
Collaborator

Re: #4854

Currently, we use cibuildwheel's test-command to both generate new type stubs and validate existing type stubs.

Typically, this results in the generation of {dist_path}/OpenImageIO/__init__.pyi, where {dist_path} is the directory in which the generated .whl lives.

This is fine -- necessary, even -- for locally generating the python stubs, but the addition of an "OpenImageIO" directory causes problems for our "upload_pypi" github task.

This PR is a workaround, which forces the stub generation script to use $PWD as the output path (instead of {dist_path}), which should be suitable for stub validation purposes.

In the future, we will have to revisit this, if we decide to replace the local python-stubs generation process with an automated CI-based process.

Signed-off-by: Zach Lewis <[email protected]>
@zachlewis
Collaborator Author

@chadrik -- does this make sense to you?

@lgritz
Collaborator

lgritz commented Aug 16, 2025

Is this ready to merge? I have no idea how to judge if it's gonna fix the broken upload to pypi.

@chadrik
Contributor

chadrik commented Aug 16, 2025

Here's my understanding of the problem:

  • The upload_pypi job first downloads the artifacts that were uploaded by the actions/upload-artifact step of each preceding cibw build job.
  • However, ./wheelhouse/OpenImageIO/__init__.pyi is now uploaded alongside the wheels in ./wheelhouse, which trips up the upload.

I believe the change proposed here would break the feature that allows developers to download the __init__.pyi artifact in the case that they want to review the results, particularly if there's been an unexpected change. cibuildwheel spins up a container to perform the build and test it, and in order to get files out of the container they must be written to /output within the container. This is why the cibuildwheel test command calls generate_stubs.py --out-path '/output'. If the output path is redirected to CWD instead of the /output folder, then the stubs won't make it out of the container into the wheelhouse/ folder and the result won't be uploaded for review.
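
To make the path relationship concrete, here is a minimal sketch of how an `--out-path` argument could map to the stub location. It is not the project's actual generate_stubs.py: `stub_destination` and `parse_args` are hypothetical names, the default value is an assumption, and only the `--out-path` flag and the `OpenImageIO/__init__.pyi` layout come from the discussion.

```python
import argparse
from pathlib import PurePosixPath


def stub_destination(out_path: str) -> PurePosixPath:
    """Return where the stub would land for a given --out-path (hypothetical helper)."""
    return PurePosixPath(out_path) / "OpenImageIO" / "__init__.pyi"


def parse_args(argv):
    # Mirrors only the --out-path flag mentioned in the thread.
    # The "." default (current directory) is an assumption for illustration.
    parser = argparse.ArgumentParser(description="stub generation sketch")
    parser.add_argument("--out-path", default=".")
    return parser.parse_args(argv)


# Inside the cibuildwheel container, /output is the directory that is copied out.
args = parse_args(["--out-path", "/output"])
print(stub_destination(args.out_path))

# With no flag, the stub stays relative to the working directory instead.
print(stub_destination(parse_args([]).out_path))
```

Under these assumptions, only the first form produces a file that ends up in the wheelhouse/ folder on the runner.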

The download-artifact step of upload_pypi looks for certain upload ids by filtering on cibw-*:

      - uses: actions/download-artifact@fa0a91b85d4f404e444e00e005971372dc801d16 # v4.1.8
        with:
          pattern: cibw-*
          path: dist
          merge-multiple: true

So a possible alternative that will hopefully solve the pypi upload problem without collateral damage would be to place the stubs in a separate upload:

      - uses: actions/upload-artifact@6f51ac03b9356f520e9adb1b1b7802705f340c2b # v4.5.0
        with:
          name: cibw-wheels-${{ matrix.python }}-${{ matrix.manylinux }}
          path: |
            ./wheelhouse/*.whl

      - uses: actions/upload-artifact@6f51ac03b9356f520e9adb1b1b7802705f340c2b # v4.5.0
        with:
          name: stubs-${{ matrix.python }}-${{ matrix.manylinux }}
          path: |
            ./wheelhouse/OpenImageIO/__init__.pyi
          # if stub validation fails we want to upload the stubs for users to review.
          # keep the python build in sync with the version specified in tool.cibuildwheel.overrides section of pyproject.toml
          if: always() && ${{ matrix.python =~ 'cp311-manylinux_.*64' }}

Note the conditional is untested so it requires vetting.
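
A rough way to sanity-check the idea offline is sketched below. It approximates the `cibw-*` download pattern with Python's glob matching (the action's own matcher may differ in edge cases) and applies the regex from the proposed conditional to a build identifier. The artifact names and the `matrix.python` / `matrix.manylinux` values are hypothetical, as are the function names.

```python
import re
from fnmatch import fnmatchcase


def selected_for_pypi(artifact_names, pattern="cibw-*"):
    """Keep only the artifact names that the download pattern would match."""
    return [name for name in artifact_names if fnmatchcase(name, pattern)]


def wants_stub_upload(python_id, regex=r"cp311-manylinux_.*64"):
    """Apply the regex from the proposed conditional to a whole build identifier."""
    return re.fullmatch(regex, python_id) is not None


# Hypothetical artifact names, following the naming scheme proposed above.
artifacts = [
    "cibw-wheels-cp311-manylinux_x86_64-manylinux_2_28",
    "stubs-cp311-manylinux_x86_64-manylinux_2_28",
]
print(selected_for_pypi(artifacts))
print(wants_stub_upload("cp311-manylinux_x86_64"))
print(wants_stub_upload("cp312-manylinux_x86_64"))
```

Because the stubs artifact no longer starts with `cibw-`, it would be skipped by the upload_pypi download. On the conditional itself: as far as I know, GitHub Actions expressions have no regex operator such as `=~`, so the same intent would probably need `startsWith(...)` and `endsWith(...)`; that is worth confirming when vetting it.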

@lgritz
Collaborator

lgritz commented Aug 22, 2025

Where are we on this?

Also, I think something additional has broken in the last two days, now I'm getting wheel building failures on Mac only, that I don't understand. If I do test pushes on my branch, it is failing at the same commit that succeeded a couple days ago, so I'm assuming maybe something changed on the runners or with some dependency? You can see the logs here:
https://github.com/AcademySoftwareFoundation/OpenImageIO/actions/workflows/wheel.yml

@zachlewis
Collaborator Author

Oh, whoops -- I didn't see that Chad had replied. His solution is more robust, so lemme implement it, and then I'll have a look at those logs and I'll see if I can figure out what's going on with the mac wheels build.

@zachlewis
Collaborator Author

Looking at the logs, it seems the macos runners were experiencing a fatal error trying to pull from https://gitlab.com/libtiff/libtiff.git -- that's gotta be a temporary hiccup. Let's see if the problem persists...

@lgritz
Collaborator

lgritz commented Aug 22, 2025

Why would problems with gitlab/libtiff only cause Mac Intel wheels to fail?

@lgritz
Collaborator

lgritz commented Aug 22, 2025

And also... why do you think that it's related to libtiff in the first place?

@zachlewis
Collaborator Author

I thought it was related to libtiff, because that's what this particular job seemed to indicate:
https://github.com/AcademySoftwareFoundation/OpenImageIO/actions/runs/17149761278/job/48653119058#step:4:793

I suspect it was an issue with a single Mac Intel job, which caused the rest of the Mac Intel jobs to shut down and report a failure as well.

Anyway, that no longer seems to be an issue.

But now we seem to be experiencing something else:

 cd /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/src/libOpenImageIO && /Applications/Xcode_15.2.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang++ -DEMBED_PLUGINS=1 -DOIIO_FREETYPE_VERSION=\"2.13.3\" -DOIIO_INTERNAL=1 -DOIIO_OPENEXR_CORE_DEFAULT=0 -DOIIO_PYTHON_VERSION=\"3.12.10\" -DOIIO_QT_VERSION=\"\" -DOIIO_TBB_VERSION=\"\" -DOIIO_USE_EXR_C_API=1 -DOpenColorIO_SKIP_IMPORTS -DOpenImageIO_EXPORTS -DUSE_FREETYPE=1 -DUSE_JPEG_TURBO=1 -DUSE_UHDR -DUSE_WEBP=1 -I/OpenEXR -I/var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/include/OpenImageIO -I/var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/include -I/var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/src/include -I/Users/runner/work/OpenImageIO/OpenImageIO/src/include -I/var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/include -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include/Imath -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include/OpenEXR -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include/libpng16 -isystem /Library/Frameworks/Mono.framework/Headers -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include/webp -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include/freetype2 -Os -DNDEBUG -std=c++17 -arch x86_64 -isysroot /Applications/Xcode_15.2.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk -mmacosx-version-min=11 -fPIC -fvisibility=hidden -Wall -Wformat -Wformat=2 -Werror -Wno-unused-function -Wno-overloaded-virtual -Wno-unneeded-internal-declaration -Wno-unused-private-field -Wno-tautological-compare -Qunused-arguments -Wunknown-warning-option -Wno-unused-local-typedefs 
-fno-math-errno -MD -MT src/libOpenImageIO/CMakeFiles/OpenImageIO.dir/__/include/OpenImageIO/detail/pugixml/pugixml.cpp.o -MF CMakeFiles/OpenImageIO.dir/__/include/OpenImageIO/detail/pugixml/pugixml.cpp.o.d -o CMakeFiles/OpenImageIO.dir/__/include/OpenImageIO/detail/pugixml/pugixml.cpp.o -c /Users/runner/work/OpenImageIO/OpenImageIO/src/include/OpenImageIO/detail/pugixml/pugixml.cpp
    [ 84%] Linking CXX shared library ../../lib/libOpenImageIO.dylib
    cd /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/src/libOpenImageIO && /usr/local/bin/cmake -E cmake_link_script CMakeFiles/OpenImageIO.dir/link.txt --verbose=1
    ld: Undefined symbols:
      OpenImageIO_v3_1_4_Imf__3_2_4::Chromaticities::Chromaticities(Imath_3_1::Vec2<float> const&, Imath_3_1::Vec2<float> const&, Imath_3_1::Vec2<float> const&, Imath_3_1::Vec2<float> const&), referenced from:
          OpenImageIO::v3_1_4::OpenEXROutput::put_parameter(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, OpenImageIO::v3_1_4::TypeDesc, void const*, OpenImageIO_v3_1_4_Imf__3_2_4::Header&) in exroutput.cpp.o
      OpenImageIO_v3_1_4_Imf__3_2_4::TypedAttribute<Imath_3_1::Box<Imath_3_1::Vec2<float>>>::TypedAttribute(Imath_3_1::Box<Imath_3_1::Vec2<float>> const&), referenced from:
          OpenImageIO::v3_1_4::OpenEXROutput::put_parameter(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, OpenImageIO::v3_1_4::TypeDesc, void const*, OpenImageIO_v3_1_4_Imf__3_2_4::Header&) in exroutput.cpp.o
...

Two suspicious things:

  1. It seems that Imath_3_1 is being referenced somehow, even though we're installing 3.2.4?!
  2. I see -I/OpenEXR up pretty early in the clang++ command, followed later by -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include/Imath -isystem /var/folders/vk/nx37ffx50hv5djclhltc26vw0000gn/T/tmptqmr978c/build/deps/dist/include/OpenEXR -- it makes me wonder if somehow clang is finding a different OpenEXR installed to the system level somewhere? Even if that were the case, I would think our IGNORE_HOMEBREWED_DEPS flag would exclude the system location from the search path....
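
One way to inspect a command like this without reading it by eye is to pull out the include directories programmatically. The sketch below is only an illustration: `include_dirs` is a hypothetical helper, and the sample command is a heavily abbreviated stand-in for the real one, with shortened paths. Note that clang searches all `-I` directories before any `-isystem` directory, so command-line order is not the whole story.

```python
import shlex


def include_dirs(command: str):
    """Return (flag, path) pairs for -I and -isystem, in command-line order."""
    tokens = shlex.split(command)
    found = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-I", "-isystem") and i + 1 < len(tokens):
            # Flag and path given as two separate tokens.
            found.append((tok, tokens[i + 1]))
            i += 2
            continue
        if tok.startswith("-I") and len(tok) > 2:
            # Flag and path fused into one token, e.g. -I/OpenEXR.
            found.append(("-I", tok[2:]))
        i += 1
    return found


# Abbreviated, hypothetical stand-in for the clang++ invocation in the log.
sample = (
    "clang++ -DUSE_FREETYPE=1 -I/OpenEXR -I/tmp/build/include "
    "-isystem /tmp/build/deps/dist/include/Imath "
    "-isystem /tmp/build/deps/dist/include/OpenEXR "
    "-isysroot /SDKs/MacOSX14.2.sdk -c pugixml.cpp"
)
for flag, path in include_dirs(sample):
    print(flag, path)
```

A bare `-I/OpenEXR` shows up first in the listing, which is consistent with the suspicion that a path variable expanded to an empty prefix somewhere, though the sketch cannot tell where.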

@lgritz
Collaborator

lgritz commented Aug 22, 2025

I think there are two openexr's on the system and it's mixing headers from one with library from another.

And a second look at the log indicates something weird with tiff, too. You can see that it first fails to find libtiff, then builds libtiff 4.6 locally, then re-does the find and thinks it gets 4.0.9! So I think that there, too, maybe there's a system install of an older version that's messing with things. But why did it all work until a couple days ago? Maybe they updated the Mac Intel runner images?

@lgritz
Collaborator

lgritz commented Aug 22, 2025

Oh, no, the openexr one may be a non-problem, and simpler.

It's using openexr 3.2 and imath 3.1, which is fine. Their versions are not kept in sync.
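
The version numbers in question can be read straight out of the versioned namespaces in the linker output. The following is a small sketch of that reading, assuming `Imf` is OpenEXR's namespace and `Imath` is Imath's; `namespace_versions` is a hypothetical helper, and the sample lines are shortened from the logs in this thread.

```python
import re
from collections import defaultdict

# Matches e.g. "Imath_3_1" or "Imf__3_2_4" (one or more underscores before the digits).
VERSIONED_NS = re.compile(r"(Imath|Imf)_+(\d+(?:_\d+)*)")


def namespace_versions(text: str):
    """Map each library namespace to the sorted list of versions seen in the text."""
    versions = defaultdict(set)
    for library, digits in VERSIONED_NS.findall(text):
        versions[library].add(digits.replace("_", "."))
    return {library: sorted(found) for library, found in versions.items()}


oiio_line = (
    "OpenImageIO_v3_1_4_Imf__3_2_4::Chromaticities::Chromaticities("
    "Imath_3_1::Vec2<float> const&)"
)
print(namespace_versions(oiio_line))
```

Seeing OpenEXR 3.2.x next to Imath 3.1 is therefore not an inconsistency on its own, since the two projects version independently.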

@zachlewis
Collaborator Author

> And a second look at the log indicates something weird with tiff, too. You can see that it first fails to find libtiff, then builds libtiff 4.6 locally, then re-does the find and thinks it gets 4.0.9! So I think that there, too, maybe there's a system install of an older version that's messing with things. But why did it all work until a couple days ago? Maybe they updated the Mac Intel runner images?

This could be a red herring. I'm pretty sure I've come across this before... I could be mistaken, but I think that's a bug with how we're reporting the re-found library version, and that it doesn't actually reflect the version of the library that was just autobuilt -- specifically, if a too-early version of the library is found and we need to build a newer version, somewhere, the logs still claim the older version is being built. But I think the variable is ultimately being set correctly by the time the refind has completed. I think.

(There IS something weird with how we're building TIFF and libdeflate, but I haven't been able to put my finger on exactly what; but it causes problems if you try to reuse autobuilt libraries for incremental builds of OIIO...)

> I think there are two openexr's on the system and it's mixing headers from one with library from another.

Hmm.

@lgritz
Collaborator

lgritz commented Aug 29, 2025

Reminder that the next scheduled 3.0 release is Monday, and as far as I can tell, we are not building Intel Mac wheels correctly and I'm not sure the upload-to-PyPI works either.

@zachlewis
Collaborator Author

Hey, I apologize for not working out a fix in time for the 3.0.10 release -- I hope to put this bad boy to bed some time this week. If you're cool with releasing a 3.0.10.1 hotfix, we should be able to verify that all python wheely stuff works as expected before the 3.1 release in twelve days.

At this moment, it's really not clear to me why or how things have changed with the intel mac runners, or why the system OpenEXR 3.1 headers are being found and prioritized over the auto-built 3.2 headers. In a worst-case scenario, we might have to roll back OpenEXR to 3.1 for the intel mac wheels, just to get things out the door... but I'd obviously rather fix this properly.

@lgritz
Collaborator

lgritz commented Sep 2, 2025

Of course, we can do a tweak release any time we need to fix this.

@lgritz
Collaborator

lgritz commented Sep 12, 2025

@zachlewis What do you think we should do here?

Even disabling Intel Mac wheels entirely, while very much not ideal, is better than the status quo, which is that we haven't been publishing any python wheels for the last 2 or 3 patch releases.

@zachlewis
Collaborator Author

Today is the day where I figure out wtf is going on here and get stuff working.

Agreed that disabling the intel mac wheels entirely is preferable to not releasing any wheels...

@zachlewis
Collaborator Author

...okay, tomorrow's the day where I figure out wtf is going wrong.

I've whittled things down to a place where now the problem seems to be a CMake-4 thing happening with one of the OCIO dependencies, yaml-cpp... and I know we have an auto-builder recipe for yaml-cpp, so I'll be looking into that next...

It's very very strange that these problems are suddenly occurring in the first place...

@zachlewis zachlewis force-pushed the fix_python_wheels_pypi_release branch 2 times, most recently from 4ba54a0 to a92ea7f Compare September 13, 2025 18:39
@lgritz lgritz added build / testing / port / CI Affecting the build system, tests, platform support, porting, or continuous integration. python Python APIs labels Sep 13, 2025
@zachlewis zachlewis force-pushed the fix_python_wheels_pypi_release branch from 6bc2425 to 8688560 Compare September 14, 2025 19:36
@zachlewis zachlewis force-pushed the fix_python_wheels_pypi_release branch 3 times, most recently from b57c7f9 to 557c3d3 Compare September 16, 2025 11:43
- revert to deployment target 10.15
- don't use CIBW_ENVIRONMENT to find the C / CXX compiler
- unset CMAKE_FIND_FRAMEWORK=NEVER
- unset NO_SYSTEM_FROM_IMPORTED=ON

Signed-off-by: Zach Lewis <[email protected]>
This reverts commit 557c3d3.

Signed-off-by: Zach Lewis <[email protected]>
hopefully, the improved IGNORE_HOMEBREWED_DEPS business will take care of needing to manually uninstall homebrewed stuff in a CI step, but it's not the end of the world if we have to revert this commit to get stuff to work.

Signed-off-by: Zach Lewis <[email protected]>
@zachlewis zachlewis force-pushed the fix_python_wheels_pypi_release branch from 785579d to 50064a4 Compare September 16, 2025 14:53
@zachlewis
Collaborator Author

The plot has thickened: OCIO is having an extremely similar problem building the MacOS Intel wheels against OpenEXR:

FAILED: [code=1] /Users/runner/work/OpenColorIO/OpenColorIO/build/lib.macosx-10.9-x86_64-cpython-310/PyOpenColorIO/bin/ocioconvert
    : && /usr/bin/c++ -O3 -DNDEBUG -arch x86_64 -mmacosx-version-min=10.13 -Wl,-search_paths_first -Wl,-headerpad_max_install_names src/apps/ocioconvert/CMakeFiles/ocioconvert.dir/main.cpp.o -o /Users/runner/work/OpenColorIO/OpenColorIO/build/lib.macosx-10.9-x86_64-cpython-310/PyOpenColorIO/bin/ocioconvert  -Wl,-rpath,@loader_path -Wl,-rpath,@loader_path/..  src/libutils/oglapphelpers/libOpenColorIOoglapphelpers.a  src/apputils/libapputils.a  src/libutils/imageioapphelpers/libOpenColorIOimageioapphelpers.a  /Users/runner/work/OpenColorIO/OpenColorIO/build/lib.macosx-10.9-x86_64-cpython-310/PyOpenColorIO/libOpenColorIO.dylib  -framework OpenGL  -framework GLUT  -framework Cocoa  -framework ColorSync  -framework CoreFoundation  -framework CoreGraphics  -framework IOKit  -framework Metal  -framework CoreVideo  ext/dist/lib/libpystring.a  ext/dist/lib/libOpenEXR-3_1.a  ext/dist/lib/libImath-3_1.a  ext/dist/lib/libIlmThread-3_1.a  ext/dist/lib/libIex-3_1.a  ext/dist/lib/libz.a && cd /Users/runner/work/OpenColorIO/OpenColorIO/build/temp.macosx-10.9-x86_64-cpython-310/src/apps/ocioconvert && /usr/bin/strip -x /Users/runner/work/OpenColorIO/OpenColorIO/build/lib.macosx-10.9-x86_64-cpython-310/PyOpenColorIO/bin/ocioconvert
    ld: Undefined symbols:
      Imf_3_1::Header::Header(int, int, float, Imath_3_1::Vec2<float> const&, float, Imf_3_1::LineOrder, Imf_3_1::Compression), referenced from:
          OpenColorIO_v2_5dev::ImageIO::ImageIO() in libOpenColorIOimageioapphelpers.a[2](imageio.cpp.o)
          OpenColorIO_v2_5dev::ImageIO::ImageIO() in libOpenColorIOimageioapphelpers.a[2](imageio.cpp.o)
          OpenColorIO_v2_5dev::ImageIO::ImageIO(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) in libOpenColorIOimageioapphelpers.a[2](imageio.cpp.o)
          OpenColorIO_v2_5dev::ImageIO::ImageIO(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) in libOpenColorIOimageioapphelpers.a[2](imageio.cpp.o)
          OpenColorIO_v2_5dev::ImageIO::ImageIO(long, long, OpenColorIO_v2_5dev::ChannelOrdering, OpenColorIO_v2_5dev::BitDepth) in libOpenColorIOimageioapphelpers.a[2](imageio.cpp.o)
          OpenColorIO_v2_5dev::ImageIO::Impl::init(long, long, OpenColorIO_v2_5dev::ChannelOrdering, OpenColorIO_v2_5dev::BitDepth) in libOpenColorIOimageioapphelpers.a[2](imageio.cpp.o)
          OpenColorIO_v2_5dev::ImageIO::ImageIO(long, long, OpenColorIO_v2_5dev::ChannelOrdering, OpenColorIO_v2

But... I think I've solved the problem? There's a homebrew-installed OpenEXR 3.4.0 preinstalled to the macos-13 runner, apparently; and it looks like it's being installed to $HOMEBREW_PREFIX/Cellar. I think this must be new behavior, either in the latest OpenEXR recipe, or with HomeBrew itself.

Anyway, manually brew-uninstalling openexr and imath before building the wheels seems to fix the problem for us, and right now I'm checking to see if I've made the IGNORE_HOMEBREWED_DEPS heuristics smart enough to ignore brew-installed OpenEXR without requiring the brew-uninstall step...

@zachlewis zachlewis force-pushed the fix_python_wheels_pypi_release branch from 17709e1 to 52f7373 Compare September 16, 2025 15:56
@zachlewis
Collaborator Author

Looks like we'll have to manually brew-uninstall openexr to get stuff working for the macos-intel wheels after all...

...but, finally, @lgritz, all our wheels seem to be building!

Functionally, this PR is good to merge; but I can take a stab at cleaning up the git history if you'd like!

@lgritz
Collaborator

lgritz commented Sep 16, 2025

> Functionally, this PR is good to merge; but I can take a stab at cleaning up the git history if you'd like!

No need! The only kind of merge we do in this project is a full squash, so it will all be smashed into a single commit automatically.

@lgritz
Collaborator

lgritz commented Sep 16, 2025

I wonder if another way to have solved the problem would be to change our build_OpenEXR.cmake to default to using OpenEXR 3.4 and build_Imath.cmake to use 3.2.1. (Or just use the environment variable OpenEXR_BUILD_VERSION=3.4.0 and Imath_BUILD_VERSION=3.2.1 to control them without changing the cmake defaults!)

I think the issue is that we are bitten once again by subtle binary incompatibilities between OpenEXR built against Imath 3.1 versus OpenEXR built against Imath 3.2.
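
The environment-variable route could look roughly like the sketch below. Only the two variable names and the version numbers come from the comment above; `with_dependency_versions` is a hypothetical helper, and how the build is actually launched is left open.

```python
import os


def with_dependency_versions(base_env=None, openexr="3.4.0", imath="3.2.1"):
    """Return a copy of the environment with the two build-version overrides set."""
    env = dict(os.environ if base_env is None else base_env)
    env["OpenEXR_BUILD_VERSION"] = openexr
    env["Imath_BUILD_VERSION"] = imath
    return env


# Start from a tiny explicit environment so the result is easy to inspect.
env = with_dependency_versions({"PATH": "/usr/bin"})
print(env["OpenEXR_BUILD_VERSION"], env["Imath_BUILD_VERSION"])
```

The returned mapping could then be passed as `env=` to `subprocess.run` for whatever build command is in use, leaving the cmake defaults untouched.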

@zachlewis
Collaborator Author

zachlewis commented Sep 16, 2025

Hah... well, yeah, in hindsight, of course that would have made the most sense...!

It took me a long time to realize that $HOMEBREW_PREFIX/Cellar/OpenEXR/Include and $HOMEBREW_PREFIX/Cellar/IMath/Include have apparently been sneaking into the include path somehow. Our -IGNORE_HOMEBREWED_DEPS option isn't as robust as I thought it needed to be, although it does seem to do something -- if the value is unset, the build system finds OpenEXR-3.4.0; otherwise, it builds and links OpenEXR and IMath as expected at one point, but later seems to try to find homebrew's libraries anyway...

Hmm. Maybe I need to tweak the CMAKE_PREFIX_PATH to make sure we're not picking up any CMakeConfigs...

(not planning on making any other changes to this PR, feedback pending)
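
For anyone chasing the same symptom, a quick check is to flag any include directory that sits inside Homebrew's Cellar. This is only a sketch under assumptions: `under_cellar` is a hypothetical helper, `/usr/local` is used as the Homebrew prefix (the usual location on Intel Macs), and the example paths are made up rather than taken from the logs.

```python
from pathlib import PurePosixPath


def under_cellar(include_dirs, homebrew_prefix):
    """Return the include directories located inside <prefix>/Cellar."""
    cellar = PurePosixPath(homebrew_prefix) / "Cellar"
    flagged = []
    for directory in include_dirs:
        path = PurePosixPath(directory)
        # A path is inside the Cellar if the Cellar is the path itself or an ancestor.
        if path == cellar or cellar in path.parents:
            flagged.append(directory)
    return flagged


# Hypothetical include directories for illustration.
dirs = [
    "/tmp/build/deps/dist/include/OpenEXR",
    "/usr/local/Cellar/openexr/3.4.0/include",
    "/usr/local/Cellar/imath/3.2.1/include",
    "/usr/local/CellarOther/include",
]
print(under_cellar(dirs, "/usr/local"))
```

Comparing path components rather than string prefixes avoids false positives such as the last entry, which merely starts with the same characters.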

@lgritz
Collaborator

lgritz commented Sep 16, 2025

This failed a Windows job -- which may just be a spurious glitch, I'm rerunning it now, but I won't really hold this up, since it is clearly unrelated to your patch.

Is this ready to merge as-is from your perspective, @zachlewis?

@zachlewis
Collaborator Author

Yes sir!

@lgritz
Collaborator

lgritz commented Sep 16, 2025

I feel like the title of this PR doesn't really reflect what it's about, after all this. Would you like to take a stab at editing it (there's a button at the top for that) so that it will be more accurate in the git history and release notes?

@lgritz
Collaborator

lgritz commented Sep 16, 2025

Also, one more thing... and it may be that we won't have a clear answer until I do a tagged release.

It seems to me that last time I tried to do a 3.0 release, it acted like it tried to upload to PyPI upon pushing the tag, then didn't try upon the release when it was supposed to? Does this ring a bell? I could be wrong, but I remembered another issue lurking. Perhaps after I backport this to 3.0 I should do a release (even though it's not the start of the month) just to force a PyPI update since we haven't had one for a while, and then we'll find out.

@lgritz lgritz merged commit 333b63b into AcademySoftwareFoundation:main Sep 17, 2025
92 of 93 checks passed
lgritz pushed a commit to lgritz/OpenImageIO that referenced this pull request Sep 17, 2025