Conversation
Thanks @thomthehound for all the hard work you have put into this. I plan on testing this myself locally in addition to reviewing this PR. I would also be very interested in a chat if you are, feel free to reach out via email or IRON Discord.
These changes would need to be merged into https://github.com/Xilinx/cmakeModules before bumping the submodule.
Oops. I knew I was forgetting something. Thank you, PR submitted.
Merged. You can now bump the submodule. |
jgmelber left a comment
I need to look through a few things a bit more; a lot of great work here! Here are a few initial comments.
Is this necessary to check in? Or perhaps a user specific preference?
It's only there as a quick test that everything works. I intend to remove it later. The problem is that the makefiles for the examples have some bash-isms in them, and I didn't want to direct users to use Git Bash (honestly, I'm not even entirely sure that would work anyway). So I would have had to give bespoke manual commands for every example on Windows. Having an AI generate that just seemed faster.
I thought about this some more, and I believe I actually did test the makefiles with Git Bash and MSYS2; they failed because of the built-in WSL support. So I still would have needed to rewrite everything, even if I did direct users to a POSIX shell to run the examples.
Until I get more clever, I really do think this is the best way to demonstrate the example programs can build in Windows.
For the record: its existence offends me, too.
I'm OK with checking this in for now as a step to test the programming_examples on Windows natively. I think there are ways to improve the Makefiles to better support Windows. @jackl-xilinx started the initial work on supporting both Linux and WSL in them.
Maybe we should just use cmake...
My thought exactly. I already have something in mind, but it is going to take a while.
Alright, those were great suggestions. Thank you! It will take me a few hours to revise and re-test.
Notes:
For NPU3, it might be best to detect it but not treat it as something it is not; I am thinking this might cause a bug that's hard to track down in the future.
Nuked.
jgmelber left a comment
I believe this matches what we do/will see from xrt-smi
utils/iron_setup.py
```python
# If we have anything that reports like an NPU, even if unknown type, accept it as at least XDNA.
NPU_REGEX = re.compile(r"(RyzenAI-npu(?![0-3]\b)\d+|\bNPU\b)", re.IGNORECASE)
NPU2_REGEX = re.compile(r"NPU Strix|NPU Strix Halo|NPU Krackan|NPU Gorgon|RyzenAI-npu[4567]", re.IGNORECASE)
NPU3_REGEX = re.compile(r"NPU Medusa|RyzenAI-npu[8]", re.IGNORECASE)
```
Suggested change:

```diff
-NPU3_REGEX = re.compile(r"NPU Medusa|RyzenAI-npu[8]", re.IGNORECASE)
+NPU3_REGEX = re.compile(r"NPU Medusa|RyzenAI-npu[3]", re.IGNORECASE)
```
utils/iron_setup.py
Suggested change:

```diff
-NPU2_REGEX = re.compile(r"NPU Strix|NPU Strix Halo|NPU Krackan|NPU Gorgon|RyzenAI-npu[4567]", re.IGNORECASE)
+NPU2_REGEX = re.compile(r"NPU Strix|NPU Strix Halo|NPU Krackan|RyzenAI-npu[456]", re.IGNORECASE)
```
utils/iron_setup.py
Suggested change:

```diff
-NPU_REGEX = re.compile(r"(RyzenAI-npu(?![0-3]\b)\d+|\bNPU\b)", re.IGNORECASE)
+NPU_REGEX = re.compile(r"NPU Phoenix|RyzenAI-npu[1]", re.IGNORECASE)
```
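Taken together, the suggested patterns would map reported device strings to generations roughly like this. This is only a quick sanity check; `classify` is a hypothetical helper for illustration, not code from the PR, and it assumes these are the strings xrt-smi reports:

```python
import re

# Suggested patterns from this review (assumption: these are the device
# strings xrt-smi reports for each generation).
NPU_REGEX = re.compile(r"NPU Phoenix|RyzenAI-npu[1]", re.IGNORECASE)
NPU2_REGEX = re.compile(r"NPU Strix|NPU Strix Halo|NPU Krackan|RyzenAI-npu[456]", re.IGNORECASE)
NPU3_REGEX = re.compile(r"NPU Medusa|RyzenAI-npu[3]", re.IGNORECASE)

def classify(device_name: str) -> str:
    """Map a reported device string to an NPU generation label."""
    # Check the most specific generations first; Phoenix-era NPU is the fallback.
    if NPU3_REGEX.search(device_name):
        return "npu3"
    if NPU2_REGEX.search(device_name):
        return "npu2"
    if NPU_REGEX.search(device_name):
        return "npu"
    return "unknown"

assert classify("RyzenAI-npu4") == "npu2"
assert classify("NPU Phoenix") == "npu"
assert classify("NPU Medusa") == "npu3"
```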
I think this is missing in the PR.
Ah. The docs were in the `.gitignore` list. I was pushing so much that I missed it.
I believe there may be a bug with the GitHub Actions that interferes with retrieving branches from forks (or, rather, it assumes all calls are coming from inside the house). This appears to be the proximal cause of the remaining failures. I suspect that, if there had happened to be a pre-existing branch named …

I think I see where the problem is, so I can prepare a patch for it if you like. However, I do not think repairing it is in the scope of this PR. On the other hand: this does provide the perfect opportunity to test such a patch.
End-to-End Native Windows Support for IRON/MLIR-AIE
This PR adds a native Windows (non-WSL) path for building and running IRON projects:
- Windows-specific behavior is gated behind checks like `if(WIN32)`, `platform.system() == "Windows"`, `os.name == "nt"`, etc.

Provenance
Last summer, I began a private project -- which I called "IRONWINDOW" -- to port MLIR-AIE over to Windows. As I was doing it for myself, and for my own purposes ("for fun"), I made no effort to preserve the existing workflows and project layout. I had also assumed there would be little professional interest in a full Windows port. Ultimately, the project stalled as I ran into numerous problems with llvm-aie and MLIR. Several months later, I came across PR #2677 ("[TEST] Build wheels for Windows"), which proved my assumption incorrect (as I should have known, given how well-appointed for Windows the codebase was already). Taking up the project again, I was able to track down the specific code in LLVM-AIE and MLIR that was breaking Windows support and patch it. After that, MLIR-AIE "just worked".
At the time, I had hoped it would be a simple task to then translate my "IRONWINDOW" project back into something that fit within the MLIR-AIE codebase from which it had diverged. That task proved greater than I had anticipated. At times I felt like Ken Mattingly in the simulator, going through iteration after iteration of checklists to get the Apollo 13 command module back to Earth. However, after approximately a month and a half of working on this, I believe I finally have something worth presenting.
Steak and Potatoes
Nobody likes an outsider coming in and messing with their codebase, so I feel obligated to justify every single change I've made. I'll try to keep this brief, as the code should speak for itself.
1) Cross-platform entrypoints for setup + example execution
New: `utils/iron_setup.py`
- Replaces sourcing `quick_setup.sh`, `env_setup.sh` and `opt/xilinx/xrt/setup.sh` back-to-back

New: `utils/run_example.py`
- Runs the examples under `programming_examples/*`, simulating each example's Makefile

These are additive utilities designed to be drop-in usable without breaking existing scripts. If we do start shipping Windows wheels, it is my hope that we unify all environments on something like
`iron_setup.py`, as using a single cross-platform script will make drift less likely going forward. Right now the env emission is somewhat awkward because, obviously, Python is not designed to mess with a user's shell environment. However, I have written the file in such a way that it would be trivial to modify it to output platform-specific environment setup scripts to disk.

I make no attempt to hide the fact that I used a programming AI to "vibe code" `utils/run_example.py`. That file exists only to prove the Windows wheels actually work. Programming it by hand would have been more work than it was worth, given that I expect it to be temporary. I did look it over to manually organize and clean up a few things, but it is not exactly craft-brewed.

2) Windows wheel-building tools
The two additional Python scripts here are intended to allow Windows users to build their own wheels in case the GitHub Workflows don't pan out. I realize that most of what they do could have been accomplished with Git Bash, but the Windows setup is already frustrating enough as-is. Also, what is the point of going Windows-native if we're going to use POSIX as a crutch? Regardless, I needed a place to patch the Windows MLIR wheel as it is downloaded, and I saw no reason to uselessly stage it if it is only needed to build the wheel.
New: `utils/mlir_aie_wheels/scripts/download_mlir.py`
- Reads the `utils/clone-llvm.sh` version source to stay aligned with upstream versioning
- Patches the downloaded Windows MLIR wheel's `diaguids.lib` reference

New: `utils/mlir_aie_wheels/scripts/build_local.py`
- A port of `build_local.sh` that just removes POSIX assumptions

Updated: `utils/mlir_aie_wheels/pyproject.toml` and `python_bindings/pyproject.toml`
- `before-build` now uses `download_mlir.py` (no bash required)

Updated: `utils/mlir_aie_wheels/setup.py`

Updated: `utils/mlir_aie_wheels/python_bindings/{pyproject.toml, setup.py, CMakeLists.txt}`
- Adjustments around the Python bindings (`nanobind`)
- `before-build` now uses `download_mlir.py`

3) CMake fixes for Windows
Updated: top-level `CMakeLists.txt` + `tools/CMakeLists.txt`
- Adds `AIE_BUILD_CHESS_CLANG` (default ON elsewhere, default OFF on Windows)

Updated: `cmake/modulesXilinx/FindXRT.cmake`

Updated: `tools/bootgen/CMakeLists.txt`

4) Small runtime/test infrastructure updates
Updated: `python/utils/compile/cache/utils.py`

Updated: `python/aie_lit_utils/lit_config_helpers.py`
- No `-luuid` on Windows

Updated: `python/CMakeLists.txt`
- Skips `uuid` linking on Windows (not applicable)

Updated: `python/requirements_dev.txt` + `python/requirements_ml.txt`
- `requirements_dev.txt`: adds missing build tooling (`ninja`, `cibuildwheel` needed for Windows wheel builds) and bumps the CMake floor to match what we actually use
- `requirements_ml.txt`: switches to `--extra-index-url` for torch so it doesn't clobber PyPI for anyone mixing requirement sets

5) Documentation for Windows-native bring-up
New: `docs/buildHostWinNative.md`

This one was a real doozy. I did my best to trim it down to the essentials, but the XRT setup is going to be very annoying for the average user. I had to submit a separate PR to the XRT project just to get it to work at all. And, from my conversation with one of the code owners, it turns out that they build it on their end by pre-populating certain directories with tools from a private GitHub package. The average user has no such luxury. I offered to write a script to improve the situation, but I was told that any Python scripts would be rejected outright. I do not think a `.bat` file is well suited for the kind of dependency management that XRT requires (anyone who has tried to nest if/else statements in cmd, not to mention debug using only the error "'.' was unexpected at this time.", knows what I am talking about here), and, even if I did put the time into it, I have no way of testing against their currently private build system, in which case I would be wasting my time. So... we have this. I understand where they are coming from over there, so I'm not upset about it. I'm just explaining why this is such a PITA. Perhaps in future I can come up with something better.

As an aside: I noticed that the original `docs/buildHostWin.md` (for WSL) made a point of only applying the trademark symbol to AMD™-owned properties... and not those of other companies. So, I had some fun with that to keep myself entertained as I worked. My compliments to the chef on that.

CI / workflows
These are the only things I have not fully tested on my end, and therefore are the ones most likely to break:
- `mlirAIEDistro.yml`: adds Windows wheel build coverage alongside existing platforms
- `buildRyzenWheels.yml`: adds a Windows job that builds and stages artifacts consistently with existing jobs

The Windows jobs are designed to minimize special-casing. I largely copied them from the existing Linux jobs, so they should follow the same flow, aside from a few OS-specific fixes: most notably the need to use vcpkg to compile OpenSSL (so bootgen builds).
This is also one area where I did deliberately touch existing code, but for good reason. Assuming that we will now be publishing twice as many wheels (Linux + Windows), and we are building them in parallel, the possibility of a race condition arises when we push them out. Therefore, I moved publishing into a `publish` job that runs after both build jobs complete, and then updates releases sequentially.

Results should be transparent relative to the existing behavior: version-tag runs still publish to the version tag (e.g. `vX.Y.Z`), and scheduled/manual runs still update the same baseline tags (`latest-wheels-3` and `latest-wheels-no-rtti`). The only difference is that Windows wheels are now included alongside the Linux ones, and the release updates shouldn't Kentucky Derby. On the other hand, and speaking frankly, I am not exactly the world's foremost expert in GitHub Actions and runners, so I would be surprised if this works the first time.

Windows-specific patches
The patches for the MLIR and llvm-aie wheels are clearly delineated with banner-headers in
`utils/mlir_aie_wheels/scripts/download_mlir.py` and `utils/iron_setup.py`, respectively. But here is a brief outline:

MLIR wheel (`download_mlir.py`):
- The Windows wheel hard-codes a reference to `diaguids.lib` (from the DIA SDK).
- The patch replaces the hard-coded `.../diaguids.lib` reference with either:
  - a path derived from `VSINSTALLDIR`/`vswhere`, or
  - a bare `diaguids.lib` (which should show up in a normal library search path on any properly set up Windows machine)
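The resolution order described above could be sketched roughly as follows. This is a hypothetical helper for illustration (`find_diaguids` is not the actual function name in `download_mlir.py`):

```python
import os
import subprocess

def find_diaguids() -> str:
    """Resolve a usable reference to diaguids.lib, most explicit first."""
    # 1) Prefer VSINSTALLDIR (set inside a Visual Studio developer prompt).
    vs_dir = os.environ.get("VSINSTALLDIR")
    if vs_dir:
        candidate = os.path.join(vs_dir, "DIA SDK", "lib", "amd64", "diaguids.lib")
        if os.path.isfile(candidate):
            return candidate
    # 2) Fall back to vswhere, which ships with recent Visual Studio installers.
    vswhere = os.path.expandvars(
        r"%ProgramFiles(x86)%\Microsoft Visual Studio\Installer\vswhere.exe"
    )
    if os.path.isfile(vswhere):
        install_path = subprocess.run(
            [vswhere, "-latest", "-property", "installationPath"],
            capture_output=True, text=True, check=False,
        ).stdout.strip()
        if install_path:
            candidate = os.path.join(install_path, "DIA SDK", "lib", "amd64", "diaguids.lib")
            if os.path.isfile(candidate):
                return candidate
    # 3) Last resort: a bare name, resolved through the normal library search path.
    return "diaguids.lib"
```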
llvm-aie wheel (`iron_setup.py`):
- The wheel ships MSVC-style `c.lib`/`m.lib`, but a lot of the build logic assumes Unix-style `libc.a`/`libm.a`.
- The patch creates `libc.a` and `libm.a` aliases (copies) so that existing Makefiles/CMake logic doesn't have to special-case Windows.
- It also strips the `.deplibs` section from `crt1.o` via `llvm-objcopy`, which prevents bogus MSVC runtime deps from leaking into the aie2p link line. This is the primary bug that stopped me in my tracks months ago.
- The patch uses the `llvm-objcopy` packaged in that wheel.
- Ideally, llvm-aie would stop emitting the `.deplibs` section into `crt1.o` in the first place, as that is the real fix.

Errata:
- `pyxrt` existed? It requires `xrt_coreutil.dll`. That is a signed binary, and I won't involve myself in packaging those. It is on the system PATH for all Windows users, anyway.
- Some functionality is host-only as a result of Microsoft's policies, which are enforced through the exports of the signed driver/binaries. Perhaps over-zealously. I anticipate that at least some documentation updates will be required to align expectations with Windows realities.
- I went looking for `flags.carvout`, but, after failing with some test code, I dumped the exports of `pyxrt.pyd`, and we simply don't have it. I'd like to say I can address that with another XRT patch, but... at some point I need to realize that I'm not getting paid to do this, nor has anybody asked me.
- … among other "MultiThreaded" (/MT) MSVC settings.
I didn't touch that, but I can imagine this being a problem on Windows because, again, we are using signed binaries without corresponding static import libs, and the XRT-built import libs (produced in both static and dynamic flavors) are almost certainly going to ABI mismatch or access fault against the drivers somewhere. I have not seen any problems with this yet, but that doesn't mean it won't happen.
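For concreteness, the llvm-aie wheel fix-ups outlined above could be sketched as follows. The helper name and `lib/` layout are illustrative assumptions; the actual patch lives behind banner headers in `utils/iron_setup.py`:

```python
import shutil
import subprocess
from pathlib import Path

def patch_llvm_aie(wheel_root: Path) -> None:
    """Sketch of the two llvm-aie wheel fix-ups (hypothetical layout)."""
    libdir = wheel_root / "lib"
    # 1) Give the MSVC-named archives Unix-style aliases so existing
    #    Makefile/CMake logic doesn't need Windows special cases.
    for msvc_name, unix_name in (("c.lib", "libc.a"), ("m.lib", "libm.a")):
        src = libdir / msvc_name
        if src.is_file():
            shutil.copyfile(src, libdir / unix_name)
    # 2) Strip the .deplibs section from crt1.o so bogus MSVC runtime
    #    dependencies don't leak into the aie2p link line.
    crt1 = libdir / "crt1.o"
    if crt1.is_file():
        subprocess.run(
            ["llvm-objcopy", "--remove-section=.deplibs", str(crt1)],
            check=True,
        )
```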
The `utils/iron_setup.py` script automatically sets env-vars on Windows that are equivalent to sourcing `opt/xilinx/xrt/setup.sh`, with the exception of setting the XRT directory to `XRT_ROOT` and not `XILINX_XRT` (which it force-unsets in case it accidentally appears). Setting `XILINX_XRT` decouples from the system binaries on Windows, meaning the signed drivers are inaccessible and the NPU becomes unreachable.

How to test
Windows (native)
1. Follow the setup in `docs/buildHostWinNative.md`
2. `python utils/iron_setup.py --all`
3. `python utils/iron_setup.py env --shell pwsh | iex` (or `--shell cmd`)
4. `$env:CIBW_BUILD = "cp312-*"`
5. `$env:OPENSSL_ROOT_DIR = "$env:VCPKG_ROOT/installed/x64-windows"` (not needed if OpenSSL is installed standalone AND the libraries are in system directories)
6. `python utils/mlir_aie_wheels/scripts/build_local.py` (this thing trashes the disk almost as hard as a stress-test would; I'd like to streamline it more in the future)
7. `python utils/iron_setup.py install --mlir-aie wheelhouse`
8. `python utils/iron_setup.py env --shell pwsh | iex` (again)
9. `python utils/run_example.py run --example-dir programming_examples/basic/vector_scalar_mul`

Linux/WSL
Existing flows should work unchanged. You can also use the `iron_setup.py` script, as it is deliberately cross-platform. It would be nice if we could standardize on a single script to avoid drift in the future.
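To illustrate what a single cross-platform setup script buys us, the per-shell env emission could look something like this. This is a minimal sketch with a hypothetical `emit_env` helper; the real `iron_setup.py` env command is more involved:

```python
def emit_env(xrt_root: str, shell: str = "pwsh") -> str:
    """Emit shell statements roughly equivalent to sourcing opt/xilinx/xrt/setup.sh."""
    lines = []
    if shell == "pwsh":
        lines.append(f'$env:XRT_ROOT = "{xrt_root}"')
        # Force-unset XILINX_XRT: on Windows it would decouple us from the
        # signed system binaries and make the NPU unreachable.
        lines.append("Remove-Item Env:XILINX_XRT -ErrorAction SilentlyContinue")
    elif shell == "cmd":
        lines.append(f"set XRT_ROOT={xrt_root}")
        lines.append("set XILINX_XRT=")
    else:  # POSIX shells
        lines.append(f'export XRT_ROOT="{xrt_root}"')
        lines.append("unset XILINX_XRT")
    return "\n".join(lines)

# Pipe the emitted statements into the current shell, e.g.:
#   python utils/iron_setup.py env --shell pwsh | iex
print(emit_env(r"C:\Xilinx\XRT", "pwsh"))
```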