Generate native code using L0 sdk instead of ocloc
#5342
Signed-off-by: Anatoly Myachev <[email protected]>
| } | ||
| } | ||
| extern "C" EXPORT_FUNC PyObject *get_native_code(PyObject *args) { |
Most of this function is taken from the implementation of the load_binary function, which I've decided to leave alone for now to avoid introducing regressions. However, there's potential to reduce the code duplication between the two.
| zebin = f.read() | ||
| from triton.runtime.driver import driver | ||
| # at this stage the driver is already initialized | ||
| device = driver.active.utils.get_current_device() |
triton.compile may not be targeting the current device. The compilation flow should not depend on the runtime, since it can also be used for cross-compilation and AOT compilation.
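One way to picture the concern above is a target-resolution step that only consults the runtime when no target was requested explicitly. This is a minimal sketch, not code from this PR: `TargetSpec`, `resolve_target`, and `fake_runtime_query` are hypothetical names, and the architecture strings are placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class TargetSpec:
    """Hypothetical description of the device to compile for."""
    arch: str       # placeholder architecture name, not a real identifier
    device_id: int  # only meaningful when a runtime is present


def resolve_target(explicit: Optional[TargetSpec],
                   query_runtime: Callable[[], TargetSpec]) -> TargetSpec:
    """Prefer an explicitly requested target; query the runtime only as a fallback."""
    if explicit is not None:
        return explicit
    return query_runtime()


def fake_runtime_query() -> TargetSpec:
    # Stand-in for a driver lookup such as get_current_device(); not the real API.
    return TargetSpec(arch="runtime-arch", device_id=0)


# AOT / cross-compilation: the caller names the target, the runtime is never touched.
aot = resolve_target(TargetSpec(arch="requested-arch", device_id=3), fake_runtime_query)
# JIT: no explicit target, so the current device is used.
jit = resolve_target(None, fake_runtime_query)
```

With this shape, the AOT path stays usable on a machine that has no matching device, because the runtime query is simply never invoked.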
| self.get_native_code = mod.get_native_code | ||
| self.get_device_properties = mod.get_device_properties | ||
| self.device_count = mod.init_devices(self.get_sycl_queue()) | ||
| # breakpoint() |
| # breakpoint() |
| try { | ||
| auto [l0_module_dgrf, l0_kernel_dgrf, n_spills_dgrf] = | ||
| compileLevelZeroObjects(binary_ptr, binary_size, kernel_name, | ||
| l0_device, l0_context, build_flags(), true); |
| l0_device, l0_context, build_flags(), true); | |
| l0_device, l0_context, build_flags(), true /*is_spv*/); |
| if (!PyArg_ParseTuple(args, "sSisi", &name, &py_bytes, &shared, | ||
| &build_flags_ptr, &devId)) { | ||
| std::cerr << "loadBinary arg parse failed" << std::endl; |
| std::cerr << "loadBinary arg parse failed" << std::endl; | |
| std::cerr << "get_native_code arg parse failed" << std::endl; |
b580: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/18599949719 (to check)
a770: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/18599941172 (to check)