
Conversation

@chengjunlu (Contributor)
This is a workaround for the feature requirement in #5153. The IGC build flag is updated when large GRF mode is used while loading the SPIRV kernel and the register spill size exceeds 1000.

Copilot AI left a comment

Pull Request Overview

This PR implements a workaround to update build flag metadata when loading SPIRV kernels, specifically addressing feature requirement #5153. The change enables IGC build flag updates when large GRF mode is used during SPIRV kernel loading and the register spill size exceeds 1000.

Key changes:

  • Modified C driver to return build flags as an additional return value from load_binary function
  • Updated Python compiler to capture and apply updated build flags to kernel metadata when they differ from original flags
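The Python-side change described above could be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the helper name `apply_updated_build_flags`, the exact tuple shape returned by `load_binary`, and the example flag string are all assumptions based on this PR's description.

```python
def apply_updated_build_flags(metadata: dict, load_binary_result: tuple) -> dict:
    """Sketch: capture the build flags returned as the last element of the
    load_binary result tuple and apply them to the kernel metadata when they
    differ from the original flags. Hypothetical helper, not Triton's API."""
    # The PR extends the return tuple to
    # (module, kernel, n_regs, n_spills, n_max_threads, build_flags).
    *_, build_flags = load_binary_result
    if build_flags and build_flags != metadata.get("build_flags"):
        metadata = dict(metadata)  # avoid mutating the caller's dict
        metadata["build_flags"] = build_flags
    return metadata
```

For example, when IGC recompiles with large GRF mode due to spills, the updated flags (shown here with an illustrative flag string) would replace the empty `build_flags` recorded at compile time.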

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Files changed:

  • third_party/intel/backend/driver.c — modified the load_binary return value to include the build flags string
  • python/triton/compiler/compiler.py — added logic to capture build flags and update metadata when they change


Comment on lines 313 to 314:

    return Py_BuildValue("(OOiiis)", kernel_bundle_py, kernel_py, n_regs,
                         n_spills, n_max_threads, build_flags().data());
Copilot AI, Oct 29, 2025

The addition of build_flags().data() to the return tuple creates a breaking change in the API. Consider versioning this function or providing a backward-compatible wrapper to avoid breaking existing callers that expect a 5-tuple instead of a 6-tuple.
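The backward-compatible wrapper suggested above could be sketched like this on the Python side. This is a hypothetical helper (`unpack_load_binary` is not an existing Triton function); it assumes the old 5-tuple and new 6-tuple shapes described in this PR.

```python
def unpack_load_binary(result: tuple):
    """Sketch: accept both the legacy 5-tuple and the new 6-tuple from
    load_binary, so existing callers keep working during the transition."""
    if len(result) == 6:
        module, kernel, n_regs, n_spills, n_max_threads, build_flags = result
    else:
        module, kernel, n_regs, n_spills, n_max_threads = result
        build_flags = None  # old driver: no build flags reported
    return module, kernel, n_regs, n_spills, n_max_threads, build_flags
```

Callers can then check `build_flags is None` to distinguish an old driver from a new one that reported unchanged flags.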

@chengjunlu (Contributor, Author)

This is a known breaking change; it is a short-term solution.

@chengjunlu (Contributor, Author)

chengjunlu commented Oct 29, 2025

@etaf Please help to confirm the change works as expected.

@etaf

etaf commented Oct 29, 2025

> @etaf Please help to confirm the change works as expected.

Thanks.

@chengjunlu chengjunlu force-pushed the chengjun/feature_5153 branch from 2095c0a to b3419c1 Compare October 29, 2025 04:40
@anmyachev (Contributor)

@chengjunlu we have the following file in cache: ~/.triton/cache/QTPP7LDAKIPHLZY2LCDGJTPLKOKKXNI5HP4H7MHFPULVU5UT3MBA/kernel.json

That looks like:
{"hash": "84deffac60521e75e71a588664cdeb5394abb51d3bf87fb0e57d175a7693db02", "target": {"backend": "xpu", "arch": {"architecture": 13136561920, "device_id": 3034, "driver_version": "1.6.33578+15", "gpu_eu_count": 448, "gpu_subslice_count": 56, "has_atomic64": true, "has_bfloat16_conversions": true, "has_fp16": true, "has_fp64": true, "has_subgroup_2d_block_io": true, "has_subgroup_matrix_multiply_accumulate": true, "has_subgroup_matrix_multiply_accumulate_tensor_float32": false, "max_compute_units": 448, "max_num_sub_groups": 64, "max_work_group_size": 1024, "name": "Intel(R) Data Center GPU Max 1100", "platform_name": "Intel(R) oneAPI Unified Runtime over Level-Zero", "sub_group_sizes": [16, 32], "total_memory": 51522830336, "type": "gpu", "vendor": "Intel(R) Corporation", "version": "12.60.7"}, "warp_size": 32}, "num_warps": 4, "num_ctas": 1, "num_stages": 2, "cluster_dims": [1, 1, 1], "warp_size": 32, "optimize_epilogue": false, "enable_fp_fusion": true, "launch_cooperative_grid": false, "reduce_variable_liveness": true, "supported_fp8_dtypes": ["fp8e5", "fp8e4nv", "fp8e4b15"], "deprecated_fp8_dot_operand_dtypes": [], "default_dot_input_precision": "tf32", "allowed_dot_input_precisions": ["tf32", "tf32x3", "ieee"], "allow_fp8e4nv": true, "allow_fp8e4b15": true, "grf_mode": ["small", "large", "auto", "default"], "split_barriers_scope": "None", "max_num_imprecise_acc_default": 0, "extern_libs": [["libdevice", "/home/jovyan/intel-xpu-backend-for-triton/python/triton/backends/intel/lib/libsycl-spir64-unknown-unknown.bc"]], "debug": false, "backend_name": "intel", "sanitize_overflow": false, "generate_native_code": false, "arch": null, "instrumentation_mode": "", "cache_dir": "/home/jovyan/.triton/cache/QTPP7LDAKIPHLZY2LCDGJTPLKOKKXNI5HP4H7MHFPULVU5UT3MBA", "triton_version": "3.5.0", "threads_per_warp": 32, "shared": 0, "global_scratch_size": 0, "global_scratch_align": 1, "profile_scratch_size": 0, "profile_scratch_align": 1, "name": "kernel", "build_flags": ""}

We need to make sure build_flags is updated here as well.

@chengjunlu (Contributor, Author)

> @chengjunlu we have the following file in cache: ~/.triton/cache/QTPP7LDAKIPHLZY2LCDGJTPLKOKKXNI5HP4H7MHFPULVU5UT3MBA/kernel.json
>
> We need to make sure build_flags is updated here as well.

This is a specific workaround that only works for #5153. The build flag in the JSON reflects only the make-SPIRV stage, not the load-binary phase.
Do you have any concern that this may cause an issue?

@anmyachev (Contributor)

> Do you have any concern that this may cause issue?

On second thought, it seems there shouldn't be a problem here. What about #5402 (comment)?

@chengjunlu chengjunlu force-pushed the chengjun/feature_5153 branch from b3419c1 to 5de64d5 Compare October 30, 2025 08:29
@chengjunlu chengjunlu requested a review from anmyachev October 30, 2025 08:30
@anmyachev (Contributor) left a comment


LGTM!

@anmyachev anmyachev merged commit d165754 into main Oct 30, 2025
48 of 53 checks passed
@anmyachev anmyachev deleted the chengjun/feature_5153 branch October 30, 2025 14:56
chengjunlu and others added 2 commits October 30, 2025 16:19
…build flag is updated when the large GRF mode is used in loading SPIRV kernel when register spill size > 1000.

Signed-off-by: etaf <[email protected]>
Co-authored-by: Lu,Chengjun <[email protected]>


Development

Successfully merging this pull request may close these issues:

[Pytorch upstream] Feature request: Save SPIR-V Build flag to CompiledKernel metadata for Inductor.

4 participants