Reland 9743ec0dca5bbd9dbce20adc3ee273af6b095f94, which was temporarily reverted by 492ea92c05ac2fdde8abf7ded241442f029217ea in https://github.com/intel/intel-xpu-backend-for-triton/pull/2978.