Reland f9688abe3d3d9caea7846ce41d5fa1da765f5e16, which was temporarily reverted in https://github.com/intel/intel-xpu-backend-for-triton/pull/2523 by 25a7cbad82beb0d81283263d3d3885174bd290b3.