The gpu.shuffle->llvm lowering currently supports only a handful of types: i8, i16, i32, i64, f16, f32, f64. Some types we may want to handle are missing, so we have to cast the original values to a supported type before we can use the operation. Extending the lowering upstream should reduce the need for these casts.
Add support for:
bf16 (by bitcasting to i16 for now)
i1 (try using a SPIR-V builtin, or extend to i8 before shuffling)
Note: shuffling of other value types, such as pointers, should still be handled in our codebase.
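
The two strategies listed above (bitcasting bf16 to i16, and widening i1 to i8) can be illustrated with a small Python model. This is only a sketch of the idea and not the actual lowering code: every function name here is hypothetical, the xor mode stands in for whichever shuffle mode is used, and a subgroup is modelled as a plain list of lane values.

```python
import struct


def f32_to_bf16_bits(x: float) -> int:
    """Helper for building test data: keep the upper 16 bits of an f32.

    This truncates rather than rounds, which is exact for the values used below.
    """
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def bf16_bits_to_f32(bits: int) -> float:
    """Expand a bf16 bit pattern back to a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))
    return x


def shuffle_xor_int(lanes, mask, bits):
    """Toy model of an integer shuffle in xor mode: lane i reads lane i ^ mask."""
    n = len(lanes)
    assert n > 0 and n & (n - 1) == 0, "lane count must be a power of two"
    assert 0 <= mask < n, "mask must address a lane inside the subgroup"
    assert all(0 <= v < (1 << bits) for v in lanes), "value does not fit the type"
    return [lanes[i ^ mask] for i in range(n)]


def shuffle_xor_bf16(lanes_bits, mask):
    """bf16 shuffle: the bitcast to i16 leaves the bit pattern unchanged,
    so the value can go through the existing 16-bit integer path."""
    return shuffle_xor_int(lanes_bits, mask, 16)


def shuffle_xor_i1(lanes, mask):
    """i1 shuffle via the i8 fallback: zero-extend, shuffle, truncate."""
    widened = [int(b) for b in lanes]             # zext i1 -> i8
    shuffled = shuffle_xor_int(widened, mask, 8)  # existing i8 path
    return [bool(v & 1) for v in shuffled]        # trunc i8 -> i1


values = [1.0, -2.0, 0.5, 3.0]
bf16_lanes = [f32_to_bf16_bits(v) for v in values]
bf16_result = [bf16_bits_to_f32(b) for b in shuffle_xor_bf16(bf16_lanes, 1)]
i1_result = shuffle_xor_i1([True, False, False, True], 2)
```

Because a shuffle only moves bits between lanes, neither conversion changes the values: the bf16 bit patterns arrive intact, and the i1 values survive the round trip through i8.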