     simdgroup_load(data::MtlDeviceArray{T}, matrix_origin=(1, 1))

 Loads data from device or threadgroup memory into an 8x8 SIMD-group matrix
-and returns it. `T` must be either `Float16` or `Float32`.
+and returns it. `T` must be either `Float16`, `Float32`, or `BFloat16`.

 # Arguments
 - `matrix_origin::NTuple{2, Int64}=(1, 1)`: origin in the source memory to load from.
@@ -65,7 +65,7 @@ and returns it. `T` must be either `Float16` or `Float32`.
     simdgroup_store(src, dest::MtlDeviceArray{T}, matrix_origin=(1, 1))

 Stores data from an 8x8 SIMD-group matrix into device or threadgroup memory.
-`T` must be either `Float16` or `Float32`.
+`T` must be either `Float16`, `Float32`, or `BFloat16`.

 # Arguments
 - `matrix_origin::NTuple{2, Int64}=(1, 1)`: origin in the destination memory to store to.
@@ -119,7 +119,7 @@ The value for delta must be the same for all threads in the SIMD-group. This fun
 doesn’t modify the upper delta lanes of data because it doesn’t wrap values around
 the SIMD-group.

-T must be one of the following: Float32, Float16, Int32, UInt32, Int16, UInt16, Int8, or UInt8
+T must be one of the following: Float32, Float16, BFloat16, Int32, UInt32, Int16, UInt16, Int8, or UInt8
 """
 simd_shuffle_down

@@ -132,6 +132,6 @@ lane ID minus delta.
 The value of delta must be the same for all threads in a SIMD-group. This function doesn’t
 modify the lower delta lanes of data because it doesn’t wrap values around the SIMD-group.

-T must be one of the following: Float32, Float16, Int32, UInt32, Int16, UInt16, Int8, or UInt8
+T must be one of the following: Float32, Float16, BFloat16, Int32, UInt32, Int16, UInt16, Int8, or UInt8
 """
 simd_shuffle_up
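
For context, here is a minimal sketch of how `simd_shuffle_down` is typically used once an element type is accepted. It is not part of this diff. The kernel name `simd_sum_kernel` is hypothetical, and the sketch assumes a SIMD-group width of 32 lanes and a single SIMD-group per launch. It uses `Float32` so that it does not depend on this change; per the updated docstrings, a `BFloat16` element type should be usable in the same way, though that is not verified here.

```julia
using Metal

# Hypothetical kernel: sums the values held by the lanes of one SIMD-group.
# Assumes the SIMD-group is 32 lanes wide and the launch uses exactly 32 threads.
function simd_sum_kernel(input::MtlDeviceArray{T}, output::MtlDeviceArray{T}) where {T}
    lane = thread_index_in_simdgroup()  # 1-based lane index within the SIMD-group
    value = input[lane]

    # Tree reduction: at each step a lane adds the value held `delta` lanes above it.
    # The upper `delta` lanes are left unmodified because values do not wrap around,
    # so only the result in the first lane is meaningful at the end.
    value += simd_shuffle_down(value, 16)
    value += simd_shuffle_down(value, 8)
    value += simd_shuffle_down(value, 4)
    value += simd_shuffle_down(value, 2)
    value += simd_shuffle_down(value, 1)

    if lane == 1
        output[1] = value
    end
    return
end

input = MtlArray(ones(Float32, 32))
output = MtlArray(zeros(Float32, 1))

# Launch a single threadgroup of 32 threads and wait for it to finish.
Metal.@sync @metal threads=32 simd_sum_kernel(input, output)

result = Array(output)[1]  # copy the reduced value back to the host
```

The delta passed to each `simd_shuffle_down` call is a literal, which satisfies the docstring's requirement that it be the same for all threads in the SIMD-group.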