Conversation
Add signed 4-bit integer support to Fusilli's type system, buffer allocation, and test infrastructure. Type system: - Add Int4 struct with uint8_t storage, clamping to [-8, 7], and sign extension (include/fusilli/support/int_types.h) - Add packInt4/unpackInt4 utilities for nibble packing conversion - Add DataType::Int4 with MLIR type "si4" to FUSILLI_FORALL_DATA_TYPES - Add IreeHalElementType<Int4> mapping to IREE_HAL_ELEMENT_TYPE_SINT_4 Buffer allocation: - Add packForIree/unpackFromIree helpers: no-ops for byte-aligned types, nibble-pack/unpack for Int4. Wired into the primary Buffer::allocate<T> and Buffer::read<T> templates so no specializations are needed. Tests: - Add Int4 struct and packing unit tests (test_int_types.cpp) - Add Int4 buffer allocation and read roundtrip test Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
Demonstrates Int4 LHS with fp16 RHS through torch.aten.matmul. Verifies correct packed nibble data roundtrip through IREE runtime. Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
MaheshRavishankar
left a comment
There was a problem hiding this comment.
Interesting. This looks OK to me from my understanding of things. But this is the first sub-byte support, so maybe @sjain-stanford @AaronStGeorge or @rsuderman one of you wants to take a look as well?
I dont have any major comments, but I am happy to review again and stamp.
|
|
||
| // Signed 4-bit integer. Range: [-8, 7]. | ||
| // Stored in the low nibble of a uint8_t. | ||
| struct Int4 { |
There was a problem hiding this comment.
Just a high level question. Do we have existing type implementations in IREE that we can perhaps reuse? I'm fine with the concrete implementation here but it can be error prone and a maintenance overhead if we can reuse a more battle tested one that exists. We do the same for the float types (e.g. iree_math_f32_to_bf16 and iree_math_bf16_to_f32).
Also cc: @bjacob for visibility
There was a problem hiding this comment.
I couldn't find anything like this for int4 that would be helpful.
There was a problem hiding this comment.
Similar to the functions @sjain-stanford mentioned, my general suggestion here is: do you really need a class, how about just having some free functions? Part of the tension in trying to make this a class is palpable in how Int4::kBitWidth != 8 * sizeof(Int4). The class appears to be mostly a grab-bag of utility functions, so these might as well be freestanding.
There was a problem hiding this comment.
The main motivation behind using a class is for templates. For example, it allows us to do Buffer::allocate<Int4> instead of passing a std::vector<int8_t> + an additional parameter to tell the Buffer::allocate it's an int4. I added Int4 to making testing easier at the cost of performance. HipDNN shouldn't need to interact with the int4 class at all, just fusilli::DataType::Int4.
There was a problem hiding this comment.
I can fix this if there is a better way to do it. But for now I'll merge this to get fusilli::DataType::Int4 in for the plugin.
Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
Clamping to [-8, 7] already guarantees the value fits in 4 bits. The additional & 0x0F mask was a no-op. Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
sjain-stanford
left a comment
There was a problem hiding this comment.
LGTM modulo hyper nits
Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
Add signed 4-bit integer support to Fusilli's type system, buffer allocation, and testing.