Add int4 datatype support by IanWood1 · Pull Request #268 · iree-org/fusilli

IanWood1 · 2026-03-24T19:08:46Z

Add signed 4-bit integer support to Fusilli's type system, buffer allocation, and testing.

Add signed 4-bit integer support to Fusilli's type system, buffer allocation, and test infrastructure.

Type system:
- Add Int4 struct with uint8_t storage, clamping to [-8, 7], and sign extension (include/fusilli/support/int_types.h)
- Add packInt4/unpackInt4 utilities for nibble packing conversion
- Add DataType::Int4 with MLIR type "si4" to FUSILLI_FORALL_DATA_TYPES
- Add IreeHalElementType<Int4> mapping to IREE_HAL_ELEMENT_TYPE_SINT_4

Buffer allocation:
- Add packForIree/unpackFromIree helpers: no-ops for byte-aligned types, nibble-pack/unpack for Int4. Wired into the primary Buffer::allocate<T> and Buffer::read<T> templates so no specializations are needed.

Tests:
- Add Int4 struct and packing unit tests (test_int_types.cpp)
- Add Int4 buffer allocation and read roundtrip test

Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>
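For readers unfamiliar with nibble packing, the clamping, sign extension, and pack/unpack steps listed in the commit message can be sketched roughly as follows. This is not the code from the PR: `Int4Sketch`, `packNibbles`, and `unpackNibbles` are hypothetical names, and the nibble order (even indices in the low nibble) is an assumption that may differ from what packInt4/unpackInt4 actually do.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the PR's Int4: one value per byte, kept in the
// low nibble as a 4-bit two's-complement pattern.
struct Int4Sketch {
  uint8_t bits = 0; // only the low 4 bits are meaningful

  Int4Sketch() = default;
  // Clamp to [-8, 7], then keep the low 4 bits of the two's-complement value.
  explicit Int4Sketch(int v)
      : bits(static_cast<uint8_t>(std::clamp(v, -8, 7) & 0x0F)) {}

  // Sign-extend the 4-bit pattern back to a full int.
  int toInt() const {
    return (bits & 0x08) ? static_cast<int>(bits) - 16
                         : static_cast<int>(bits);
  }
};

// Pack two values per byte. Assumption: even indices go in the low nibble,
// odd indices in the high nibble. An odd count leaves the last high nibble 0.
inline std::vector<uint8_t> packNibbles(const std::vector<Int4Sketch> &vals) {
  std::vector<uint8_t> out((vals.size() + 1) / 2, 0);
  for (size_t i = 0; i < vals.size(); ++i) {
    const unsigned nib = vals[i].bits & 0x0Fu;
    const unsigned shifted = (i % 2 == 0) ? nib : (nib << 4);
    out[i / 2] = static_cast<uint8_t>(out[i / 2] | shifted);
  }
  return out;
}

// Inverse of packNibbles. The caller must ensure count <= 2 * bytes.size().
inline std::vector<Int4Sketch> unpackNibbles(const std::vector<uint8_t> &bytes,
                                             size_t count) {
  std::vector<Int4Sketch> out(count);
  for (size_t i = 0; i < count; ++i) {
    const uint8_t b = bytes[i / 2];
    out[i].bits = static_cast<uint8_t>((i % 2 == 0) ? (b & 0x0F) : (b >> 4));
  }
  return out;
}
```

Under these assumptions, three values occupy two bytes, and unpacking needs the original element count because the packed buffer alone cannot distinguish a trailing zero from padding.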

Demonstrates Int4 LHS with fp16 RHS through torch.aten.matmul. Verifies correct packed nibble data roundtrip through IREE runtime. Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>

MaheshRavishankar

Interesting. This looks OK to me from my understanding of things. But this is the first sub-byte support, so maybe one of @sjain-stanford, @AaronStGeorge, or @rsuderman wants to take a look as well?

I don't have any major comments, but I am happy to review again and stamp.

include/fusilli/backend/runtime.h

include/fusilli/support/int_types.h

sjain-stanford · 2026-04-07T17:41:27Z

include/fusilli/support/int_types.h

+
+// Signed 4-bit integer. Range: [-8, 7].
+// Stored in the low nibble of a uint8_t.
+struct Int4 {


Just a high-level question. Do we have existing type implementations in IREE that we can perhaps reuse? I'm fine with the concrete implementation here, but it can be error-prone and a maintenance overhead, so it would be better to reuse a more battle-tested one if it exists. We do the same for the float types (e.g. iree_math_f32_to_bf16 and iree_math_bf16_to_f32).

Also cc: @bjacob for visibility

I couldn't find anything like this for int4 that would be helpful.

Similar to the functions @sjain-stanford mentioned, my general suggestion here is: do you really need a class, how about just having some free functions? Part of the tension in trying to make this a class is palpable in how Int4::kBitWidth != 8 * sizeof(Int4). The class appears to be mostly a grab-bag of utility functions, so these might as well be freestanding.
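A rough sketch of what the free-function alternative could look like, for illustration only. `encodeInt4`, `decodeInt4`, and `packedByteSize` are hypothetical names, not functions from Fusilli or IREE. The size helper makes the bit-width mismatch explicit: storage is computed from the bit width rather than from `sizeof`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Clamp to the signed 4-bit range and return the two's-complement nibble
// in the low 4 bits of a byte.
constexpr uint8_t encodeInt4(int v) {
  return static_cast<uint8_t>(std::clamp(v, -8, 7) & 0x0F);
}

// Sign-extend the low nibble of a byte back to an int in [-8, 7].
constexpr int decodeInt4(uint8_t nibble) {
  return (nibble & 0x08) ? (nibble & 0x0F) - 16 : (nibble & 0x0F);
}

// Bytes needed for `count` tightly packed elements of `bitWidth` bits each.
// For sub-byte types this is smaller than count * sizeof(storage type).
constexpr size_t packedByteSize(size_t count, size_t bitWidth) {
  return (count * bitWidth + 7) / 8;
}

static_assert(packedByteSize(3, 4) == 2); // three int4 values fit in two bytes
static_assert(decodeInt4(encodeInt4(-8)) == -8);
```

With this shape, callers work on plain integer buffers, and the "4 bits wide but 1 byte of storage" distinction lives in one function argument instead of a class constant.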

The main motivation behind using a class is for templates. For example, it allows us to do Buffer::allocate<Int4> instead of passing a std::vector<int8_t> plus an additional parameter to tell Buffer::allocate it's an int4. I added Int4 to make testing easier at the cost of performance. HipDNN shouldn't need to interact with the Int4 class at all, just fusilli::DataType::Int4.
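To illustrate the template argument, here is a minimal sketch of how a distinct element type lets a single function template pick the packing path at compile time, with no specializations. `Int4Elem` and `toDeviceBytes` are hypothetical names; the real Buffer::allocate<T> and packForIree in the PR may be structured differently, and the nibble order is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical sub-byte element type: one value per byte, low nibble used.
struct Int4Elem {
  uint8_t bits = 0;
};

// Convert host elements to the byte layout handed to the runtime.
// Byte-aligned types are copied as-is; Int4Elem is nibble-packed.
template <typename T>
std::vector<uint8_t> toDeviceBytes(const std::vector<T> &host) {
  if constexpr (std::is_same_v<T, Int4Elem>) {
    // Two elements per byte; assumption: even index in the low nibble.
    std::vector<uint8_t> out((host.size() + 1) / 2, 0);
    for (size_t i = 0; i < host.size(); ++i) {
      const unsigned nib = host[i].bits & 0x0Fu;
      const unsigned shifted = (i % 2 == 0) ? nib : (nib << 4);
      out[i / 2] = static_cast<uint8_t>(out[i / 2] | shifted);
    }
    return out;
  } else {
    static_assert(std::is_trivially_copyable_v<T>,
                  "byte-aligned path copies raw bytes");
    std::vector<uint8_t> out(host.size() * sizeof(T));
    if (!out.empty())
      std::memcpy(out.data(), host.data(), out.size());
    return out;
  }
}
```

The call site stays uniform, e.g. `toDeviceBytes<float>(data)` and `toDeviceBytes<Int4Elem>(data)`, which is the convenience the class buys over an untyped byte vector plus a separate data-type flag.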

I can fix this if there is a better way to do it. But for now I'll merge this to get fusilli::DataType::Int4 in for the plugin.

tests/test_int_types.cpp

samples/matmul/matmul_int4_fp16.cpp

tests/test_int_types.cpp

Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>

Clamping to [-8, 7] already guarantees the value fits in 4 bits. The additional & 0x0F mask was a no-op. Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>

tests/test_int_types.cpp

tests/test_buffer.cpp

tests/test_int_types.cpp

sjain-stanford

LGTM modulo hyper nits

Signed-off-by: Ian Wood <ianwood@u.northwestern.edu>

IanWood1 force-pushed the int4-support branch from 3888d2e to 0703bf5 Compare March 24, 2026 19:12

IanWood1 force-pushed the int4-support branch from 0703bf5 to e6ae930 Compare March 25, 2026 17:18

Add int4 x fp16 mixed-precision matmul sample

d19d118


IanWood1 force-pushed the int4-support branch from e6ae930 to d19d118 Compare March 25, 2026 17:24

IanWood1 marked this pull request as ready for review March 25, 2026 17:38

IanWood1 requested review from MaheshRavishankar, rsuderman and sjain-stanford and removed request for sjain-stanford March 25, 2026 17:38

MaheshRavishankar reviewed Mar 30, 2026
