@nirmie (Contributor) commented Jan 21, 2026

  • Packs f32 into fp6 and unpacks back to f32 for matmul
  • Pytests for quantization error, matmul error, and LUT rounding
  • Does not use native CDNA4 FP6 instructions; dequantization is emulated in software ("fake" dequantization)
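For readers unfamiliar with the format, here is a minimal sketch of what rounding a float to FP6 E2M3 through a lookup table can look like. It is not the code in this PR: the helper names `e2m3_decode`, `e2m3_encode`, and `POS_LUT` are hypothetical, and the layout assumed is the OCP MX one (1 sign bit, 2 exponent bits, 3 mantissa bits, exponent bias 1, no Inf/NaN).

```python
def e2m3_decode(code):
    """Decode a 6-bit E2M3 code (assumed layout: s eee? no: 1 sign, 2 exp, 3 mantissa, bias 1)."""
    sign = -1.0 if code & 0x20 else 1.0
    exp = (code >> 3) & 0x3
    man = code & 0x7
    if exp == 0:
        mag = man / 8.0                              # subnormal: 0.mmm * 2**0
    else:
        mag = (1.0 + man / 8.0) * 2.0 ** (exp - 1)   # normal: 1.mmm * 2**(exp - bias)
    return sign * mag


# Lookup table of the 32 non-negative representable values (codes 0..31).
POS_LUT = [e2m3_decode(c) for c in range(32)]


def e2m3_encode(x):
    """Round x to the nearest E2M3 code, saturating at the largest magnitude (7.5)."""
    mag = min(abs(x), POS_LUT[-1])
    # Nearest entry wins; on an exact tie, prefer the code with an even mantissa bit.
    code = min(range(32), key=lambda c: (abs(POS_LUT[c] - mag), c & 1))
    return code | (0x20 if x < 0 else 0)


roundtrip = e2m3_decode(e2m3_encode(-0.3))  # nearest representable value to -0.3
```

A linear scan over 32 entries keeps the sketch readable; a vectorized version would more likely use a bucketized search over the table midpoints.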

I will soon work on adding native CDNA4 MXFP6 operations.

Signed-off-by: Nirmal Senthilkumar <[email protected]>
# Calculate block-wise scales (Microscaling)
x_reshaped = x.view(-1, block_size)
amax = x_reshaped.abs().max(dim=1, keepdim=True).values
scales = amax / MAX_E2M3

`scales` is supposed to be a power-of-two value; `amax / MAX_E2M3` is generally not one.
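One way to address this, sketched in plain Python for a single block: derive the scale from the exponent of the block maximum instead of dividing by `MAX_E2M3`. This follows the convention I understand the OCP MX specification to use (scale = 2^(floor(log2(amax)) - emax), with emax = 2 for E2M3). The helper name `pow2_block_scale` and the constant `EMAX_E2M3` are hypothetical and not part of this PR, and clamping to the E8M0 exponent range is left out.

```python
import math

MAX_E2M3 = 7.5   # largest finite E2M3 magnitude
EMAX_E2M3 = 2    # exponent of that value: 7.5 = 1.875 * 2**2 (assumed constant name)


def pow2_block_scale(block):
    """Return a power-of-two scale for one block of floats."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0  # all-zero block: any scale works, 1.0 is an arbitrary choice
    # frexp gives amax = m * 2**e with 0.5 <= m < 1, so floor(log2(amax)) == e - 1.
    _, e = math.frexp(amax)
    return math.ldexp(1.0, (e - 1) - EMAX_E2M3)


block = [0.1, -3.0, 0.75, 6.0]
scale = pow2_block_scale(block)
scaled = [v / scale for v in block]  # values to be rounded to E2M3 afterwards
```

With this rule the scaled maximum lies in [4, 8), so values between 7.5 and 8 still have to saturate to 7.5 when rounded. In the tensor code above, the equivalent would be along the lines of `torch.exp2(torch.floor(torch.log2(amax)) - 2)`, with zero blocks handled separately.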
