[quantization] Is there any plan to support 6-bit quantization, as 6-bit quantization is more efficient than 8-bit on ARM CPU? #15432
Replies: 4 comments
-
Hey, this is the MXNet Label Bot.
-
Thanks for the proposal.
-
Current ARM SIMD does not support int32 += int8 × int8 (except on Cortex-A55/A75), so unfortunately 8-bit MACs overflow inside a 4×4 GEMM block: 127 × 127 × 4 exceeds the int16 range, so int8 values must first be widened to int16 and then accumulated as int32 += int16 × int16. 6-bit values, however, do not overflow within a 4×4 GEMM block, so 6-bit quantization can directly use the int16 += int8 × int8 MAC, which is more efficient than 8-bit.
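A quick sanity check of the overflow arithmetic in that comment (a sketch; it assumes worst-case 8-bit magnitudes of 127 and 6-bit magnitudes of 63, with four products summed per MAC step as in the 4×4 block):

```python
# int16 accumulator limit
INT16_MAX = 32767

# 8-bit quantization: magnitudes up to 127; summing four products at once
# gives 127*127*4 = 64516 > 32767, so an int16 accumulator overflows and
# operands must first be widened to int16 (int32 += int16*int16).
worst_case_8bit = 127 * 127 * 4

# 6-bit quantization: magnitudes up to 63; 63*63*4 = 15876 <= 32767,
# so the cheaper int16 += int8*int8 MAC can be used directly.
worst_case_6bit = 63 * 63 * 4

print(worst_case_8bit, worst_case_8bit > INT16_MAX)   # 64516 True
print(worst_case_6bit, worst_case_6bit <= INT16_MAX)  # 15876 True
```

This is why the comment singles out Cortex-A55/A75: with a native int32 += int8 × int8 dot product, the 8-bit worst case never has to fit in 16 bits in the first place.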
-
@tianylijun thanks for the explanation. Tensor Cores compute the 4×4 GEMM with INT8 and seem to handle the overflow. Did you have a chance to look into the difference?
-
Is there any plan to support 6-bit quantization? Because of the data-overflow risk with 8-bit MACs, 6-bit quantization is more efficient than 8-bit on ARM CPUs.