Regarding QGEMM contrib op definition #10815
Hi, this is regarding the definition of the QGEMM contrib operator. I see that there is no beta parameter for this op, whereas it is present in the Gemm operator (Y = alpha * A * B + beta * C). Even though there is no beta, the definition still uses beta in the computation of the scale of C, as mentioned below.
Could someone please clarify this?
Replies: 1 comment 2 replies
I now understand why beta is not required for QGemm, so this question can be closed.
Below is my understanding:
A = sA * (qA - zA), where qA is the quantized A, sA is its scale, and zA is its zero point.
B = sB * (qB - zB)
C = sC * (qC - zC)
As per the definition, zC = 0 and sC = alpha/beta * sA * sB, so
C = alpha/beta * sA * sB * qC
Gemm = alpha * A * B + beta * C
QGemm = alpha * sA * (qA - zA) * sB * (qB - zB) + beta * (alpha/beta * sA * sB * qC)
      = alpha * sA * (qA - zA) * sB * (qB - zB) + alpha * sA * sB * qC
The beta factors cancel, so QGemm does not need a separate beta parameter.
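The cancellation can also be checked numerically. The following is only a minimal plain-Python sketch of the algebra above, not the actual QGemm kernel: the helper names (`matmul`, `gemm_float`, `qgemm_no_beta`, `all_close`) and all sample values are made up for illustration, and the output requantization step is left out.

```python
import math

def matmul(X, Y):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def gemm_float(qA, qB, qC, sA, zA, sB, zB, alpha, beta):
    """Gemm on dequantized inputs: alpha * A * B + beta * C (beta must be nonzero)."""
    sC = alpha / beta * sA * sB  # scale required for C, with zC = 0
    A = [[sA * (q - zA) for q in row] for row in qA]
    B = [[sB * (q - zB) for q in row] for row in qB]
    C = [[sC * q for q in row] for row in qC]
    AB = matmul(A, B)
    return [[alpha * ab + beta * c for ab, c in zip(r_ab, r_c)]
            for r_ab, r_c in zip(AB, C)]

def qgemm_no_beta(qA, qB, qC, sA, zA, sB, zB, alpha):
    """Beta-free form: alpha * sA * sB * ((qA - zA)(qB - zB) + qC)."""
    acc = matmul([[q - zA for q in row] for row in qA],
                 [[q - zB for q in row] for row in qB])  # integer accumulator
    return [[alpha * sA * sB * (a + c) for a, c in zip(r_a, r_c)]
            for r_a, r_c in zip(acc, qC)]

def all_close(M, N):
    """Element-wise floating-point comparison of two matrices."""
    return all(math.isclose(m, n, rel_tol=1e-9, abs_tol=1e-9)
               for r_m, r_n in zip(M, N) for m, n in zip(r_m, r_n))

# Hypothetical sample data: 8-bit style A and B, 32-bit style C with zC = 0.
qA = [[10, 200, 37], [0, 255, 128]]
qB = [[5, 250], [130, 90], [60, 17]]
qC = [[1200, -340], [-75, 9000]]
sA, zA, sB, zB, alpha = 0.02, 3, 0.05, 128, 0.5

# Whatever nonzero beta is used to define sC, the result matches the beta-free form.
for beta in (0.5, 2.0, 7.0):
    ref = gemm_float(qA, qB, qC, sA, zA, sB, zB, alpha, beta)
    print(beta, all_close(ref, qgemm_no_beta(qA, qB, qC, sA, zA, sB, zB, alpha)))
```

Because sC already folds in alpha/beta, the caller has to pick beta when quantizing C; the op itself only ever sees alpha * sA * sB.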