MPY9 Vector Operations
Shahab edited this page Sep 30, 2021 · 6 revisions
Two 16-bit element vectors:
Mnemonic | OpB type | OpC type | Operation
----------|----------|----------|------------------
vadd2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1, b.h0) + (c.h1, c.h0)
vsub2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1, b.h0) - (c.h1, c.h0)
dmpyh | int16_t | int16_t | ACC = (b.h1 * c.h1) + (b.h0 * c.h0); a = ACC.w0
dmpyhu | uint16_t | uint16_t | a = ACC = (b.h1 * c.h1) + (b.h0 * c.h0)
dmach | int16_t | int16_t | a = ACC = ACC + (b.h1 * c.h1) + (b.h0 * c.h0)
dmachu | uint16_t | uint16_t | a = ACC = ACC + (unsigned) (b.h1 * c.h1) + (unsigned) (b.h0 * c.h0)
vaddsub2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1 + b.h0) (c.h1 - c.h0)
vsubadd2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1 - b.h0) (c.h1 + c.h0)
vmpy2h | int16_t | int16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (b.h1, b.h0) * (c.h1, c.h0)
vmpy2hu | uint16_t | uint16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (unsigned)(b.h1, b.h0) * (unsigned)(c.h1, c.h0)
vmac2h | int16_t | int16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (ACC.w1, ACC.w0) + (b.h1, b.h0) * (c.h1, c.h0)
vmac2hu | uint16_t | uint16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (ACC.w1, ACC.w0) + (unsigned)(b.h1, b.h0) * (unsigned)(c.h1, c.h0)
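
The rows above can be read as operations on two packed 16-bit lanes. The following is a minimal C sketch of `vadd2h`, `dmpyh` and `dmach`, not the hardware definition. It assumes that `h0` is bits 15:0 and `h1` is bits 31:16, that lane additions wrap rather than saturate, that both `dmpyh` terms are products, and that the accumulator can be modeled as a plain 64-bit value. All function names (`pack2h`, `model_vadd2h`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed lane layout: h0 = bits 15:0, h1 = bits 31:16 of a 32-bit register. */
static int16_t lane_h0(uint32_t r) { return (int16_t)(uint16_t)(r & 0xFFFFu); }
static int16_t lane_h1(uint32_t r) { return (int16_t)(uint16_t)(r >> 16); }

/* Hypothetical helper: build a register value from two signed halves. */
static uint32_t pack2h(int16_t h1, int16_t h0)
{
    return ((uint32_t)(uint16_t)h1 << 16) | (uint32_t)(uint16_t)h0;
}

/* vadd2h: lane-wise add; each lane is assumed to wrap modulo 2^16. */
static uint32_t model_vadd2h(uint32_t b, uint32_t c)
{
    uint32_t lo = ((b & 0xFFFFu) + (c & 0xFFFFu)) & 0xFFFFu;
    uint32_t hi = ((b >> 16) + (c >> 16)) & 0xFFFFu;
    return (hi << 16) | lo;
}

/* dmpyh: ACC = b.h1 * c.h1 + b.h0 * c.h0, with signed lanes. */
static int64_t model_dmpyh(uint32_t b, uint32_t c)
{
    return (int64_t)lane_h1(b) * lane_h1(c) + (int64_t)lane_h0(b) * lane_h0(c);
}

/* dmach: the same dot product, added to the previous accumulator value. */
static int64_t model_dmach(int64_t acc, uint32_t b, uint32_t c)
{
    return acc + model_dmpyh(b, c);
}

/* Usage sketch with small hand-checked values. */
void two_16bit_example(void)
{
    uint32_t b = pack2h(3, -2);
    uint32_t c = pack2h(4, 5);
    assert(model_dmpyh(b, c) == 2);      /* 3*4 + (-2)*5 */
    assert(model_dmach(10, b, c) == 12); /* 10 + 2 */
    /* The h0 lane wraps from 0x7FFF to 0x8000 without carrying into h1. */
    assert(model_vadd2h(0x00017FFFu, 0x00020001u) == 0x00038000u);
}
```

In this model the destination register `a` of `dmpyh` would be the low 32 bits of the returned value, matching `a = ACC.w0` in the table.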
Asymmetric 2 element vector:
Mnemonic | OpB type | OpC type | Operation
----------|----------|----------|------------------
dmpywh | int32_t | int16_t | A = ACC = (B.w1 * c.h1) + (B.w0 * c.h0)
dmpywhu | uint32_t | uint16_t | A = ACC = (unsigned)(B.w1 * c.h1) + (unsigned)(B.w0 * c.h0)
dmacwh | int32_t | int16_t | A = ACC = ACC + (B.w1 * c.h1) + (B.w0 * c.h0)
dmacwhu | uint32_t | uint16_t | A = ACC = ACC + (unsigned)(B.w1 * c.h1) + (unsigned)(B.w0 * c.h0)
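
The asymmetric forms pair each 32-bit word of `B` with the 16-bit half of `c` that has the same index. The sketch below models that reading in C. It assumes `B` is a 64-bit register pair with `w0` in bits 31:0, `c` is a 32-bit register with `h0` in bits 15:0, both terms are products, and the accumulator is a plain 64-bit value. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* dmpywh: signed 32-bit words of B times signed 16-bit halves of c. */
static int64_t model_dmpywh(uint64_t b, uint32_t c)
{
    int32_t b_w0 = (int32_t)(uint32_t)b;
    int32_t b_w1 = (int32_t)(uint32_t)(b >> 32);
    int16_t c_h0 = (int16_t)(uint16_t)(c & 0xFFFFu);
    int16_t c_h1 = (int16_t)(uint16_t)(c >> 16);
    return (int64_t)b_w1 * c_h1 + (int64_t)b_w0 * c_h0;
}

/* dmpywhu: the same pairing with every element treated as unsigned. */
static uint64_t model_dmpywhu(uint64_t b, uint32_t c)
{
    uint64_t b_w0 = (uint32_t)b;
    uint64_t b_w1 = b >> 32;
    return b_w1 * (c >> 16) + b_w0 * (c & 0xFFFFu);
}

/* dmacwh: add the dmpywh result to the previous accumulator (wrapping). */
static int64_t model_dmacwh(int64_t acc, uint64_t b, uint32_t c)
{
    return (int64_t)((uint64_t)acc + (uint64_t)model_dmpywh(b, c));
}

/* Usage sketch: B.w1 = -3 (0xFFFFFFFD), B.w0 = 100000, c.h1 = 2, c.h0 = 7. */
void asymmetric_example(void)
{
    uint64_t b = 0xFFFFFFFD000186A0ull;
    uint32_t c = 0x00020007u;
    assert(model_dmpywh(b, c) == 699994);          /* -3*2 + 100000*7 */
    assert(model_dmpywhu(b, c) == 8590634586ull);  /* 4294967293*2 + 700000 */
    assert(model_dmacwh(6, b, c) == 700000);       /* 6 + 699994 */
}
```

The signed and unsigned results differ only because `B.w1` has its top bit set, which is the case the `u` variants exist to distinguish.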
Four element vector:
Mnemonic | OpB type | OpC type | Operation
----------|----------|----------|------------------
qmpyh | int64_t | int64_t | A = ACC = (B.h3 * C.h3) + (B.h2 * C.h2) + (B.h1 * C.h1) + (B.h0 * C.h0)
qmpyhu | uint64_t | uint64_t | A = ACC = (unsigned)(B.h3 * C.h3) + (unsigned)(B.h2 * C.h2) + (unsigned)(B.h1 * C.h1) + (unsigned)(B.h0 * C.h0)
qmach | int64_t | int64_t | A = ACC = ACC + (B.h3 * C.h3) + (B.h2 * C.h2) + (B.h1 * C.h1) + (B.h0 * C.h0)
qmachu | uint64_t | uint64_t | A = ACC = ACC + (unsigned)(B.h3 * C.h3) + (unsigned)(B.h2 * C.h2) + (unsigned)(B.h1 * C.h1) + (unsigned)(B.h0 * C.h0)
vadd4h | int64_t | int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3, B.h2, B.h1, B.h0) + (C.h3, C.h2, C.h1, C.h0)
vsub4h | int64_t | int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3, B.h2, B.h1, B.h0) - (C.h3, C.h2, C.h1, C.h0)
vaddsub4h | int64_t | int64_t | A = (B.h3 + C.h3) (B.h2 - C.h2) (B.h1 + C.h1) (B.h0 - C.h0)
vsubadd4h | int64_t | int64_t | A = (B.h3 - C.h3) (B.h2 + C.h2) (B.h1 - C.h1) (B.h0 + C.h0)
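
As an illustration of the four-lane forms, here is a hedged C sketch of `qmpyh`, `qmpyhu` and `vadd4h`. It assumes the 64-bit operand holds `h0` in bits 15:0 up to `h3` in bits 63:48, that each `qmpyh` term is a product of lanes with the same index, and that `vadd4h` lanes wrap rather than saturate. The helper and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed lane layout: lane i occupies bits (16*i + 15):(16*i). */
static uint16_t lane_u16(uint64_t r, unsigned i) { return (uint16_t)(r >> (16u * i)); }
static int16_t lane_s16(uint64_t r, unsigned i) { return (int16_t)lane_u16(r, i); }

/* Hypothetical helper: build a 64-bit operand from four signed halves. */
static uint64_t pack4h(int16_t h3, int16_t h2, int16_t h1, int16_t h0)
{
    return ((uint64_t)(uint16_t)h3 << 48) | ((uint64_t)(uint16_t)h2 << 32) |
           ((uint64_t)(uint16_t)h1 << 16) | (uint64_t)(uint16_t)h0;
}

/* qmpyh: signed four-element dot product. */
static int64_t model_qmpyh(uint64_t b, uint64_t c)
{
    int64_t acc = 0;
    for (unsigned i = 0; i < 4; i++)
        acc += (int64_t)lane_s16(b, i) * lane_s16(c, i);
    return acc;
}

/* qmpyhu: the same dot product with unsigned lanes. */
static uint64_t model_qmpyhu(uint64_t b, uint64_t c)
{
    uint64_t acc = 0;
    for (unsigned i = 0; i < 4; i++)
        acc += (uint64_t)lane_u16(b, i) * lane_u16(c, i);
    return acc;
}

/* vadd4h: lane-wise add; each lane is assumed to wrap modulo 2^16. */
static uint64_t model_vadd4h(uint64_t b, uint64_t c)
{
    uint64_t a = 0;
    for (unsigned i = 0; i < 4; i++) {
        uint16_t sum = (uint16_t)(lane_u16(b, i) + lane_u16(c, i));
        a |= (uint64_t)sum << (16u * i);
    }
    return a;
}

/* Usage sketch with small hand-checked values. */
void four_element_example(void)
{
    /* 1*5 + (-2)*6 + 3*7 + (-4)*8 = -18 */
    assert(model_qmpyh(pack4h(1, -2, 3, -4), pack4h(5, 6, 7, 8)) == -18);
    /* Lane 0xFFFF is -1 when signed and 65535 when unsigned. */
    assert(model_qmpyh(pack4h(0, 0, 0, -1), pack4h(0, 0, 0, 2)) == -2);
    assert(model_qmpyhu(pack4h(0, 0, 0, -1), pack4h(0, 0, 0, 2)) == 131070);
    assert(model_vadd4h(pack4h(1, 2, 3, 0x7FFF), pack4h(1, 1, 1, 1))
           == 0x0002000300048000ull);
}
```

A `qmach` model would add the `model_qmpyh` result to a previous accumulator value in the same way the table adds `ACC` on the right-hand side.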
Two 32-bit element vectors:
Mnemonic | OpB type | OpC type | Operation
----------|----------|----------|------------------
vadd2 | int32_t | int32_t | (A.w1, A.w0) = (B.w1, B.w0) + (C.w1, C.w0)
vsub2 | int32_t | int32_t | (A.w1, A.w0) = (B.w1, B.w0) - (C.w1, C.w0)
vaddsub | int32_t | int32_t | (A.w1, A.w0) = (B.w1 + B.w0),(C.w1 - C.w0)
vsubadd | int32_t | int32_t | (A.w1, A.w0) = (B.w1 - B.w0),(C.w1 + C.w0)
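
For the plain lane-wise forms, the point worth showing is that the two 32-bit lanes are independent: a carry or borrow in `w0` does not reach `w1`. The sketch below models `vadd2` and `vsub2` only, assuming `w0` is bits 31:0 of a 64-bit register pair and that lanes wrap rather than saturate. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* vadd2: (A.w1, A.w0) = (B.w1, B.w0) + (C.w1, C.w0), each lane modulo 2^32. */
static uint64_t model_vadd2(uint64_t b, uint64_t c)
{
    uint32_t w0 = (uint32_t)b + (uint32_t)c;
    uint32_t w1 = (uint32_t)(b >> 32) + (uint32_t)(c >> 32);
    return ((uint64_t)w1 << 32) | w0;
}

/* vsub2: (A.w1, A.w0) = (B.w1, B.w0) - (C.w1, C.w0), each lane modulo 2^32. */
static uint64_t model_vsub2(uint64_t b, uint64_t c)
{
    uint32_t w0 = (uint32_t)b - (uint32_t)c;
    uint32_t w1 = (uint32_t)(b >> 32) - (uint32_t)(c >> 32);
    return ((uint64_t)w1 << 32) | w0;
}

/* Usage sketch: overflow and underflow in w0 leave w1 untouched. */
void two_32bit_example(void)
{
    assert(model_vadd2(0x00000001FFFFFFFFull, 0x0000000100000001ull)
           == 0x0000000200000000ull);
    assert(model_vsub2(0x0000000500000000ull, 0x0000000100000001ull)
           == 0x00000004FFFFFFFFull);
}
```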