MPY9 Vector Operations

Claudiu Zissulescu edited this page Sep 24, 2021 · 6 revisions

Two 16-bit element vectors:

| Mnemonic | OpB type | OpC type | Operation |
|---|---|---|---|
| vadd2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1, b.h0) + (c.h1, c.h0) |
| vsub2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1, b.h0) - (c.h1, c.h0) |
| dmpyh | int16_t | int16_t | a = ACC = (b.h1 * c.h1) + (b.h0 * c.h0) |
| dmpyhu | uint16_t | uint16_t | a = ACC = (b.h1 * c.h1) + (b.h0 * c.h0) |
| dmach | int16_t | int16_t | a = ACC = ACC + (b.h1 * c.h1) + (b.h0 * c.h0) |
| dmachu | uint16_t | uint16_t | a = ACC = ACC + (unsigned)(b.h1 * c.h1) + (unsigned)(b.h0 * c.h0) |
| vaddsub2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1 + b.h0) (c.h1 - c.h0) |
| vsubadd2h | int16_t | int16_t | (a.h1, a.h0) = (b.h1 - b.h0) (c.h1 + c.h0) |
| vmpy2h | int16_t | int16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (b.h1, b.h0) * (c.h1, c.h0) |
| vmpy2hu | uint16_t | uint16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (unsigned)(b.h1, b.h0) * (unsigned)(c.h1, c.h0) |
| vmac2h | int16_t | int16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (ACC.w1, ACC.w0) + (b.h1, b.h0) * (c.h1, c.h0) |
| vmac2hu | uint16_t | uint16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (ACC.w1, ACC.w0) + (unsigned)(b.h1, b.h0) * (unsigned)(c.h1, c.h0) |
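
As a rough illustration of the dual 16-bit operations, the following C sketch models `dmpyh`, `dmach` and `vmpy2h` lane by lane. It is not taken from any ARC header: the types `v2h_t` and `v2w_t` and the `model_*` functions are hypothetical names. It assumes that each dual multiply is a sum of two lane products (so the second term is `b.h0 * c.h0`), that result word `w0` pairs with lane `h0`, and that the accumulator is wide enough to hold the exact sum, which is why it is modeled as `int64_t`.

```c
#include <stdint.h>

/* Hypothetical lane views; these are illustration-only types. */
typedef struct { int16_t h1, h0; } v2h_t;  /* two 16-bit lanes of one operand */
typedef struct { int32_t w1, w0; } v2w_t;  /* two 32-bit lanes of a result pair */

/* dmpyh: sum of the two lane products.
   Assumption: the accumulator holds the exact sum, so it is an int64_t here. */
static int64_t model_dmpyh(v2h_t b, v2h_t c) {
    return (int64_t)b.h1 * c.h1 + (int64_t)b.h0 * c.h0;
}

/* dmach: the same two-lane dot product, added to the previous accumulator. */
static int64_t model_dmach(int64_t acc, v2h_t b, v2h_t c) {
    return acc + model_dmpyh(b, c);
}

/* vmpy2h: lane-wise multiply; each 16x16 product always fits in 32 bits.
   Assumption: w1 pairs with h1 and w0 pairs with h0. */
static v2w_t model_vmpy2h(v2h_t b, v2h_t c) {
    v2w_t a;
    a.w1 = (int32_t)b.h1 * c.h1;
    a.w0 = (int32_t)b.h0 * c.h0;
    return a;
}
```

With b = (h1 = 3, h0 = -2) and c = (h1 = 4, h0 = 5), the dot product is 12 - 10 = 2, and the lane-wise multiply yields (w1 = 12, w0 = -10).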

Asymmetric two-element vectors:

| Mnemonic | OpB type | OpC type | Operation |
|---|---|---|---|
| dmpywh | int32_t | int16_t | A = ACC = (B.w1 * c.h1) + (B.w0 * c.h0) |
| dmpywhu | uint32_t | uint16_t | A = ACC = (unsigned)(B.w1 * c.h1) + (unsigned)(B.w0 * c.h0) |
| dmacwh | int32_t | int16_t | A = ACC = ACC + (B.w1 * c.h1) + (B.w0 * c.h0) |
| dmacwhu | uint32_t | uint16_t | A = ACC = ACC + (unsigned)(B.w1 * c.h1) + (unsigned)(B.w0 * c.h0) |
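
The asymmetric forms pair 32-bit lanes from operand B with 16-bit lanes from operand c. A minimal C sketch of that pairing follows, under the assumption that each operation is a sum of two lane products (the second term being `B.w0 * c.h0`) and that the accumulator can hold the exact result. The type and function names (`wide2_t`, `half2_t`, `model_dmpywh`, and so on) are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical lane views for the two operand shapes. */
typedef struct { int32_t w1, w0; } wide2_t;     /* operand B: two 32-bit lanes */
typedef struct { int16_t h1, h0; } half2_t;     /* operand c: two 16-bit lanes */
typedef struct { uint32_t w1, w0; } uwide2_t;   /* unsigned view of operand B  */
typedef struct { uint16_t h1, h0; } uhalf2_t;   /* unsigned view of operand c  */

/* dmpywh: signed 32x16 products; each fits in 48 bits, so int64_t is exact. */
static int64_t model_dmpywh(wide2_t b, half2_t c) {
    return (int64_t)b.w1 * c.h1 + (int64_t)b.w0 * c.h0;
}

/* dmacwh: the same sum of products, added to the previous accumulator. */
static int64_t model_dmacwh(int64_t acc, wide2_t b, half2_t c) {
    return acc + model_dmpywh(b, c);
}

/* dmpywhu: the same pairing with every lane treated as unsigned. */
static uint64_t model_dmpywhu(uwide2_t b, uhalf2_t c) {
    return (uint64_t)b.w1 * c.h1 + (uint64_t)b.w0 * c.h0;
}
```

The signed and unsigned variants differ only in how the lane bits are interpreted before the multiply, which is why the sketch keeps separate signed and unsigned lane types rather than casting inside one function.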

Four-element vectors:

| Mnemonic | OpB type | OpC type | Operation |
|---|---|---|---|
| qmpyh | int64_t | int64_t | A = ACC = (B.h3 * C.h3) + (B.h2 * C.h2) + (B.h1 * C.h1) + (B.h0 * C.h0) |
| qmpyhu | uint64_t | uint64_t | A = ACC = (unsigned)(B.h3 * C.h3) + (unsigned)(B.h2 * C.h2) + (unsigned)(B.h1 * C.h1) + (unsigned)(B.h0 * C.h0) |
| qmach | int64_t | int64_t | A = ACC = ACC + (B.h3 * C.h3) + (B.h2 * C.h2) + (B.h1 * C.h1) + (B.h0 * C.h0) |
| qmachu | uint64_t | uint64_t | A = ACC = ACC + (unsigned)(B.h3 * C.h3) + (unsigned)(B.h2 * C.h2) + (unsigned)(B.h1 * C.h1) + (unsigned)(B.h0 * C.h0) |
| vadd4h | int64_t | int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3, B.h2, B.h1, B.h0) + (C.h3, C.h2, C.h1, C.h0) |
| vsub4h | int64_t | int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3, B.h2, B.h1, B.h0) - (C.h3, C.h2, C.h1, C.h0) |
| vaddsub4h | int64_t | int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3 + C.h3, B.h2 - C.h2, B.h1 + C.h1, B.h0 - C.h0) |
| vsubadd4h | int64_t | int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3 - C.h3, B.h2 + C.h2, B.h1 - C.h1, B.h0 + C.h0) |
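
The quad multiply forms can be read as a four-lane dot product over the 16-bit elements of two 64-bit operands. The C sketch below models `qmpyh` and `qmach` on that reading, assuming all four terms are products and that the accumulator holds the exact sum. The `v4h_t` type and the `model_*` function names are hypothetical, and the array index `i` stands for lane `h<i>`.

```c
#include <stdint.h>

/* Hypothetical view of a 64-bit operand as four 16-bit lanes; h[0] is lane h0. */
typedef struct { int16_t h[4]; } v4h_t;

/* qmpyh: four-lane dot product. Each product fits in 32 bits and the
   sum of four such products fits comfortably in an int64_t. */
static int64_t model_qmpyh(v4h_t b, v4h_t c) {
    int64_t acc = 0;
    for (int i = 0; i < 4; i++) {
        acc += (int64_t)b.h[i] * c.h[i];
    }
    return acc;
}

/* qmach: the same dot product, added to the previous accumulator value. */
static int64_t model_qmach(int64_t acc, v4h_t b, v4h_t c) {
    return acc + model_qmpyh(b, c);
}
```

Chaining `model_qmach` over successive groups of four elements gives a dot product of longer 16-bit arrays, which is the typical reason to prefer the accumulating form inside a loop.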

Two 32-bit element vectors:

| Mnemonic | OpB type | OpC type | Operation |
|---|---|---|---|
| vadd2 | int32_t | int32_t | (A.w1, A.w0) = (B.w1, B.w0) + (C.w1, C.w0) |
| vsub2 | int32_t | int32_t | (A.w1, A.w0) = (B.w1, B.w0) - (C.w1, C.w0) |
| vaddsub | int32_t | int32_t | (A.w1, A.w0) = (B.w1 + B.w0) (C.w1 - C.w0) |
| vsubadd | int32_t | int32_t | (A.w1, A.w0) = (B.w1 - B.w0) (C.w1 + C.w0) |
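
For the 32-bit pair operations, a lane-wise model of `vadd2` and `vsub2` might look like the C sketch below. It assumes plain wraparound (non-saturating) arithmetic, which the table does not state explicitly, and it stores the lanes as `uint32_t` bit patterns so that overflow is well defined in C. The names `pair32_t`, `model_vadd2` and `model_vsub2` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical view of a register pair as two 32-bit lanes (raw bit patterns). */
typedef struct { uint32_t w1, w0; } pair32_t;

/* vadd2: independent add in each lane.
   Assumption: results wrap modulo 2^32; no carry crosses between lanes. */
static pair32_t model_vadd2(pair32_t b, pair32_t c) {
    pair32_t a;
    a.w1 = b.w1 + c.w1;
    a.w0 = b.w0 + c.w0;
    return a;
}

/* vsub2: independent subtract in each lane, with the same wraparound assumption. */
static pair32_t model_vsub2(pair32_t b, pair32_t c) {
    pair32_t a;
    a.w1 = b.w1 - c.w1;
    a.w0 = b.w0 - c.w0;
    return a;
}
```

Because two's-complement addition and subtraction produce the same bit pattern for signed and unsigned operands, the unsigned lanes here also cover the `int32_t` case listed in the table.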