
MPY9 Vector Operations

Shahab edited this page Sep 24, 2021 · 6 revisions

Two 16-bit element vectors:

Mnemonic  | OpB type | OpC type | Operation
----------|----------|----------|------------------
vadd2h    |  int16_t |  int16_t | (a.h1, a.h0) = (b.h1, b.h0) + (c.h1, c.h0)
vsub2h    |  int16_t |  int16_t | (a.h1, a.h0) = (b.h1, b.h0) - (c.h1, c.h0)
dmpyh     |  int16_t |  int16_t | ACC = (b.h1 * c.h1) + (b.h0 * c.h0); a = ACC.w0
dmpyhu    | uint16_t | uint16_t | a = ACC = (b.h1 * c.h1) + (b.h0 * c.h0)
dmach     |  int16_t |  int16_t | a = ACC = ACC + (b.h1 * c.h1) + (b.h0 * c.h0)
dmachu    | uint16_t | uint16_t | a = ACC = ACC + (unsigned)(b.h1 * c.h1) + (unsigned)(b.h0 * c.h0)
vaddsub2h |  int16_t |  int16_t | (a.h1, a.h0) = (b.h1 + c.h1, b.h0 - c.h0)
vsubadd2h |  int16_t |  int16_t | (a.h1, a.h0) = (b.h1 - c.h1, b.h0 + c.h0)
vmpy2h    |  int16_t |  int16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (b.h1, b.h0) * (c.h1, c.h0)
vmpy2hu   | uint16_t | uint16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (unsigned)(b.h1, b.h0) * (unsigned)(c.h1, c.h0)
vmac2h    |  int16_t |  int16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (ACC.w1, ACC.w0) + (b.h1, b.h0) * (c.h1, c.h0)
vmac2hu   | uint16_t | uint16_t | (ACC.w1, ACC.w0) = (A.w1, A.w0) = (ACC.w1, ACC.w0) + (unsigned)(b.h1, b.h0) * (unsigned)(c.h1, c.h0)
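
The dual multiply rows can be read as a two-lane dot product. The following is a minimal reference sketch in C, not the simulator's or the hardware's implementation. It assumes each lane term is a product, that the accumulator is 64 bits wide, and that `vadd2h` wraps around per lane (the table does not say whether it saturates). The function names `dmpyh_model`, `dmach_model`, `vadd2h_model` and the helpers `h0`/`h1` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers: signed 16-bit halves of a 32-bit operand.
 * The unsigned-to-signed conversion is implementation-defined in C,
 * but yields the two's complement value on common compilers. */
static int16_t h0(uint32_t x) { return (int16_t)(uint16_t)(x & 0xFFFFu); }
static int16_t h1(uint32_t x) { return (int16_t)(uint16_t)(x >> 16); }

/* dmpyh (assumed semantics): ACC = b.h1 * c.h1 + b.h0 * c.h0 */
static int64_t dmpyh_model(uint32_t b, uint32_t c)
{
    return (int64_t)h1(b) * h1(c) + (int64_t)h0(b) * h0(c);
}

/* dmach (assumed semantics): same sum, added to the previous ACC */
static int64_t dmach_model(int64_t acc, uint32_t b, uint32_t c)
{
    return acc + dmpyh_model(b, c);
}

/* vadd2h (assumed wraparound): independent add in each 16-bit lane */
static uint32_t vadd2h_model(uint32_t b, uint32_t c)
{
    uint16_t lo = (uint16_t)((b & 0xFFFFu) + (c & 0xFFFFu));
    uint16_t hi = (uint16_t)((b >> 16) + (c >> 16));
    return ((uint32_t)hi << 16) | lo;
}
```

For example, with `b = 0x00030002` (lanes 3, 2) and `c = 0x0005FFFF` (lanes 5, -1), `dmpyh_model` gives 3 * 5 + 2 * (-1) = 13.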

Asymmetric 2 element vector:

Mnemonic  | OpB type | OpC type | Operation
----------|----------|----------|------------------
dmpywh    |  int32_t |  int16_t | A = ACC = (B.w1 * c.h1) + (B.w0 * c.h0)
dmpywhu   | uint32_t | uint16_t | A = ACC = (unsigned)(B.w1 * c.h1) + (unsigned)(B.w0 * c.h0)
dmacwh    |  int32_t |  int16_t | A = ACC = ACC + (B.w1 * c.h1) + (B.w0 * c.h0)
dmacwhu   | uint32_t | uint16_t | A = ACC = ACC + (unsigned)(B.w1 * c.h1) + (unsigned)(B.w0 * c.h0)
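
In the asymmetric forms, each 32-bit word of `B` is paired with the 16-bit half of `c` that has the same index. A hedged C sketch of the signed variants follows. It assumes each term is a product, that `B` is a 64-bit register pair, and that the accumulator is 64 bits wide. `dmpywh_model` and `dmacwh_model` are hypothetical names.

```c
#include <stdint.h>

/* dmpywh (assumed semantics): ACC = B.w1 * c.h1 + B.w0 * c.h0 */
static int64_t dmpywh_model(uint64_t B, uint32_t c)
{
    /* Split B into two signed 32-bit words and c into two signed halves. */
    int32_t w0 = (int32_t)(uint32_t)(B & 0xFFFFFFFFu);
    int32_t w1 = (int32_t)(uint32_t)(B >> 32);
    int16_t c0 = (int16_t)(uint16_t)(c & 0xFFFFu);
    int16_t c1 = (int16_t)(uint16_t)(c >> 16);

    /* Each 32x16 product fits in 48 bits, so the 64-bit sum cannot overflow. */
    return (int64_t)w1 * c1 + (int64_t)w0 * c0;
}

/* dmacwh (assumed semantics): same sum, added to the previous ACC */
static int64_t dmacwh_model(int64_t acc, uint64_t B, uint32_t c)
{
    return acc + dmpywh_model(B, c);
}
```

With `B.w1 = 7`, `B.w0 = -2`, `c.h1 = 3`, `c.h0 = 4`, the model returns 7 * 3 + (-2) * 4 = 13.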

Four element vector:

Mnemonic  | OpB type | OpC type | Operation
----------|----------|----------|------------------
qmpyh     |  int64_t |  int64_t | A = ACC = (B.h3 * C.h3) + (B.h2 * C.h2) + (B.h1 * C.h1) + (B.h0 * C.h0)
qmpyhu    | uint64_t | uint64_t | A = ACC = (unsigned)(B.h3 * C.h3) + (unsigned)(B.h2 * C.h2) + (unsigned)(B.h1 * C.h1) + (unsigned)(B.h0 * C.h0)
qmach     |  int64_t |  int64_t | A = ACC = ACC + (B.h3 * C.h3) + (B.h2 * C.h2) + (B.h1 * C.h1) + (B.h0 * C.h0)
qmachu    | uint64_t | uint64_t | A = ACC = ACC + (unsigned)(B.h3 * C.h3) + (unsigned)(B.h2 * C.h2) + (unsigned)(B.h1 * C.h1) + (unsigned)(B.h0 * C.h0)
vadd4h    |  int64_t |  int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3, B.h2, B.h1, B.h0) + (C.h3, C.h2, C.h1, C.h0)
vsub4h    |  int64_t |  int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3, B.h2, B.h1, B.h0) - (C.h3, C.h2, C.h1, C.h0)
vaddsub4h |  int64_t |  int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3 + C.h3, B.h2 - C.h2, B.h1 + C.h1, B.h0 - C.h0)
vsubadd4h |  int64_t |  int64_t | (A.h3, A.h2, A.h1, A.h0) = (B.h3 - C.h3, B.h2 + C.h2, B.h1 - C.h1, B.h0 + C.h0)
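
The four-element rows extend the same pattern to four 16-bit lanes packed in a 64-bit operand. The C sketch below is an illustrative model only. It assumes each `qmpyh` term is a product, and it follows the `vaddsub4h` lane pattern exactly as the table writes it (add in lanes 3 and 1, subtract in lanes 2 and 0) with per-lane wraparound. `lane`, `qmpyh_model` and `vaddsub4h_model` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: signed 16-bit lane i (0..3) of a 64-bit operand. */
static int16_t lane(uint64_t v, int i)
{
    return (int16_t)(uint16_t)(v >> (16 * i));
}

/* qmpyh (assumed semantics): sum of the four lane products */
static int64_t qmpyh_model(uint64_t B, uint64_t C)
{
    int64_t acc = 0;
    for (int i = 0; i < 4; i++)
        acc += (int64_t)lane(B, i) * lane(C, i);
    return acc;
}

/* vaddsub4h as written in the table: odd lanes add, even lanes subtract */
static uint64_t vaddsub4h_model(uint64_t B, uint64_t C)
{
    uint64_t A = 0;
    for (int i = 0; i < 4; i++) {
        uint16_t b = (uint16_t)(B >> (16 * i));
        uint16_t c = (uint16_t)(C >> (16 * i));
        uint16_t r = (i & 1) ? (uint16_t)(b + c) : (uint16_t)(b - c);
        A |= (uint64_t)r << (16 * i);
    }
    return A;
}
```

With `B` holding lanes (4, 3, 2, 1) and `C` holding (1, 1, 1, 1), `qmpyh_model` returns 10 and `vaddsub4h_model` returns lanes (5, 2, 3, 0).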

Two 32-bit element vectors:

Mnemonic  | OpB type | OpC type | Operation
----------|----------|----------|------------------
vadd2     | int32_t  |  int32_t | (A.w1, A.w0) = (B.w1, B.w0) + (C.w1, C.w0)
vsub2     | int32_t  |  int32_t | (A.w1, A.w0) = (B.w1, B.w0) - (C.w1, C.w0)
vaddsub   | int32_t  |  int32_t | (A.w1, A.w0) = (B.w1 + C.w1, B.w0 - C.w0)
vsubadd   | int32_t  |  int32_t | (A.w1, A.w0) = (B.w1 - C.w1, B.w0 + C.w0)
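
For the 32-bit lane forms, `vadd2` and `vsub2` can be modelled as two independent word operations on a 64-bit register pair. This is a rough C sketch that assumes per-lane wraparound, which the table does not state. `vadd2_model` and `vsub2_model` are hypothetical names.

```c
#include <stdint.h>

/* vadd2 (assumed wraparound): (A.w1, A.w0) = (B.w1 + C.w1, B.w0 + C.w0) */
static uint64_t vadd2_model(uint64_t B, uint64_t C)
{
    uint32_t w0 = (uint32_t)B + (uint32_t)C;
    uint32_t w1 = (uint32_t)(B >> 32) + (uint32_t)(C >> 32);
    return ((uint64_t)w1 << 32) | w0;
}

/* vsub2 (assumed wraparound): (A.w1, A.w0) = (B.w1 - C.w1, B.w0 - C.w0) */
static uint64_t vsub2_model(uint64_t B, uint64_t C)
{
    uint32_t w0 = (uint32_t)B - (uint32_t)C;
    uint32_t w1 = (uint32_t)(B >> 32) - (uint32_t)(C >> 32);
    return ((uint64_t)w1 << 32) | w0;
}
```

Note that a carry out of the low word never reaches the high word: adding 1 to a low word of `0xFFFFFFFF` yields 0 in that lane and leaves the other lane's sum unchanged.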