We have recently added the partial_reduce_smla and partial_reduce_umla
nodes to represent Acc += ext(a) * ext(b), where the two extends must
have the same source type and the same extend kind.
For riscv64 w/zvqdotq, we have the vqdot and vqdotu instructions
which correspond to the existing nodes, but we also have vqdotsu
which represents the case where the two extends are sign and zero
respectively (i.e. not the same type of extend).
This patch adds a partial_reduce_sumla node which has sign
extension for A, and zero extension for B. The addition is somewhat
mechanical, except that it exposes an implementation challenge
because AArch64 doesn't have an analogous instruction (that I've
found).
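
To make the intended semantics concrete, here is a minimal scalar sketch of
what a mixed-extension partial reduction computes. It is not LLVM code: the
function name partialReduceSUMLA, the i8/i32 element types, and the choice to
fold adjacent input lanes into the same accumulator lane are all assumptions
made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of Acc += sext(A) * zext(B).
// Assumption: adjacent input lanes fold into the same accumulator lane.
std::vector<int32_t> partialReduceSUMLA(std::vector<int32_t> Acc,
                                        const std::vector<int8_t> &A,
                                        const std::vector<uint8_t> &B) {
  assert(!Acc.empty() && A.size() == B.size());
  assert(A.size() % Acc.size() == 0);
  const std::size_t Ratio = A.size() / Acc.size();
  for (std::size_t I = 0; I < A.size(); ++I) {
    int32_t SA = static_cast<int32_t>(A[I]); // sign extend the A operand
    int32_t ZB = static_cast<int32_t>(B[I]); // zero extend the B operand
    Acc[I / Ratio] += SA * ZB;
  }
  return Acc;
}
```

The difference from the same-extend nodes shows up whenever B has its top bit
set: a B element of 255 contributes as 255 here, whereas a signed extend of
the same bits would contribute as -1.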
The current legalization table assumes that all of the partial_reduce*mla
variants have the same handling for a given type pair.
Questions to the AArch64 folks:
* Does aarch64 have a good implementation for this that I missed?
* If not, are you okay with my somewhat hacky custom legalization
approach (in this patch)? It does look like there are some small
regressions here, but I haven't dug into why.
* If not, any suggestions on how to structure splitting the legalization
table? I could add the opcode to the table key; that's probably the
easiest.
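
For discussion, the "add the opcode to the table key" option could look
roughly like the sketch below. Everything in it is hypothetical: the Opcode,
VT, and Action enums, the PartialReduceActionTable class, and the choice of
Expand as the default for missing entries are stand-ins for illustration, not
the existing TargetLowering interface.

```cpp
#include <cassert>
#include <map>
#include <tuple>

// Hypothetical stand-ins, not the real LLVM enums.
enum class Opcode { SMLA, UMLA, SUMLA };
enum class VT { v4i32, v16i8 };
enum class Action { Legal, Custom, Expand };

// Action table keyed by (opcode, accumulator type, input type), so the
// variants no longer have to share handling for a given type pair.
class PartialReduceActionTable {
  std::map<std::tuple<Opcode, VT, VT>, Action> Table;

public:
  void set(Opcode Opc, VT AccVT, VT InputVT, Action A) {
    Table[std::make_tuple(Opc, AccVT, InputVT)] = A;
  }

  // Assumption: entries that were never set fall back to Expand.
  Action get(Opcode Opc, VT AccVT, VT InputVT) const {
    auto It = Table.find(std::make_tuple(Opc, AccVT, InputVT));
    return It == Table.end() ? Action::Expand : It->second;
  }
};

// Example: a target with a native same-extend instruction for this type
// pair, but only custom handling for the mixed-extend variant.
void exampleUsage() {
  PartialReduceActionTable T;
  T.set(Opcode::SMLA, VT::v4i32, VT::v16i8, Action::Legal);
  T.set(Opcode::SUMLA, VT::v4i32, VT::v16i8, Action::Custom);
  assert(T.get(Opcode::SMLA, VT::v4i32, VT::v16i8) == Action::Legal);
  assert(T.get(Opcode::SUMLA, VT::v4i32, VT::v16i8) == Action::Custom);
}
```

With a key like this, a target without a mixed-extend instruction could mark
only the sign/zero variant as Custom or Expand while leaving the other two
variants Legal for the same type pair.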