Similar to #169058, but on AArch64. It seems that only the x86 backend consistently optimizes any of its dynamic shuffle intrinsics to shufflevector.
Someone did implement this specifically for 64-bit vectors (int8x8_t), but most people writing NEON code are using 128-bit vectors. The existing optimization also only works for tbl instructions with one source operand and all indices in bounds; there are far more patterns that can be turned into shufflevectors.
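
To illustrate why out-of-bounds indices need not block the transform, here is a rough scalar model in portable C (not the actual InstCombine code). It assumes the one-register 128-bit form of tbl (what vqtbl1q_u8 computes), where an index of 16 or more yields 0. All function names are hypothetical. The idea sketched is that a tbl with constant indices matches a two-input shuffle of the table and a zero vector, with every out-of-range index redirected to a lane of the zero operand.

```c
#include <stdint.h>
#include <string.h>

#define LANES 16

/* Scalar model of a one-register 128-bit TBL (assumed semantics of
   vqtbl1q_u8): each output byte is table[idx], or 0 if idx is out of range. */
static void tbl1q_model(const uint8_t table[LANES], const uint8_t idx[LANES],
                        uint8_t out[LANES]) {
    for (int i = 0; i < LANES; i++)
        out[i] = idx[i] < LANES ? table[idx[i]] : 0;
}

/* Scalar model of a two-input shuffle: mask values 0..15 select from a,
   mask values 16..31 select from b. */
static void shuffle_model(const uint8_t a[LANES], const uint8_t b[LANES],
                          const int mask[LANES], uint8_t out[LANES]) {
    for (int i = 0; i < LANES; i++)
        out[i] = mask[i] < LANES ? a[mask[i]] : b[mask[i] - LANES];
}

/* Hypothetical translation step: constant TBL indices become a shuffle mask
   over (table, zero vector). Out-of-range indices map to lane 0 of the zero
   operand, i.e. mask value LANES. */
static void tbl_indices_to_mask(const uint8_t idx[LANES], int mask[LANES]) {
    for (int i = 0; i < LANES; i++)
        mask[i] = idx[i] < LANES ? idx[i] : LANES;
}

/* Returns 1 if the TBL model and the derived shuffle agree on every lane. */
static int tbl_matches_shuffle(const uint8_t table[LANES],
                               const uint8_t idx[LANES]) {
    const uint8_t zero[LANES] = {0};
    int mask[LANES];
    uint8_t via_tbl[LANES], via_shuffle[LANES];

    tbl1q_model(table, idx, via_tbl);
    tbl_indices_to_mask(idx, mask);
    shuffle_model(table, zero, mask, via_shuffle);
    return memcmp(via_tbl, via_shuffle, LANES) == 0;
}
```

The same mapping should carry over to the multi-register forms by widening the table, though that case is not modeled here.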
I plan to implement this optimization; #169589 is a preparatory step.