Hi, I'm trying to use magi attention as the attention backend in mcore to replace nvte, but it seems that a lot of adaptation is required: cp split, rope, undispatch. The complexity comes from the custom comm pattern that magi adopts, which is designed for specific attn masks. I only need regular varlen + causal mask + swa, so a simple zigzag split would be efficient enough. Is there support for a zigzag dispatch mode, so that I don't need to change the data dispatch and rope parts?
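
To be clear about what I mean by zigzag, here is a minimal sketch of the split I have in mind, assuming the usual scheme where the sequence is cut into `2 * cp_size` equal chunks and rank `r` keeps chunk `r` plus chunk `2 * cp_size - 1 - r`. The function names are made up for illustration and are not taken from magi attention or mcore.

```python
def zigzag_chunk_ids(cp_rank: int, cp_size: int) -> tuple[int, int]:
    """Return the two chunk indices (out of 2 * cp_size) owned by this rank."""
    return cp_rank, 2 * cp_size - 1 - cp_rank


def zigzag_positions(seq_len: int, cp_rank: int, cp_size: int) -> list[int]:
    """Return the global token positions held by cp_rank under a zigzag split.

    Hypothetical helper: assumes seq_len is divisible by 2 * cp_size.
    """
    if seq_len % (2 * cp_size) != 0:
        raise ValueError("seq_len must be divisible by 2 * cp_size")
    chunk_len = seq_len // (2 * cp_size)
    positions: list[int] = []
    for chunk_id in zigzag_chunk_ids(cp_rank, cp_size):
        start = chunk_id * chunk_len
        positions.extend(range(start, start + chunk_len))
    return positions


if __name__ == "__main__":
    # With 16 tokens and cp_size = 4, each rank pairs one early chunk with
    # one late chunk, which balances the work under a causal mask.
    for rank in range(4):
        print(rank, zigzag_positions(16, rank, 4))
```

Since these positions are the same ones the existing cp split produces, the rope position ids and the data dispatch could stay as they are, which is why I'd like to keep this layout.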