feat: Improve MusaShiftedAffineMap fusion operator #148

Merged
tngchien merged 1 commit into MooreThreads:main from welo516:AddFusion on Apr 1, 2026

welo516 (Contributor) commented Apr 1, 2026

  • Add a graph fusion pattern (Match/Apply) that matches: AddV2 ← Mul(AddV2(data, StridedSlice←ReadVariableOp), Select) + AddV2(data, StridedSlice←ReadVariableOp)
  • Add MusaShiftedAffineMap custom Op and MUSA kernel (4x mBinary)
  • Add end-to-end fusion test covering correctness and negative cases
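
As a reading aid, the subgraph in the first bullet can be written out as an unfused, elementwise reference computation. The sketch below is based only on the pattern as stated above and is not the kernel from this PR: the function and argument names are hypothetical, all operands are assumed to be flat and of the same length (no broadcasting), and the Select output is treated as an ordinary numeric multiplier. A later commit mentions a "direct StridedSlice right branch", so the exact output formula should be checked against the header comment in the source.

```python
def shifted_affine_map_reference(data, left_shift, mask, right_shift):
    """Unfused reference for the pattern as described in the PR text:

        out = (data + left_shift) * mask + (data + right_shift)

    All names here are hypothetical; inputs are equal-length flat sequences.
    """
    if not (len(data) == len(left_shift) == len(mask) == len(right_shift)):
        raise ValueError("this sketch assumes same-shape inputs (no broadcasting)")

    out = []
    for x, a, m, b in zip(data, left_shift, mask, right_shift):
        left = x + a         # AddV2(data, StridedSlice <- ReadVariableOp)
        scaled = left * m    # Mul(..., Select)
        right = x + b        # AddV2(data, StridedSlice <- ReadVariableOp)
        out.append(scaled + right)  # outer AddV2
    return out
```

A reference like this is mainly useful in a correctness test: run the unfused computation and the fused op on the same inputs and compare the results within a tolerance.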

refactor: update ShiftedAffineMap to final pattern (direct StridedSlice right branch)

docs: add exact output formula to header comment for ShiftedAffineMap

Fix shifted affine map fusion identity handling

Improve shifted affine map tests

Add single-kernel shifted affine map op

Add shifted affine map benchmark

Optimize MusaShiftedAffineMapOp: add FastPath for same-shape tensors

Add fusion

Fix NameError in fusion test

Format and cleanup MusaShiftedAffineMap Op code

Optimize memory assignment via forward_input_or_allocate_output

Revert zero copy and add musaGetLastError check

tngchien merged commit 204c697 into MooreThreads:main on Apr 1, 2026
1 check passed