Add shim DMA BD reuse tests for >16 task sequences #3026

Merged
erwei-xilinx merged 3 commits into Xilinx:main from erwei-xilinx:erwei/add-shim-dma-bd-reuse-test on Apr 9, 2026

@erwei-xilinx
Collaborator

Summary

  • Add pass-level test proving aie-assign-runtime-sequence-bd-ids accepts >16 tasks per tile when dma_await_task + dma_free_task recycle BD IDs between batches
  • Add an E2E hardware test that fires 20 fire-and-forget MM2S tasks on one shim channel (more than the 16-BD limit), alternating between two source buffers; BD IDs 0-7 are recycled 3 times via periodic await+free
  • Verified on NPU2 hardware: all 20 transfers arrive correctly
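
The recycling idea in the summary above can be illustrated with a toy model. This is only a sketch, not the allocation logic of the aie-assign-runtime-sequence-bd-ids pass: the `BdIdPool` class, the `run_sequence` helper, and the lowest-free-ID-first policy are hypothetical. The only figures taken from the PR are the 16-BD limit, the 20 tasks, and the reuse of BD IDs 0-7, which suggests batches of 8.

```python
class BdIdPool:
    """Toy model of a per-tile pool of buffer descriptor (BD) IDs (hypothetical)."""

    def __init__(self, num_ids=16):
        self.free_ids = list(range(num_ids))  # lowest ID is handed out first
        self.in_use = {}  # task index -> BD ID currently held by that task

    def start_task(self, task):
        if not self.free_ids:
            raise RuntimeError(f"no free BD ID for task {task}")
        bd_id = self.free_ids.pop(0)
        self.in_use[task] = bd_id
        return bd_id

    def free_task(self, task):
        # Return the task's BD ID to the pool and keep the pool sorted.
        self.free_ids.append(self.in_use.pop(task))
        self.free_ids.sort()


def run_sequence(num_tasks, batch_size=None, num_ids=16):
    """Return the BD ID given to each task.

    If batch_size is set, every full batch is freed before the next task
    starts, standing in for dma_await_task followed by dma_free_task.
    """
    pool = BdIdPool(num_ids)
    assigned, pending = [], []
    for task in range(num_tasks):
        assigned.append(pool.start_task(task))
        pending.append(task)
        if batch_size is not None and len(pending) == batch_size:
            for done in pending:
                pool.free_task(done)
            pending.clear()
    return assigned
```

In this model, `run_sequence(20, batch_size=8)` hands out IDs 0-7 for the first two batches and 0-3 for the remaining four tasks, while `run_sequence(20)` without any freeing raises once the 17th task asks for an ID.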

Test plan

  • aie-opt --aie-assign-runtime-sequence-bd-ids good-bd-reuse-multi-buffer.mlir passes (BD IDs reused)
  • aiecc.py compiles shim_dma_bd_reuse/aie.mlir successfully
  • E2E test passes on NPU2 hardware: PASS! (20 MM2S fire-and-forget, BD reuse, 2 src buffers)

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 9, 2026 17:21
@erwei-xilinx erwei-xilinx requested a review from hunhoffe as a code owner April 9, 2026 17:21
Add two tests that verify buffer descriptor ID reuse via
dma_await_task + dma_free_task on shim tiles:

1. Pass-level test (good-bd-reuse-multi-buffer.mlir):
   Verifies the aie-assign-runtime-sequence-bd-ids pass accepts
   20 tasks on one tile when BD IDs are recycled between batches.

2. E2E hardware test (shim_dma_bd_reuse/):
   Fires 20 MM2S tasks on one shim channel (exceeds the 16-BD
   limit), alternating between two source buffers. BD IDs 0-7
   are recycled 3 times via periodic await+free. Core tile has
   a looping DMA passthrough. Host verifies all 20 transfers
   arrive correctly in the output buffer.

   This pattern matches the kv_cache_prefill RoPE LUT scenario
   where VIn sends hundreds of fire-and-forget MM2S tasks from
   two different memrefs on the same channel.
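
As a rough picture of the host-side check described in item 2, the sketch below builds the output expected from 20 transfers that alternate between two source buffers and reports the first mismatch. It is not the code in test.cpp. The function names, the chunk length of 4, the even/odd assignment of buffers, and the assumption that transfer `i` reads chunk `i` of its source are all hypothetical.

```python
def expected_output(src_a, src_b, num_transfers=20, chunk_len=4):
    """Concatenate the chunks the output buffer should contain, in order."""
    expected = []
    for i in range(num_transfers):
        src = src_a if i % 2 == 0 else src_b  # assumption: even transfers use src_a
        offset = i * chunk_len                # assumption: transfer i reads chunk i
        expected.extend(src[offset:offset + chunk_len])
    return expected


def first_mismatch(actual, expected):
    """Return the index of the first differing element, or None if equal."""
    for idx, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            return idx
    if len(actual) != len(expected):
        return min(len(actual), len(expected))
    return None
```

Giving the two sources disjoint value ranges (for example `range(80)` and the same values plus 1000) makes it easy to tell from a mismatch whether a transfer came from the wrong buffer or from the wrong offset.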

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds new regression tests to validate shim DMA BD-ID reuse across runtime-sequence task batches, enabling >16 MM2S tasks per tile by awaiting/freeing tasks to recycle BD IDs.

Changes:

  • Add an E2E NPU XRT hardware test that issues 20 shim MM2S “fire-and-forget” tasks while recycling BD IDs.
  • Add an MLIR pass-level test for --aie-assign-runtime-sequence-bd-ids to prove BD IDs are reused across batches with multiple buffers.
  • Add a lit runner to compile and execute the new E2E test on NPU1/NPU2.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| test/npu-xrt/shim_dma_bd_reuse/test.cpp | Host-side XRT test that runs the 20-transfer BD-reuse scenario and validates output ordering/data. |
| test/npu-xrt/shim_dma_bd_reuse/run.lit | lit script to build xclbin/instructions, compile host, and run on NPU1/NPU2. |
| test/npu-xrt/shim_dma_bd_reuse/aie.mlir | AIE design + runtime sequence implementing shim MM2S batching/await/free for BD reuse and core passthrough. |
| test/bd-chains-and-dma-tasks/assign-runtime-sequence-bd-ids/good-bd-reuse-multi-buffer.mlir | aie-opt FileCheck test asserting BD-ID reuse across >16 tasks on one shim tile. |

erwei-xilinx and others added 2 commits April 9, 2026 10:43
…ds check

- Fix FileCheck patterns to use positional dma_bd format instead of
  non-existent "offset = N" named syntax
- Increase memref sizes from 4096 to 40960 to cover max offset+length
- Fix clang-format alignment on OUT_SIZE comment
- Add std::find_if end-iterator check before dereference
- Fix inaccurate "20 iterations" comment on continuous passthrough
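
The memref-size fix above amounts to checking that the largest offset+length over all tasks fits inside the buffer. A minimal sketch of that check follows. The helper names are hypothetical, and the per-task length of 2048 is a made-up value chosen only so that 20 back-to-back tasks end exactly at 40960; the commit message does not state the actual offsets or lengths.

```python
def max_extent(tasks):
    """Largest offset + length over a list of (offset, length) pairs."""
    return max(offset + length for offset, length in tasks)


def fits(tasks, memref_size):
    """True if every task stays within a memref of memref_size elements."""
    return max_extent(tasks) <= memref_size


# Hypothetical layout: 20 back-to-back tasks of 2048 elements each.
hypothetical_tasks = [(i * 2048, 2048) for i in range(20)]
```

With this hypothetical layout, `fits(hypothetical_tasks, 4096)` is false and `fits(hypothetical_tasks, 40960)` is true, which mirrors the kind of out-of-bounds condition the size increase addresses.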

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@erwei-xilinx erwei-xilinx requested a review from andrej April 9, 2026 17:47
@erwei-xilinx erwei-xilinx added this pull request to the merge queue Apr 9, 2026
Merged via the queue into Xilinx:main with commit 6d86140 Apr 9, 2026
74 of 75 checks passed
@erwei-xilinx erwei-xilinx deleted the erwei/add-shim-dma-bd-reuse-test branch April 9, 2026 19:14