Commit bf50e0a
committed
[NVPTX] Add TMA Bulk Copy intrinsics
PR #96083 added intrinsics for async copy of
'tensor' data using TMA. This PR adds intrinsics
for async copy of bulk data (non-tensor variants)
through TMA, following a similar design.
* These intrinsics optionally support multicast and
cache_hints, as indicated by the boolean arguments
at the end of the intrinsics.
* The backend looks through these flag arguments and
lowers to the appropriate PTX instruction.
* Lit tests are added for all combinations of these
intrinsics in cp-async-bulk.ll.
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst file.
PTX Spec reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-cp-async-bulk
Signed-off-by: Durgadoss R <[email protected]>1 parent 81ae668 commit bf50e0a
File tree
6 files changed
+393
-3
lines changed- llvm
- docs
- include/llvm/IR
- lib/Target/NVPTX
- test/CodeGen/NVPTX
6 files changed
+393
-3
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
465 | 465 | | |
466 | 466 | | |
467 | 467 | | |
| 468 | + | |
| 469 | + | |
| 470 | + | |
| 471 | + | |
| 472 | + | |
| 473 | + | |
| 474 | + | |
| 475 | + | |
| 476 | + | |
| 477 | + | |
| 478 | + | |
| 479 | + | |
| 480 | + | |
| 481 | + | |
| 482 | + | |
| 483 | + | |
| 484 | + | |
| 485 | + | |
| 486 | + | |
| 487 | + | |
| 488 | + | |
| 489 | + | |
| 490 | + | |
| 491 | + | |
| 492 | + | |
| 493 | + | |
| 494 | + | |
| 495 | + | |
| 496 | + | |
| 497 | + | |
| 498 | + | |
| 499 | + | |
| 500 | + | |
| 501 | + | |
| 502 | + | |
| 503 | + | |
| 504 | + | |
| 505 | + | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
| 514 | + | |
| 515 | + | |
| 516 | + | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
| 525 | + | |
| 526 | + | |
| 527 | + | |
| 528 | + | |
| 529 | + | |
| 530 | + | |
| 531 | + | |
| 532 | + | |
| 533 | + | |
| 534 | + | |
| 535 | + | |
| 536 | + | |
| 537 | + | |
| 538 | + | |
| 539 | + | |
| 540 | + | |
| 541 | + | |
| 542 | + | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
| 546 | + | |
| 547 | + | |
| 548 | + | |
| 549 | + | |
| 550 | + | |
| 551 | + | |
| 552 | + | |
| 553 | + | |
| 554 | + | |
| 555 | + | |
468 | 556 | | |
469 | 557 | | |
470 | 558 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4980 | 4980 | | |
4981 | 4981 | | |
4982 | 4982 | | |
| 4983 | + | |
| 4984 | + | |
| 4985 | + | |
| 4986 | + | |
| 4987 | + | |
| 4988 | + | |
| 4989 | + | |
| 4990 | + | |
| 4991 | + | |
| 4992 | + | |
| 4993 | + | |
| 4994 | + | |
| 4995 | + | |
| 4996 | + | |
| 4997 | + | |
| 4998 | + | |
| 4999 | + | |
| 5000 | + | |
| 5001 | + | |
| 5002 | + | |
| 5003 | + | |
| 5004 | + | |
| 5005 | + | |
| 5006 | + | |
| 5007 | + | |
| 5008 | + | |
| 5009 | + | |
| 5010 | + | |
| 5011 | + | |
| 5012 | + | |
| 5013 | + | |
| 5014 | + | |
| 5015 | + | |
| 5016 | + | |
| 5017 | + | |
| 5018 | + | |
| 5019 | + | |
| 5020 | + | |
| 5021 | + | |
| 5022 | + | |
| 5023 | + | |
| 5024 | + | |
| 5025 | + | |
4983 | 5026 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3024 | 3024 | | |
3025 | 3025 | | |
3026 | 3026 | | |
| 3027 | + | |
| 3028 | + | |
| 3029 | + | |
| 3030 | + | |
| 3031 | + | |
| 3032 | + | |
| 3033 | + | |
| 3034 | + | |
| 3035 | + | |
| 3036 | + | |
| 3037 | + | |
| 3038 | + | |
| 3039 | + | |
| 3040 | + | |
| 3041 | + | |
| 3042 | + | |
| 3043 | + | |
| 3044 | + | |
| 3045 | + | |
| 3046 | + | |
| 3047 | + | |
| 3048 | + | |
| 3049 | + | |
| 3050 | + | |
| 3051 | + | |
| 3052 | + | |
| 3053 | + | |
| 3054 | + | |
| 3055 | + | |
| 3056 | + | |
| 3057 | + | |
| 3058 | + | |
| 3059 | + | |
| 3060 | + | |
| 3061 | + | |
| 3062 | + | |
| 3063 | + | |
| 3064 | + | |
| 3065 | + | |
| 3066 | + | |
| 3067 | + | |
| 3068 | + | |
| 3069 | + | |
| 3070 | + | |
| 3071 | + | |
| 3072 | + | |
| 3073 | + | |
| 3074 | + | |
| 3075 | + | |
| 3076 | + | |
| 3077 | + | |
| 3078 | + | |
| 3079 | + | |
| 3080 | + | |
| 3081 | + | |
| 3082 | + | |
| 3083 | + | |
| 3084 | + | |
| 3085 | + | |
| 3086 | + | |
| 3087 | + | |
| 3088 | + | |
| 3089 | + | |
| 3090 | + | |
| 3091 | + | |
| 3092 | + | |
| 3093 | + | |
| 3094 | + | |
| 3095 | + | |
| 3096 | + | |
| 3097 | + | |
3027 | 3098 | | |
3028 | 3099 | | |
3029 | 3100 | | |
3030 | 3101 | | |
3031 | 3102 | | |
3032 | 3103 | | |
3033 | 3104 | | |
| 3105 | + | |
| 3106 | + | |
| 3107 | + | |
| 3108 | + | |
| 3109 | + | |
| 3110 | + | |
3034 | 3111 | | |
3035 | 3112 | | |
3036 | 3113 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
90 | 90 | | |
91 | 91 | | |
92 | 92 | | |
| 93 | + | |
| 94 | + | |
93 | 95 | | |
94 | 96 | | |
95 | 97 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
498 | 498 | | |
499 | 499 | | |
500 | 500 | | |
501 | | - | |
502 | | - | |
503 | | - | |
| 501 | + | |
| 502 | + | |
| 503 | + | |
| 504 | + | |
| 505 | + | |
| 506 | + | |
| 507 | + | |
| 508 | + | |
| 509 | + | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
| 514 | + | |
| 515 | + | |
| 516 | + | |
| 517 | + | |
| 518 | + | |
| 519 | + | |
| 520 | + | |
| 521 | + | |
| 522 | + | |
| 523 | + | |
| 524 | + | |
| 525 | + | |
| 526 | + | |
| 527 | + | |
| 528 | + | |
| 529 | + | |
| 530 | + | |
| 531 | + | |
| 532 | + | |
| 533 | + | |
| 534 | + | |
| 535 | + | |
| 536 | + | |
| 537 | + | |
| 538 | + | |
| 539 | + | |
| 540 | + | |
| 541 | + | |
| 542 | + | |
| 543 | + | |
| 544 | + | |
| 545 | + | |
| 546 | + | |
| 547 | + | |
| 548 | + | |
| 549 | + | |
| 550 | + | |
| 551 | + | |
| 552 | + | |
| 553 | + | |
| 554 | + | |
| 555 | + | |
| 556 | + | |
| 557 | + | |
| 558 | + | |
| 559 | + | |
| 560 | + | |
| 561 | + | |
| 562 | + | |
| 563 | + | |
| 564 | + | |
| 565 | + | |
504 | 566 | | |
505 | 567 | | |
506 | 568 | | |
| |||
0 commit comments