Commit 21234ed
[Codegen][GPU] Infer workgroup size multiples from producers and consumers (#19804)
This PR adds new logic in ConfigUtils.cpp that analyzes a dispatch and
determines the required multiples of the workgroup tile sizes for the
root operation. It affects dispatches that contain tensor.pack or
tensor.unpack ops: these ops can only be fused into the workgroup
scf.forall loop if the workgroup tile sizes are a multiple of their
inner_tiles. The following example of a GPU set_encoding dispatch
illustrates the new constraint imposed by this PR:
```mlir
%in = flow.dispatch.tensor.load ... -> tensor<256x64xi8>
%pack = tensor.pack %in ... inner_tiles = [128, 64] ... tensor<256x64xi8> -> tensor<2x1x128x64xi8>
%expanded = tensor.expand_shape %pack [[0], [1], [2, 3, 4], [5, 6, 7]]
: tensor<2x1x128x64xi8> into tensor<2x1x4x8x4x2x4x8xi8>
// linalg.transpose is the root op. Its workgroup tile sizes must be a whole
// multiple of the tensor.pack inner_tiles.
%transposed = linalg.transpose
ins(%expanded : tensor<2x1x4x8x4x2x4x8xi8>)
outs(%empty : tensor<2x1x8x4x4x4x2x8xi8>)
permutation = [0, 1, 3, 6, 2, 4, 5, 7]
flow.dispatch.tensor.store %transposed
```
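As a rough illustration of the constraint described above (not code from this PR), the divisibility requirement can be written as a small check. The helper name `tiles_satisfy_multiples` is hypothetical, and the tile sizes passed to it are made-up candidates; only the `inner_tiles` values come from the example.

```python
def tiles_satisfy_multiples(tile_sizes, required_multiples):
    """Return True if each tile size is a positive multiple of its requirement."""
    if len(tile_sizes) != len(required_multiples):
        raise ValueError("rank mismatch between tile sizes and multiples")
    return all(
        tile > 0 and tile % multiple == 0
        for tile, multiple in zip(tile_sizes, required_multiples)
    )


# inner_tiles = [128, 64] is taken from the tensor.pack in the example above.
inner_tiles = [128, 64]

# A tile that covers whole inner tiles satisfies the constraint.
whole_tiles = tiles_satisfy_multiples([128, 64], inner_tiles)

# A tile that would split a 128-element inner tile does not.
split_tile = tiles_satisfy_multiples([64, 64], inner_tiles)
```

Here `whole_tiles` is true and `split_tile` is false, which is the distinction the new analysis has to surface to the root op.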
Since the linalg.transpose is the root op, it needs to be aware of its
producer chain when selecting tile sizes. With this PR, the lowering
config selection logic walks the producers until it hits an unsupported
operation or a block argument, and takes the least common multiple (LCM)
of any pack or unpack tiles along the dimensions of their inner_tiles.
For the above example, this looks like the following:
1. Walk the producer chain up to the producer of `tensor.pack`, and stop
at the `flow.dispatch.tensor.load`. The initial workgroup tile size
multiples are `[1, 1]` (i.e., no constraint for unsupported ops).
2. The workgroup tile size multiples are propagated through the
`tensor.pack`, which updates them to `[1, 1, 128, 64]`.
3. They are then propagated through the `tensor.expand_shape`, which
expands the workgroup tile size multiples if possible. In this case,
they are expanded to `[1, 1, 4, 8, 4, 2, 4, 8]`.
4. Now walk the consumer chain to find the multiples for the workgroup
tile slice of the root op result. In this case, the propagation simply
stops at the `flow.dispatch.tensor.store`, and the multiples are
`[1, 1, 1, ...]`.
5. The root op now has the required workgroup tile size multiples for
its operand and result slices. The multiples for the iteration space of
the op are computed from the indexing maps of the operation, by taking,
for each iteration dimension, the LCM of that dimension's multiples
across all operands and results. In this case the final workgroup tile
size multiples become `[1, 1, 8, 4, 4, 4, 2, 8]`.
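
The five steps above can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not the logic in ConfigUtils.cpp: all function names are hypothetical, the pack step simply keeps the incoming multiples on the outer dimensions (they are all 1 in the example), and the expand_shape step guesses at what "if possible" means by splitting each multiple from the innermost expanded dimension outward and giving up when it does not divide cleanly.

```python
from math import lcm


def propagate_pack(multiples, inner_tiles):
    # Assumption: outer dims keep the incoming multiples; the appended inner
    # dims require whole inner tiles.
    return list(multiples) + list(inner_tiles)


def propagate_expand_shape(multiples, reassociation, expanded_shape):
    # Split each source multiple across its group of expanded dims,
    # starting from the innermost dim of the group.
    result = [1] * len(expanded_shape)
    for src_dim, group in enumerate(reassociation):
        remaining = multiples[src_dim]
        for dim in reversed(group):
            size = expanded_shape[dim]
            if remaining % size == 0:
                result[dim] = size          # the whole dim is required
                remaining //= size
            elif size % remaining == 0:
                result[dim] = remaining     # the rest fits inside this dim
                remaining = 1
            else:
                raise ValueError("multiple cannot be expanded in this sketch")
        if remaining != 1:
            raise ValueError("multiple exceeds the expanded dims")
    return result


def transpose_iteration_multiples(input_multiples, result_multiples, permutation):
    # For linalg.transpose, result dim i reads input dim permutation[i].
    # Iteration dims are taken to match the result dims.
    return [
        lcm(input_multiples[src], result_multiples[i])
        for i, src in enumerate(permutation)
    ]


# Steps 1-3: producer chain of the example dispatch.
after_load = [1, 1]
after_pack = propagate_pack(after_load, [128, 64])
after_expand = propagate_expand_shape(
    after_pack,
    [[0], [1], [2, 3, 4], [5, 6, 7]],
    [2, 1, 4, 8, 4, 2, 4, 8],
)

# Step 4: the consumer is the store, so the result has no constraint.
result_multiples = [1] * 8

# Step 5: combine operand and result multiples per iteration dimension.
final = transpose_iteration_multiples(
    after_expand, result_multiples, [0, 1, 3, 6, 2, 4, 5, 7]
)
print(final)  # [1, 1, 8, 4, 4, 4, 2, 8]
```

The intermediate values `after_pack` and `after_expand` match the multiples listed in steps 2 and 3, and `final` matches the result in step 5.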
---------
Signed-off-by: Max Dawkins <[email protected]>
1 parent e4c683f commit 21234ed
File tree
9 files changed: +669 -51 lines changed
- compiler/src/iree/compiler/Codegen
- Common
- Dialect/GPU/TargetUtils
- LLVMGPU
- test/ROCDL