Commit 205eedf
msftbot[bot]
Loop codegen improvements (#3632)
## PR Type
What kind of change does this PR introduce?
- Optimization
## What is the current behavior?
Some `for` loops have suboptimal codegen involving the indexing at each iteration.
For instance, as a very simple test, this method simply sets all items in an input `Span<int>` to `0`:
```csharp
public static void M1(Span<int> span)
{
    ref int r0 = ref MemoryMarshal.GetReference(span);
    int length = span.Length;

    for (int i = 0; i < length; i++)
    {
        Unsafe.Add(ref r0, i) = 0;
    }
}
```
```asm
C.M1(System.Span`1<Int32>)
L0000: mov rax, [rcx]
L0003: mov edx, [rcx+8]
L0006: xor ecx, ecx
L0008: test edx, edx
L000a: jle short L001c
L000c: movsxd r8, ecx
L000f: xor r9d, r9d
L0012: mov [rax+r8*4], r9d
L0016: inc ecx
L0018: cmp ecx, edx
L001a: jl short L000c
L001c: ret
```
Here the loop starts at `L000c`, and at every iteration it takes the loop counter, sign-extends it to a native int (`movsxd r8, ecx`), uses it to index from the initial reference (the `[rax+r8*4]` address calculation), and then writes to that location. This per-iteration indexing is unnecessary in cases such as this.
## What is the new behavior?
Refactored some loops to operate within a target address range, with all indexing out of the loop body.
```csharp
public static void M2(Span<int> span)
{
    ref int r0 = ref MemoryMarshal.GetReference(span);
    ref int r1 = ref Unsafe.Add(ref r0, span.Length);

    while (Unsafe.IsAddressLessThan(ref r0, ref r1))
    {
        r0 = 0;
        r0 = ref Unsafe.Add(ref r0, 1);
    }
}
```
```asm
C.M2(System.Span`1<Int32>)
L0000: mov rax, [rcx]
L0003: mov edx, [rcx+8]
L0006: movsxd rdx, edx
L0009: lea rdx, [rax+rdx*4]
L000d: cmp rax, rdx
L0010: jae short L001f
L0012: xor ecx, ecx
L0014: mov [rax], ecx
L0016: add rax, 4
L001a: cmp rax, rdx
L001d: jb short L0012
L001f: ret
```
Here instead we pre-calculate the target address just once, outside the loop (the `movsxd`/`lea` pair at `L0006` and `L0009`), and then iterate until the moving reference reaches that point. This makes the loop body more compact, with no indexing logic needed: we write directly through the moving reference, and then increment it by a fixed amount (`add rax, 4`) at the end of each iteration. Not groundbreaking, but better 🚀
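The same address-range shape also works for loops that read rather than write. As a minimal sketch (the `LoopSketch` class and `Sum` method are hypothetical names for illustration, not code from this PR), this sums a span by walking a moving reference up to an end reference computed once before the loop:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class LoopSketch
{
    // Hypothetical helper, not part of the PR: sums all items in a span
    // without indexing inside the loop body.
    public static int Sum(ReadOnlySpan<int> span)
    {
        // Moving reference, starting at the first item.
        ref int r0 = ref MemoryMarshal.GetReference(span);

        // End reference (one past the last item), computed once up front.
        ref int end = ref Unsafe.Add(ref r0, span.Length);

        int sum = 0;

        while (Unsafe.IsAddressLessThan(ref r0, ref end))
        {
            sum += r0;                       // read through the moving reference
            r0 = ref Unsafe.Add(ref r0, 1);  // advance by one item
        }

        return sum;
    }
}
```

For an empty span the two references are equal, so the loop body never runs and the method returns `0`. Whether the JIT produces the same compact loop for a given body is something to confirm by inspecting the generated assembly, as done above for `M2`.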
## PR Checklist
Please check if your PR fulfills the following requirements:
- [X] Tested code with current [supported SDKs](../readme.md#supported)
- [ ] ~~Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->~~
- [ ] ~~Sample in sample app has been added / updated (for bug fixes / features)~~
- [ ] ~~Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)~~
- [X] Tests for the changes have been added (for bug fixes / features) (if applicable)
- [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*)
- [X] Contains **NO** breaking changes

6 files changed (+74, -42)

File tree:
- Microsoft.Toolkit.HighPerformance
- Helpers
- Memory

Per-file line changes (file names and diff contents not captured):
- 5 additions & 3 deletions
- 5 additions & 4 deletions
- 5 additions & 3 deletions
- 5 additions & 4 deletions
- 21 additions & 11 deletions