Improve performance of null coalescing assignment to array elements #8701
Replies: 3 comments
-
Codegen optimizations should be opened on Roslyn, not csharplang.
Unfortunately, that isn't what the benchmark shows. Accounting for error, it shows that the effect of the proposed change is statistically insignificant. We'd likely need more concrete evidence to make a change here.
-
Note there is a behavioral difference because of array covariance:
    object?[] arr = new string[] { "abc" };
    arr[0] ??= new object(); // No exception, the setter won't ever run

    object?[] arr = new string[] { "abc" };
    ref object? o = ref arr[0]; // ArrayTypeMismatchException is thrown here
    o ??= new object();
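To make the difference easy to reproduce, here is a minimal sketch that wraps both forms in a try/catch and reports whether each one throws. The class and method names (CovarianceDemo, PlainFormThrows, RefLocalFormThrows) are made up for illustration and are not from the thread.

```csharp
#nullable enable
using System;

public static class CovarianceDemo
{
    // Plain form: the element is non-null, so it is only read and never
    // stored to; no element-type check is expected to run.
    public static bool PlainFormThrows(object?[] arr)
    {
        try
        {
            arr[0] ??= new object();
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    // Ref-local form: taking a writable reference to the element makes the
    // runtime check the array's actual element type up front.
    public static bool RefLocalFormThrows(object?[] arr)
    {
        try
        {
            ref object? o = ref arr[0];
            o ??= new object();
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        // Both calls pass a string[] through an object?[] parameter.
        Console.WriteLine(PlainFormThrows(new string[] { "abc" }));
        Console.WriteLine(RefLocalFormThrows(new string[] { "abc" }));
    }
}
```

Passing a genuine object?[] (for example, new object?[] { "abc" }) to RefLocalFormThrows should not throw, since the element type then matches exactly.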
-
Ahh, array covariance, the gift that keeps on giving. I always forget about that. Upon further research, I found that C# does, in fact, use element references already when it can, specifically when the reference type is sealed (i.e. …). Thanks for pointing out this flaw in my logic; I should have known the compiler guys were already on top of this. Also, next time I'll be sure to post stuff like this to the Roslyn repo instead of csharplang.
-
The null coalescing assignment operator ??=, when applied to array elements, can in some cases have its performance improved. Specifically, this applies when there are no thread-safety concerns for the array itself, such as when the array is stored in a local or in a readonly field that the compiler would already cache locally. In this case, we can avoid indexing into the array twice by lowering the compound assignment operator to use a ref local.
Note that this approach is already applied to Span<T> element null coalescing assignment today, just not to arrays.
For example:
Currently lowers to:
Which produces the following JIT output:
However, when written using a ref local, the duplicate index into the array is avoided.
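The original snippets did not survive on this page, so the following is only a rough sketch of the two shapes being compared, not the code from the post. The names (CoalesceSketch, GetOrAdd, GetOrAddViaRef) are hypothetical, and object is used as the element type because it is not sealed.

```csharp
#nullable enable
using System;

public static class CoalesceSketch
{
    // Plain form: conceptually a read of arr[i] followed, when the element
    // is null, by a separate store to arr[i].
    public static object GetOrAdd(object?[] arr, int i)
    {
        return arr[i] ??= new object();
    }

    // Hand-written ref-local form: the element is located once and then
    // read and written through the reference. Assumes the array really is
    // an object?[] and not a covariant view of a more derived array type.
    public static object GetOrAddViaRef(object?[] arr, int i)
    {
        ref object? slot = ref arr[i];
        return slot ??= new object();
    }

    public static void Main()
    {
        var arr = new object?[4];
        object first = GetOrAdd(arr, 0);
        object second = GetOrAddViaRef(arr, 1);

        // Each form stores the new instance back into its slot.
        Console.WriteLine(ReferenceEquals(first, arr[0]));
        Console.WriteLine(ReferenceEquals(second, arr[1]));
    }
}
```

Whether the second form is actually faster depends on the JIT and on the runtime element-type check a writable element reference requires, so it should be measured rather than assumed.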
This produces the following JIT output which, to my understanding, is semantically equivalent with fewer instructions.
I think this should not apply in the following cases:
My (admittedly very rough) microbenchmark over a 20 million element array of nulls shows a modest improvement in throughput: