Improve performance of null coalescing assignment to array elements #8701

brantburnett · 2024-11-29T17:07:35Z

The null-coalescing assignment operator ??=, when applied to array elements, could be made faster in some cases: specifically, when there are no thread-safety concerns for the array reference itself, such as when the array is stored in a local or in a readonly field that the compiler would already cache in a local. In these cases, we can avoid indexing into the array twice by lowering the compound assignment to use a ref local.

Note that this approach is already applied to Span<T> element null coalescing assignment today, just not to arrays.

For example:

public class C {
    private readonly object?[] _field = new object?[100];
    
    public void EnsureInitialized(int index) {
        _field[index] ??= new object();
    }
}

Currently lowers to:

public void EnsureInitialized(int index)
{
    object[] field = _field;
    if (field[index] == null)
    {
        field[index] = new object();
    }
}

Which produces the following JIT output:

C.EnsureInitialized(Int32)
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push edi
    L0004: push esi
    L0005: mov esi, edx
    L0007: mov edi, [ecx+4]
    L000a: cmp esi, [edi+4]
    L000d: jae short L002e
    L000f: cmp dword ptr [edi+esi*4+8], 0
    L0014: jne short L002a
    L0016: mov ecx, 0x7214464
    L001b: call 0x0721300c
    L0020: push eax
    L0021: mov edx, esi
    L0023: mov ecx, edi
    L0025: call System.Runtime.CompilerServices.CastHelpers.StelemRef(System.Object[], IntPtr, System.Object)
    L002a: pop esi
    L002b: pop edi
    L002c: pop ebp
    L002d: ret
    L002e: call Internal.Runtime.CompilerHelpers.ThrowHelpers.ThrowIndexOutOfRangeException()
    L0033: int3

However, when written with a ref local, the duplicate index into the array is avoided:

public class C {
    private readonly object?[] _field = new object?[100];
    
    public void EnsureInitialized(int index) {
        ref object? element = ref _field[index];
        element ??= new object();
    }
}

This produces the following JIT output which, to my understanding, is semantically equivalent with fewer instructions.

C.EnsureInitialized(Int32)
    L0000: push esi
    L0001: mov ecx, [ecx+4]
    L0004: push 0x7214464
    L0009: call System.Runtime.CompilerServices.CastHelpers.LdelemaRef(System.Object[], IntPtr, Void*)
    L000e: mov esi, eax
    L0010: cmp dword ptr [esi], 0
    L0013: jne short L0026
    L0015: mov ecx, 0x7214464
    L001a: call 0x0721300c
    L001f: mov edx, esi
    L0021: call 0x681e95a4
    L0026: pop esi
    L0027: ret

I think this should not apply in the following cases:

- Cases where the array reference could be mutated by another thread. Basically, the same scenarios that control whether the array reference is cached locally in the method today. If it's safe to temporarily store an array reference, it should be equally safe to temporarily store an element reference.
- Indexer properties. The change should apply only to array indexing.

My (admittedly very rough) microbenchmark over a 20 million element array of nulls shows a modest improvement in throughput:

BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.2314)
Unknown processor
.NET SDK 9.0.100
  [Host]     : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2
  Job-RCRLJA : .NET 9.0.0 (9.0.24.52809), X64 RyuJIT AVX2

InvocationCount=1  UnrollFactor=1

| Method   | Mean     | Error    | StdDev   | Ratio | RatioSD | Code Size |
|--------- |---------:|---------:|---------:|------:|--------:|----------:|
| Simple   | 735.9 ms | 14.43 ms | 22.47 ms |  1.00 |    0.04 |     363 B |
| RefLocal | 719.8 ms | 14.16 ms | 28.28 ms |  0.98 |    0.05 |     291 B |

333fred (Maintainer) · 2024-11-29T17:23:38Z

Codegen optimizations should be opened on Roslyn, not csharplang.

> My (admittedly very rough) microbenchmark over a 20 million element array of nulls shows a modest improvement in throughput:

Unfortunately, that isn't what it shows. Accounting for error, it shows that the proposed change is statistically insignificant. We'd likely need more concrete evidence to make a change here.
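
As a rough illustration of that point: BenchmarkDotNet's Error column is half of the 99.9% confidence interval, so each mean can be read as Mean ± Error. The sketch below checks whether the two intervals from the table overlap. The class and method names are made up for this example, and interval overlap is only a heuristic, not a formal significance test.

```csharp
using System;

public static class IntervalCheck
{
    // True when [meanA ± errA] and [meanB ± errB] share at least one point.
    public static bool Overlaps(double meanA, double errA, double meanB, double errB)
        => meanA - errA <= meanB + errB && meanB - errB <= meanA + errA;

    public static void Main()
    {
        // Mean and Error values copied from the results table (milliseconds).
        // Simple:   735.9 ± 14.43 -> [721.47, 750.33]
        // RefLocal: 719.8 ± 14.16 -> [705.64, 733.96]
        Console.WriteLine(Overlaps(735.9, 14.43, 719.8, 14.16)); // True
    }
}
```

The 16.1 ms gap between the means is also smaller than either StdDev (22.47 ms and 28.28 ms).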

huoyaoyuan · 2024-11-29T18:22:54Z

Note there is a behavioral difference because of array covariance:

object?[] arr = new string[] { "abc" };
arr[0] ??= new object(); // No exception, the setter won't ever run

object?[] arr = new string[] { "abc" };
ref object? o = ref arr[0]; // ArrayTypeMismatchException is thrown here
o ??= new object();
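
To make the difference observable, here is a minimal self-contained sketch that wraps both forms in a try/catch. The class and method names are invented for this illustration; the first method spells out the current lowering by hand, and the second uses the ref local from the proposal.

```csharp
#nullable enable
using System;

public static class CovarianceDemo
{
    // Current lowering: separate element load and conditional element store.
    public static bool IndexTwiceThrows(object?[] arr)
    {
        try
        {
            if (arr[0] == null)
            {
                arr[0] = new object(); // store is type-checked only if it is reached
            }
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    // Proposed lowering: take a writable reference to the element first.
    public static bool RefLocalThrows(object?[] arr)
    {
        try
        {
            ref object? element = ref arr[0]; // address-of is type-checked unconditionally
            element ??= new object();
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object?[] covariant = new string[] { "abc" };
        Console.WriteLine(IndexTwiceThrows(covariant)); // False
        Console.WriteLine(RefLocalThrows(covariant));   // True
    }
}
```

With a plain object?[] neither form throws. The two only diverge when the runtime array type is more derived than the static element type and the element is already non-null.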

brantburnett (Author) · 2024-11-30T14:12:12Z

@huoyaoyuan

Ahh, array covariance, the gift that keeps on giving. I always forget about that. Upon further research, I found that C# does, in fact, already use element references when it can: specifically, when the element type is a sealed reference type (e.g. string) or a nullable value type (e.g. int?). It only uses separate index operations for the get and the set when dealing with an unsealed reference type, such as object in my example above, presumably because of the array covariance mentioned previously.
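
For the sealed case, a hand-written sketch of the element-reference form looks like the following. The names are hypothetical and this is not a dump of what the compiler emits; it only shows why the ref local is safe here: because string is sealed, a string?[] variable can only ever refer to an actual string[], so taking the element address cannot hit a covariance mismatch.

```csharp
#nullable enable
using System;

public static class SealedElementDemo
{
    // Hand-written equivalent of `slots[index] ??= value` for a sealed element type.
    public static string EnsureInitialized(string?[] slots, int index, string value)
    {
        ref string? slot = ref slots[index]; // single bounds-checked index operation
        slot ??= value;                      // assigns only when the element is null
        return slot;
    }

    public static void Main()
    {
        string?[] slots = new string?[2];
        Console.WriteLine(EnsureInitialized(slots, 0, "first"));  // first
        Console.WriteLine(EnsureInitialized(slots, 0, "second")); // first
    }
}
```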

Thanks for pointing out this flaw in my logic. I should have known the compiler guys were already on top of this. Also, next time I'll be sure to post stuff like this to the Roslyn repo instead of csharplang.
