Skip to content

Commit d434f43

Browse files
author
msftbot[bot]
authored
Fixed ArrayPoolBufferWriter<T> repeated new[] allocations (#3524)
## PR Type What kind of change does this PR introduce? <!-- Please uncomment one or more that apply to this PR. --> - Optimization <!-- - Bugfix --> <!-- - Feature --> <!-- - Code style update (formatting) --> <!-- - Refactoring (no functional changes, no api changes) --> <!-- - Build or CI related changes --> <!-- - Documentation content changes --> <!-- - Sample app changes --> <!-- - Other... Please describe: --> ## What is the current behavior? <!-- Please describe the current behavior that you are modifying, or link to a relevant issue. --> The `ArrayPoolBufferWriter<T>` uses the given `ArrayPool<T>` instance to resize its internal buffer when needed, which only works as expected when the array itself is small. The issue is that the `ArrayPool<T>.Shared` instance has an internal threshold set to `1024 * 1024`, over which it just allocates new arrays every time to avoid keeping very large arrays alive for a long time. That is perfectly fine, except for one little detail: once you get past that threshold, `ArrayPool<T>.Shared` **stops rounding up the requested size**. It instead returns an array with `new[]` of **exactly the requested size**, which absolutely kills the performance when used in a writer type like `ArrayPoolBufferWriter<T>`: this means that as soon as we get past tha threshold, we'll basically end up resizing the whole array for every single new write operation, no matter how large it is. That's super bad for performance and memory usage 🥺 ## What is the new behavior? <!-- Describe how was this issue resolved or changed? --> The solution for this is pretty simple, this PR includes a simple check for the requested size, and if that's over `1024 * 1024` it just rounds that up to the closest power of 2, so that the array size will effectively just keep being multiplied by 2 every time. This has a **huge** performance impact when eg. trying to use the `ArrayPoolBufferWriter<T>` class to write a 10MB buffer, 8KB at a time: | Method | Mean | Error | StdDev | Ratio | Gen 0 | Gen 1 | Gen 2 | Allocated | |-------------------------- |-------------:|-----------:|-----------:|------:|------------:|------------:|------------:|-----------:| | Before | 1,380.236 ms | 19.9806 ms | 17.7122 ms | 1.000 | 355000.0000 | 355000.0000 | 355000.0000 | 5754.44 MB | | After | **8.820 ms** | 0.1757 ms | 0.4838 ms | **0.006** | 640.6250 | 640.6250 | 640.6250 | **30 MB** | In this simple benchmark alone, the updated version is **156x** faster and uses **190x less memory** 😄 Of course, results will vary a lot on the specific workload, but you can imagine the impact being even more dramatic when working with larger buffers, or with less items being written at any given time. With this change in general, users will not have to worry about the size of the data being written, and the class will automatically use the right approach in all cases. ## PR Checklist Please check if your PR fulfills the following requirements: - [X] Tested code with current [supported SDKs](../readme.md#supported) - [ ] ~~Pull Request has been submitted to the documentation repository [instructions](..\contributing.md#docs). Link: <!-- docs PR link -->~~ - [ ] ~~Sample in sample app has been added / updated (for bug fixes / features)~~ - [ ] ~~Icon has been created (if new sample) following the [Thumbnail Style Guide and templates](https://github.com/windows-toolkit/WindowsCommunityToolkit-design-assets)~~ - [X] Tests for the changes have been added (for bug fixes / features) (if applicable) - [X] Header has been added to all new source files (run *build/UpdateHeaders.bat*) - [X] Contains **NO** breaking changes
2 parents 846afb2 + 0aff126 commit d434f43

File tree

5 files changed

+120
-36
lines changed

5 files changed

+120
-36
lines changed

Microsoft.Toolkit.HighPerformance/Buffers/ArrayPoolBufferWriter{T}.cs

Lines changed: 30 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@
1010
using System.Runtime.InteropServices;
1111
using Microsoft.Toolkit.HighPerformance.Buffers.Views;
1212
using Microsoft.Toolkit.HighPerformance.Extensions;
13+
using Microsoft.Toolkit.HighPerformance.Helpers.Internals;
1314

1415
namespace Microsoft.Toolkit.HighPerformance.Buffers
1516
{
@@ -233,15 +234,15 @@ public void Advance(int count)
233234
/// <inheritdoc/>
234235
public Memory<T> GetMemory(int sizeHint = 0)
235236
{
236-
CheckAndResizeBuffer(sizeHint);
237+
CheckBufferAndEnsureCapacity(sizeHint);
237238

238239
return this.array.AsMemory(this.index);
239240
}
240241

241242
/// <inheritdoc/>
242243
public Span<T> GetSpan(int sizeHint = 0)
243244
{
244-
CheckAndResizeBuffer(sizeHint);
245+
CheckBufferAndEnsureCapacity(sizeHint);
245246

246247
return this.array.AsSpan(this.index);
247248
}
@@ -251,9 +252,11 @@ public Span<T> GetSpan(int sizeHint = 0)
251252
/// </summary>
252253
/// <param name="sizeHint">The minimum number of items to ensure space for in <see cref="array"/>.</param>
253254
[MethodImpl(MethodImplOptions.AggressiveInlining)]
254-
private void CheckAndResizeBuffer(int sizeHint)
255+
private void CheckBufferAndEnsureCapacity(int sizeHint)
255256
{
256-
if (this.array is null)
257+
T[]? array = this.array;
258+
259+
if (array is null)
257260
{
258261
ThrowObjectDisposedException();
259262
}
@@ -268,12 +271,32 @@ private void CheckAndResizeBuffer(int sizeHint)
268271
sizeHint = 1;
269272
}
270273

271-
if (sizeHint > FreeCapacity)
274+
if (sizeHint > array!.Length - this.index)
272275
{
273-
int minimumSize = this.index + sizeHint;
276+
ResizeBuffer(sizeHint);
277+
}
278+
}
274279

275-
this.pool.Resize(ref this.array, minimumSize);
280+
/// <summary>
281+
/// Resizes <see cref="array"/> to ensure it can fit the specified number of new items.
282+
/// </summary>
283+
/// <param name="sizeHint">The minimum number of items to ensure space for in <see cref="array"/>.</param>
284+
[MethodImpl(MethodImplOptions.NoInlining)]
285+
private void ResizeBuffer(int sizeHint)
286+
{
287+
int minimumSize = this.index + sizeHint;
288+
289+
// The ArrayPool<T> class has a maximum threshold of 1024 * 1024 for the maximum length of
290+
// pooled arrays, and once this is exceeded it will just allocate a new array every time
291+
// of exactly the requested size. In that case, we manually round up the requested size to
292+
// the nearest power of two, to ensure that repeated consecutive writes when the array in
293+
// use is bigger than that threshold don't end up causing a resize every single time.
294+
if (minimumSize > 1024 * 1024)
295+
{
296+
minimumSize = BitOperations.RoundUpPowerOfTwo(minimumSize);
276297
}
298+
299+
this.pool.Resize(ref this.array, minimumSize);
277300
}
278301

279302
/// <inheritdoc/>

Microsoft.Toolkit.HighPerformance/Buffers/MemoryOwner{T}.cs

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -244,6 +244,11 @@ public MemoryOwner<T> Slice(int start, int length)
244244
ThrowInvalidLengthException();
245245
}
246246

247+
// We're transferring the ownership of the underlying array, so the current
248+
// instance no longer needs to be disposed. Because of this, we can manually
249+
// suppress the finalizer to reduce the overhead on the garbage collector.
250+
GC.SuppressFinalize(this);
251+
247252
return new MemoryOwner<T>(start, length, this.pool, array!);
248253
}
249254

Microsoft.Toolkit.HighPerformance/Buffers/StringPool.cs

Lines changed: 3 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -5,15 +5,13 @@
55
using System;
66
using System.Diagnostics.CodeAnalysis;
77
using System.Diagnostics.Contracts;
8-
#if NETCOREAPP3_1
9-
using System.Numerics;
10-
#endif
118
using System.Runtime.CompilerServices;
129
using System.Text;
1310
using Microsoft.Toolkit.HighPerformance.Extensions;
1411
#if !NETSTANDARD1_4
1512
using Microsoft.Toolkit.HighPerformance.Helpers;
1613
#endif
14+
using BitOperations = Microsoft.Toolkit.HighPerformance.Helpers.Internals.BitOperations;
1715

1816
#nullable enable
1917

@@ -79,8 +77,8 @@ static void FindFactors(int size, int factor, out int x, out int y)
7977
a = Math.Sqrt((double)size / factor),
8078
b = factor * a;
8179

82-
x = RoundUpPowerOfTwo((int)a);
83-
y = RoundUpPowerOfTwo((int)b);
80+
x = BitOperations.RoundUpPowerOfTwo((int)a);
81+
y = BitOperations.RoundUpPowerOfTwo((int)b);
8482
}
8583

8684
// We want to find two powers of 2 factors that produce a number
@@ -130,30 +128,6 @@ static void FindFactors(int size, int factor, out int x, out int y)
130128
Size = p2;
131129
}
132130

133-
/// <summary>
134-
/// Rounds up an <see cref="int"/> value to a power of 2.
135-
/// </summary>
136-
/// <param name="x">The input value to round up.</param>
137-
/// <returns>The smallest power of two greater than or equal to <paramref name="x"/>.</returns>
138-
[Pure]
139-
[MethodImpl(MethodImplOptions.AggressiveInlining)]
140-
private static int RoundUpPowerOfTwo(int x)
141-
{
142-
#if NETCOREAPP3_1
143-
return 1 << (32 - BitOperations.LeadingZeroCount((uint)(x - 1)));
144-
#else
145-
x--;
146-
x |= x >> 1;
147-
x |= x >> 2;
148-
x |= x >> 4;
149-
x |= x >> 8;
150-
x |= x >> 16;
151-
x++;
152-
153-
return x;
154-
#endif
155-
}
156-
157131
/// <summary>
158132
/// Gets the shared <see cref="StringPool"/> instance.
159133
/// </summary>
Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
// Licensed to the .NET Foundation under one or more agreements.
2+
// The .NET Foundation licenses this file to you under the MIT license.
3+
// See the LICENSE file in the project root for more information.
4+
5+
using System.Diagnostics.Contracts;
6+
using System.Runtime.CompilerServices;
7+
#if NETCOREAPP3_1
8+
using static System.Numerics.BitOperations;
9+
#endif
10+
11+
namespace Microsoft.Toolkit.HighPerformance.Helpers.Internals
12+
{
13+
/// <summary>
14+
/// Utility methods for intrinsic bit-twiddling operations. The methods use hardware intrinsics
15+
/// when available on the underlying platform, otherwise they use optimized software fallbacks.
16+
/// </summary>
17+
internal static class BitOperations
18+
{
19+
/// <summary>
20+
/// Rounds up an <see cref="int"/> value to a power of 2.
21+
/// </summary>
22+
/// <param name="x">The input value to round up.</param>
23+
/// <returns>The smallest power of two greater than or equal to <paramref name="x"/>.</returns>
24+
[Pure]
25+
[MethodImpl(MethodImplOptions.AggressiveInlining)]
26+
public static int RoundUpPowerOfTwo(int x)
27+
{
28+
#if NETCOREAPP3_1
29+
return 1 << (32 - LeadingZeroCount((uint)(x - 1)));
30+
#else
31+
x--;
32+
x |= x >> 1;
33+
x |= x >> 2;
34+
x |= x >> 4;
35+
x |= x >> 8;
36+
x |= x >> 16;
37+
x++;
38+
39+
return x;
40+
#endif
41+
}
42+
}
43+
}

UnitTests/UnitTests.HighPerformance.Shared/Buffers/Test_ArrayPoolBufferWriter{T}.cs

Lines changed: 39 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,7 @@
66
using System.Diagnostics.CodeAnalysis;
77
using System.IO;
88
using System.Linq;
9+
using System.Reflection;
910
using Microsoft.Toolkit.HighPerformance.Buffers;
1011
using Microsoft.Toolkit.HighPerformance.Extensions;
1112
using Microsoft.VisualStudio.TestTools.UnitTesting;
@@ -17,6 +18,44 @@ namespace UnitTests.HighPerformance.Buffers
1718
[SuppressMessage("StyleCop.CSharp.DocumentationRules", "SA1649", Justification = "Test class for generic type")]
1819
public class Test_ArrayPoolBufferWriterOfT
1920
{
21+
[TestCategory("ArrayPoolBufferWriterOfT")]
22+
[TestMethod]
23+
[DataRow(0, 256)] // 256 is the default initial size for ArrayPoolBufferWriter<T>
24+
[DataRow(4, 256)]
25+
[DataRow(7, 256)]
26+
[DataRow(27, 256)]
27+
[DataRow(188, 256)]
28+
[DataRow(257, 512)]
29+
[DataRow(358, 512)]
30+
[DataRow(799, 1024)]
31+
[DataRow(1024, 1024)]
32+
[DataRow(1025, 2048)]
33+
[DataRow((1024 * 1024) - 1, 1024 * 1024)]
34+
[DataRow(1024 * 1024, 1024 * 1024)]
35+
[DataRow((1024 * 1024) + 1, 2 * 1024 * 1024)]
36+
[DataRow(2 * 1024 * 1024, 2 * 1024 * 1024)]
37+
[DataRow((2 * 1024 * 1024) + 1, 4 * 1024 * 1024)]
38+
[DataRow(3 * 1024 * 1024, 4 * 1024 * 1024)]
39+
public void Test_ArrayPoolBufferWriterOfT_BufferSize(int request, int expected)
40+
{
41+
using var writer = new ArrayPoolBufferWriter<byte>();
42+
43+
// Request a Span<T> of a specified size and discard it. We're just invoking this
44+
// method to force the ArrayPoolBufferWriter<T> instance to internally resize the
45+
// buffer to ensure it can contain at least this number of items. After this, we
46+
// can use reflection to get the internal array and ensure the size equals the
47+
// expected one, which matches the "round up to power of 2" logic we need. This
48+
// is documented within the resize method in ArrayPoolBufferWriter<T>, and it's
49+
// done to prevent repeated allocations of arrays in some scenarios.
50+
_ = writer.GetSpan(request);
51+
52+
var arrayFieldInfo = typeof(ArrayPoolBufferWriter<byte>).GetField("array", BindingFlags.Instance | BindingFlags.NonPublic);
53+
54+
byte[] array = (byte[])arrayFieldInfo!.GetValue(writer);
55+
56+
Assert.AreEqual(array!.Length, expected);
57+
}
58+
2059
[TestCategory("ArrayPoolBufferWriterOfT")]
2160
[TestMethod]
2261
public void Test_ArrayPoolBufferWriterOfT_AllocateAndGetMemoryAndSpan()

0 commit comments

Comments
 (0)