
Introduce size-optimized IList Select iterator #118156


Open · wants to merge 12 commits into main

Conversation

agocke
Member

@agocke agocke commented Jul 29, 2025

This lets us keep some of the constant-time indexing advantages of the IList iterator, without the GVM overhead of Select. There is a small size increase here, but nowhere near the cost of the GVM.

In a pathological generated example for GVMs the cost was:

  1. .NET 9: 12 MB
  2. .NET 10 w/out this change: 2.2 MB
  3. .NET 10 w/ this change: 2.3 MB

In a real-world example (AzureMCP), the size attributed to System.Linq was:

  1. .NET 9: 1.2 MB
  2. .NET 10 w/out this change: 340 KB
  3. .NET 10 w/ this change: 430 KB

This seems like a good tradeoff. We mostly keep the algorithmic complexity the same across the size- and speed-optimized versions, and just trade off on the margins. We could probably continue to improve this in the future.

Fixes #115033
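
To make the idea concrete, here is a minimal sketch (my own illustration, not the code in this PR; all type and member names are hypothetical) of a Select iterator that special-cases IList<T> sources so counting and element access go through the list directly, while defining no generic virtual methods:

using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical sketch only. The real iterator lives in Iterator.SizeOpt.cs and
// its shape may differ; the point is that Count and element access go through
// the IList<T> source, and no generic virtual methods are involved.
internal sealed class IListSelectSketch<TSource, TResult> : IEnumerable<TResult>
{
    private readonly IList<TSource> _source;
    private readonly Func<TSource, TResult> _selector;

    public IListSelectSketch(IList<TSource> source, Func<TSource, TResult> selector)
    {
        _source = source;
        _selector = selector;
    }

    // O(1): the projection has exactly as many elements as the source list.
    public int Count => _source.Count;

    // O(1): project a single element via the indexer instead of enumerating to it.
    public TResult ElementAt(int index) => _selector(_source[index]);

    public IEnumerator<TResult> GetEnumerator()
    {
        for (int i = 0; i < _source.Count; i++)
        {
            yield return _selector(_source[i]);
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}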

Contributor

Tagging subscribers to this area: @dotnet/area-system-linq
See info in area-owners.md if you want to be subscribed.

@agocke agocke marked this pull request as ready for review July 29, 2025 20:32
@Copilot Copilot AI review requested due to automatic review settings July 29, 2025 20:32
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces a size-optimized IList-based Select iterator to reduce binary size while maintaining constant-time indexing advantages. The change removes the size-optimized variants of Take and Skip operations in favor of a more targeted optimization for Select operations on IList collections.

Key changes:

  • Introduces SizeOptIListSelectIterator for size-optimized Select operations on IList collections (see the dispatch sketch after this list)
  • Removes size-optimized Take and Skip iterator implementations
  • Updates Take and Skip to always use speed-optimized variants
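
A rough sketch of the kind of dispatch described above (again an illustration under assumptions; the real Select has additional fast paths, argument validation, and different type names):

using System;
using System.Collections.Generic;

internal static class SelectDispatchSketch
{
    public static IEnumerable<TResult> Select<TSource, TResult>(
        IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        // IList detection: sources with O(1) Count and indexing are handed to
        // the size-optimized IList iterator (IListSelectSketch is the
        // hypothetical iterator sketched in the PR description above), so
        // Count()/ElementAt() over the projection can stay constant time.
        if (source is IList<TSource> ilist)
        {
            return new IListSelectSketch<TSource, TResult>(ilist, selector);
        }

        // Everything else falls back to plain deferred enumeration.
        return PlainSelect(source, selector);

        static IEnumerable<TResult> PlainSelect(IEnumerable<TSource> src, Func<TSource, TResult> sel)
        {
            foreach (TSource item in src)
            {
                yield return sel(item);
            }
        }
    }
}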

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Summary per file:

  • src/libraries/System.Linq/src/System/Linq/Take.cs: removes the conditional size optimization; always uses the speed-optimized iterator
  • src/libraries/System.Linq/src/System/Linq/Take.SizeOpt.cs: removes the size-optimized Take iterator implementation
  • src/libraries/System.Linq/src/System/Linq/Skip.cs: removes the conditional size optimization; always uses the speed-optimized iterator
  • src/libraries/System.Linq/src/System/Linq/Skip.SizeOpt.cs: removes the file containing the size-optimized Skip iterator
  • src/libraries/System.Linq/src/System/Linq/Select.cs: adds IList detection and uses the new size-optimized IList iterator
  • src/libraries/System.Linq/src/System/Linq/Iterator.SizeOpt.cs: new file implementing the size-optimized IList Select iterator
  • src/libraries/System.Linq/src/System.Linq.csproj: includes the new Iterator.SizeOpt.cs and removes Skip.SizeOpt.cs

@agocke
Member Author

agocke commented Aug 1, 2025

/azp run runtime-nativeaot-outerloop


Azure Pipelines successfully started running 1 pipeline(s).

This takes a more aggressive direction and removes the size-optimized versions of the Skip and Take iterators. As far as I can tell the resulting size increases are relatively small, and always using the 'speed' versions preserves their O(1) optimizations.
@agocke
Member Author

agocke commented Aug 2, 2025

Updated to be slightly more aggressive: while doing more thorough testing I found more code paths that went from O(1) to O(n) with sizeopt enabled. I've fixed them by just removing sizeopt for Take and Skip. This doesn't seem to seriously hurt the binary size. I bet there are some small improvements we can make (e.g. folding array and IList into the same code path), but this seems like a pretty good middle ground.
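
For illustration (my own example, not taken from the PR), this is the kind of pattern that regresses from O(1) to O(n) when Skip and Take lose their IList-aware iterators:

using System.Collections.Generic;
using System.Linq;

List<int> data = Enumerable.Range(0, 2_000_000).ToList();

// With the IList-aware ("speed") Skip/Take iterators, both calls are answered
// from the list's Count and indexer in O(1). With a plain enumerator-based
// Skip/Take, they walk up to a million elements, i.e. O(n).
int first = data.Skip(1_000_000).First();
int count = data.Take(1_500_000).Count();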

@agocke
Member Author

agocke commented Aug 2, 2025

Also, I added a unit test for the Select().Count() case you mentioned. Couldn't find an existing one.
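
A hedged sketch of what such a test could look like (assuming xunit; the test actually added in the PR may assert different details):

using System.Collections.Generic;
using System.Linq;
using Xunit;

public class SelectCountTests
{
    [Fact]
    public void SelectOverIList_Count_EqualsSourceCount()
    {
        IList<int> source = new List<int> { 1, 2, 3, 4, 5 };

        // Count() over the projection must equal the source count regardless of
        // which Select iterator (size- or speed-optimized) is in use.
        Assert.Equal(source.Count, source.Select(x => x * 2).Count());
    }
}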

@agocke agocke requested a review from MichalStrehovsky August 2, 2025 02:48
@reflectronic
Contributor

reflectronic commented Aug 2, 2025

There are similar complexity issues with Count, ElementAt, Contains, and Last. They still persist with this PR, right?

@agocke
Member Author

agocke commented Aug 2, 2025

There are similar complexity issues with Count, ElementAt, Contains, and Last. They still persist with this PR, right?

Yes. I haven't tried to walk through every scenario in this PR, just one Select/Take/Skip scenario.

@agocke
Member Author

agocke commented Aug 2, 2025

Also, I'm surprised Contains could have this problem, since it's inherently O(n). But I haven't looked at that code.

@reflectronic
Contributor

HashSet<int> x = [...];
HashSet<int> y = [...];

bool contains = x.Union(y).Contains(100);

UnionIterator2<int>.Contains has complexity x.Contains(100) || y.Contains(100), which is O(1)

@reflectronic
Contributor

If the complexity has to be fixed for all of those methods, there isn't much left in Iterator.SpeedOpt.cs: just ToList, ToArray, and TryGetFirst. Pessimizing just those operators seems a little unbalanced--is that even worth it?

I think the biggest issue here is the quadratic growth with Select. That has led to serious customer failures (#102131). How bad are the regressions if you remove all the size optimizations except non-GVM Select? That seems like it reaps most of the size benefits without "playing performance games."

@agocke
Member Author

agocke commented Aug 2, 2025

How bad are the regressions if you remove all the size optimizations except non-GVM Select? That seems like it reaps most of the size benefits without "playing performance games."

I agree, but I’d still rather do the other ones in a different PR.

UnionIterator2.Contains

Ah I didn’t realize there was special support for sets.

@MichalStrehovsky
Member

I agree, but I’d still rather do the other ones in a different PR.

I think we'd want to evaluate the overall tradeoff here instead of doing several individual papercuts. Take all the changes we'd want to make here, evaluate whether the difference between sizeopt and speedopt is still meaningful, and whether we reached the stated goal (no differences in algorithmic complexity between sizeopt and speedopt). The changes can still go in through individual PRs for reviewability, but the full scope of the regression and the achievability of the stated goal are important. If we cannot achieve 100% the same algorithmic complexity, or if the sizeopt/speedopt size difference becomes insignificant, it is a very different conversation.

For example, the first iteration of this PR was a 0.3% size regression for WebApiAot. The second iteration is already at 1.3%. How much more can we expect? We know the impact on throughput is going to be zero, because I measured it in the past.

We'd want similar numbers for Mono iDevices/Android/WASM, since this regresses those too, and that would be a version-to-version regression. It might also be a tougher sell: we received zero TP complaints over the years this was the default, and in .NET 10 we make it possible for the user to just disable this (it was not even possible to disable in the past, so if someone had raised it, we'd not have been able to offer a solution).

@MichalStrehovsky
Member

How bad are the regressions if you remove all the size optimizations except non-GVM Select? That seems like it reaps most of the size benefits without "playing performance games."

We have those numbers for native AOT in various places. If you're in the pathological case, the GVM impact is the biggest contributor: #109978 (comment). If you're in the non-pathological case, the non-GVM expansion dominates (e.g. #109978 (comment) is Avalonia without the GVM fix, about 1 MB; the GVM bypass saves another 300 kB).

For non-Native AOT scenarios, I don't think GVM bypass does anything because Mono doesn't try to analyze GVMs and falls back to universal slow generic code.

We could reconsider placing the GVM bypass behind a separate feature switch but that runs into problems discussed in #109978 (comment). The GVM bypass has interactions with SizeOpt, so an app will behave differently if you enable one, the other, or both. And I don't know how we'd doc two feature switches that do various tradeoffs to LINQ (and set different defaults based on platforms).

Sometimes simpler is just better. I'd not lose sleep over won't fixing #115033 despite the heated conversation.

@agocke
Member Author

agocke commented Aug 5, 2025

I tested out my suggestion above: folding List, Array, and IList for Select all into IList seems to have significant benefits. It seems to win back ~30% of the difference. I'd still rather do those changes in a separate PR: if we have to revert any of these PRs, I'd rather revert only one.

I'm still looking at the other places where we separate List/Array/IList, like Where.
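
For context on why the folding works (my own illustration, not from the PR): arrays and List<T> both implement IList<T>, so a single IList<T>-based Select iterator can serve all three source shapes; the cost is presumably the array/List-specific micro-optimizations, not algorithmic complexity.

using System;
using System.Collections.Generic;

int[] array = { 1, 2, 3 };
List<int> list = new() { 4, 5, 6 };

// T[] and List<T> both implement IList<T>, so a single IList<T>-based Select
// iterator can serve arrays, lists, and any other IList<T> source, instead of
// keeping three specialized iterator types per (TSource, TResult) pair.
IList<int> fromArray = array;
IList<int> fromList = list;

Console.WriteLine($"{fromArray.Count} + {fromList.Count} elements reachable via IList<int>");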

agocke added 3 commits August 8, 2025 16:29
These are more (relatively common) cases where you could end up with an
O(n) implementation instead of O(1) even when the backing enumerable is
capable of doing O(1) index access.
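
A hypothetical example of the kind of case this commit group covers: when the backing source supports O(1) index access, an IList-aware iterator chain can answer these queries through the indexer, while a plain enumerator-based chain walks up to the requested position.

using System.Collections.Generic;
using System.Linq;

List<int> data = Enumerable.Range(0, 1_000_000).ToList();

// O(1) when the Select iterator forwards element access to the source indexer;
// O(n) if it has to enumerate and apply the selector up to the requested position.
int nearEnd = data.Select(v => v * 2).ElementAt(999_999);
int last = data.Select(v => v * 2).Last();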
@agocke
Member Author

agocke commented Aug 9, 2025

Added a number of commits here:

  • A huge amount of the extra size turned out to be not in LINQ but in equality comparers. That seemed to come in through a new path to OfType, but the IList use in OfType was easy to skip via the feature switch.
  • I addressed what I saw as the high-priority O(1) -> O(n) cases. I think everything that remains we can probably leave alone.
  • To gain back some of the size, I trimmed away the separate specializations for Array and List in favor of just IList.

My local testing shows the webaot project going from ~100 KB extra to only ~40 KB extra. I haven't yet re-run the test for AzureMCP.


Successfully merging this pull request may close these issues.

Massive performance regression .NET 9 vs .NET 10
4 participants