Copilot AI commented Jan 9, 2026

Description

File system enumeration re-checked the pattern's shape on every entry to decide whether a fast path applies (for example, *literal → EndsWith). This change moves pattern analysis to enumerable creation time and creates specialized delegates up front.

Changes:

  • Add GetPredicate<T> that analyzes patterns once and returns optimized FileSystemEnumerable<T>.FindPredicate delegates:
    • * → always-true predicate (with IsDirectory/IsFile check)
    • *literal → EndsWith (existing optimization, moved earlier)
    • literal* → StartsWith (new)
    • *literal* → Contains (new)
  • Complex patterns fall back to full NFA-based matching
  • Extract wildcard constants as SearchValues<char> (SimpleWildcards, ExtendedWildcards) to internal static readonly fields in FileSystemName for shared use
  • Combine IsDirectory/IsFile check with pattern matcher into a single delegate invocation (avoids double delegate call)
  • Capture only the expression string in lambdas and compute span slices inline on each invocation to avoid string allocations and minimize capture overhead
  • Switch over (useExtendedWildcards, entryType) tuple to return distinct delegates without capturing unnecessary variables
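The pattern-to-delegate mapping in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's GetPredicate<T>: the PatternPredicates class, its Create method, and the IsLiteral helper are hypothetical names, it works on strings rather than FileSystemEntry, it handles only simple wildcards, and it conservatively sends any pattern containing a backslash to the full matcher. The fallback uses the public FileSystemName.MatchesSimpleExpression API.

```csharp
using System;
using System.IO.Enumeration;

// Hypothetical helper for illustration only; not the code added by this PR.
public static class PatternPredicates
{
    // Classifies the pattern once and returns a delegate specialized for its shape.
    public static Func<string, bool> Create(string expression, bool ignoreCase = true)
    {
        StringComparison comparison = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;

        // "*" matches every name.
        if (expression == "*")
            return _ => true;

        bool leadingStar = expression.Length > 1 && expression[0] == '*';
        bool trailingStar = expression.Length > 1 && expression[^1] == '*';

        // *literal -> EndsWith. Only the string is captured; the slice is taken per call.
        if (leadingStar && IsLiteral(expression.AsSpan(1)))
            return name => name.AsSpan().EndsWith(expression.AsSpan(1), comparison);

        // literal* -> StartsWith
        if (trailingStar && IsLiteral(expression.AsSpan(0, expression.Length - 1)))
            return name => name.AsSpan().StartsWith(
                expression.AsSpan(0, expression.Length - 1), comparison);

        // *literal* -> Contains
        if (leadingStar && trailingStar && expression.Length > 2 &&
            IsLiteral(expression.AsSpan(1, expression.Length - 2)))
            return name => name.AsSpan().Contains(
                expression.AsSpan(1, expression.Length - 2), comparison);

        // Anything else goes through the general matcher.
        return name => FileSystemName.MatchesSimpleExpression(expression, name, ignoreCase);
    }

    // Assumption: treat '*', '?', and the escape character '\' as non-literal.
    private static bool IsLiteral(ReadOnlySpan<char> text) =>
        text.IndexOfAny('*', '?', '\\') < 0;
}
```

With this sketch, PatternPredicates.Create("*.txt") returns a delegate that only performs an EndsWith comparison per name, while a pattern such as "f?le*" still takes the general path.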

Customer Impact

Performance improvement for file enumeration with common glob patterns. Each file entry now calls a simple string operation instead of re-evaluating pattern structure and potentially running the full matching algorithm.
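For readers unfamiliar with the per-entry predicate model, the sketch below uses the public FileSystemEnumerable<TResult> API to show what "a simple string operation per entry" looks like from the outside. It is an illustration under assumptions, not the internal code path changed here: EnumerationSketch and FilesEndingWith are hypothetical names, and the predicate hard-codes the *literal (EndsWith) case while folding the entry-type check into the same delegate.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Enumeration;

// Hypothetical helper for illustration only.
public static class EnumerationSketch
{
    // Enumerates full paths of files (not directories) whose names end with `suffix`.
    public static IEnumerable<string> FilesEndingWith(string directory, string suffix)
    {
        return new FileSystemEnumerable<string>(
            directory,
            (ref FileSystemEntry entry) => entry.ToFullPath(),
            new EnumerationOptions())
        {
            // A single delegate performs both the entry-type check and the name check,
            // so each entry costs one delegate invocation and one span comparison.
            ShouldIncludePredicate = (ref FileSystemEntry entry) =>
                !entry.IsDirectory &&
                entry.FileName.EndsWith(suffix.AsSpan(), StringComparison.OrdinalIgnoreCase)
        };
    }
}
```

Calling EnumerationSketch.FilesEndingWith(path, ".txt") is roughly comparable in intent to Directory.EnumerateFiles(path, "*.txt"), though the built-in method applies additional pattern normalization that this sketch does not attempt.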

Regression

No. This is a performance optimization.

Testing

  • All 9189 existing FileSystem tests pass
  • Added 30 new test cases for literal* (StartsWith) and *literal* (Contains) patterns in FileSystemNameTests.cs
  • Added 3 integration tests in PatternTransformTests.cs exercising actual file enumeration with optimized patterns

Risk

Low. The optimization preserves existing behavior by falling back to full pattern matching for any pattern not matching the fast paths. All existing tests pass unchanged.

Original prompt

When doing file system enumeration, every entry returned by the OS is then checked to see whether it matches the caller-provided pattern. That's handled by a delegate that eventually calls into a MatchPattern routine. That routine has special cases, e.g. if the pattern is *literal it'll just do EndsWith. But that means that on every entry it's rechecking whether any of those fast paths apply. Instead, the entry points that create those delegates could create different delegates for each of the most important special cases, such that each call doesn't need to recheck the condition. It'd also be good to add additional special cases. For example, right now it special-cases *literal to use EndsWith, but not literal* to use StartsWith, and not *literal* to use Contains. Please optimize all of these things.


@dotnet-policy-service

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Copilot AI changed the title from "[WIP] Optimize file system enumeration pattern matching" to "Optimize file system enumeration pattern matching with specialized delegates" Jan 9, 2026
Copilot AI requested a review from stephentoub January 9, 2026 20:13
…, combine IsDirectory check with matcher

Co-authored-by: stephentoub <[email protected]>
@stephentoub

@EgorBot -amd -intel -arm

using System.IO;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

[HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
[MemoryDiagnoser(false)]
public partial class Tests
{
    private string _directoryPath;

    [GlobalSetup]
    public void Setup()
    {
        _directoryPath = Path.GetTempFileName();
        File.Delete(_directoryPath);

        Directory.CreateDirectory(_directoryPath);
        for (int i = 0; i < 10_000; i++)
        {
            File.Create(Path.Combine(_directoryPath, $"file{i}.txt")).Dispose();
        }
    }

    // No pattern argument: EnumerateFiles defaults to "*" (match everything).
    [Benchmark]
    public int Sum1()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath)) sum += path.Length;
        return sum;
    }

    // literal* pattern: exercises the StartsWith case.
    [Benchmark]
    public int Sum2()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "file*")) sum += path.Length;
        return sum;
    }

    // *literal pattern: exercises the EndsWith case.
    [Benchmark]
    public int Sum3()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*.txt")) sum += path.Length;
        return sum;
    }

    // *literal* pattern: exercises the Contains case.
    [Benchmark]
    public int Sum4()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*12*")) sum += path.Length;
        return sum;
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        Directory.Delete(_directoryPath, true);
    }
}
