Conversation

Copilot
Contributor

@Copilot Copilot AI commented Oct 6, 2025

  • Review current implementation of BrotliUtils.GetQualityFromCompressionLevel
  • Change CompressionLevel.Fastest mapping from quality 1 to quality 2 (see the sketch after this list)
  • Build the library to ensure no compilation errors
  • Run existing tests to ensure compatibility
  • Verify the change resolves the issue with small writes
  • Add comment explaining why quality 2 is used instead of 0 or 1
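
For reference, a minimal sketch of what the adjusted mapping could look like is below. The wrapper class name, the switch shape, the quality values used for the other CompressionLevel members, and the exception type are assumptions for illustration only; the actual BrotliUtils source in dotnet/runtime may differ.

using System;
using System.IO.Compression;

internal static class BrotliUtilsSketch
{
    // Sketch only -- not the actual dotnet/runtime BrotliUtils implementation.
    internal static int GetQualityFromCompressionLevel(CompressionLevel compressionLevel) =>
        compressionLevel switch
        {
            CompressionLevel.NoCompression => 0,
            // Qualities 0 and 1 compress small incremental writes so poorly that
            // the output can be larger than the input, so Fastest maps to
            // quality 2, the lowest quality that still handles such writes well.
            CompressionLevel.Fastest => 2,
            CompressionLevel.Optimal => 4,        // assumed default quality
            CompressionLevel.SmallestSize => 11,  // Brotli's maximum quality
            _ => throw new ArgumentOutOfRangeException(nameof(compressionLevel)),
        };
}
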
Original prompt

This section describes the original issue you should resolve

<issue_title>Brotli very inefficient with smaller writes</issue_title>
<issue_description>When writing to a Brotli stream one line at a time, the compressed output is actually larger than the uncompressed input.

Repro:

using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace BrotliTest
{
    class Program
    {
        static async Task Main(string[] args)
        {
            const int NUMBER_OF_LINES = 10000;
            const string SAMPLE_STRING = "hello this should compress well\n";

            var inputString = string.Concat(Enumerable.Repeat(SAMPLE_STRING, NUMBER_OF_LINES));
            var inputBytes = Encoding.UTF8.GetBytes(inputString);
            Console.WriteLine($"Input size: {inputBytes.Length} bytes");

            using (var outputMemory = new MemoryStream())
            using (var outputStream = new BrotliStream(outputMemory, CompressionLevel.Fastest))
            {
                await outputStream.WriteAsync(inputBytes);
                Console.WriteLine($"Output size (all at once): {outputMemory.Length} bytes");
            }

            using (var outputMemory = new MemoryStream())
            using (var outputStream = new BrotliStream(outputMemory, CompressionLevel.Fastest))
            {
                var bytes = Encoding.UTF8.GetBytes(SAMPLE_STRING);
                for (var i = 0; i < NUMBER_OF_LINES; i++)
                {
                    await outputStream.WriteAsync(bytes);
                }
                Console.WriteLine($"Output size (line by line): {outputMemory.Length} bytes");
            }

            using (var outputMemory = new MemoryStream())
            using (var outputStream = new BrotliStream(outputMemory, CompressionLevel.Fastest))
            {
                foreach (var inputByte in inputBytes)
                {
                    await outputStream.WriteAsync(new[] { inputByte });
                }

                Console.WriteLine($"Output size (byte by byte): {outputMemory.Length} bytes");
            }

            Console.ReadKey();
        }
    }
}

Output:

Input size: 320000 bytes
Output size (all at once): 105 bytes
Output size (line by line): 350000 bytes
Output size (byte by byte): 1280000 bytes

Buffering the entire contents in memory then writing it all in a single write avoids the issue, but then that defeats the purpose of using a stream (may as well just use a function that compresses a byte array, and totally avoid streams).</issue_description>

<agent_instructions>Change Brotli to use Quality 2 for CompressionLevel.Fastest.</agent_instructions>

Comments on the Issue (you are @copilot in this section)

<comment_new>@stephentoub

> Adding an internal buffer to BrotliStream, as has been done with other streams in the past, would be a good start.

This is why BufferedStream exists. With the main exception of FileStream, which has a buffer purely for perf (but where we generally regret the inclusion of the buffer directly), other streams that include buffers typically do so out of necessity (e.g. there's a minimum size required to perform its transformation) rather than for perf.</comment_new>
<comment_new>@stephentoub
There's a difference between being functional and having great performance. Lots of things improve on throughput when input is buffered and processed in larger chunks. For example, every write to a NetworkStream involves a syscall; that doesn't mean we should add a buffer into NetworkStream. If you want your input buffered, you add buffering on top.</comment_new>
<comment_new>@stephentoub

this is a functionality issue

It's doing what the user asked: compressing input at the level requested with the data requested at the time requested. There might be other things that can be improved, but forcing input buffering as part of the stream implementation itself is not the answer. With many compression algorithms, changing the chunk size improves compression ratio; it's not the job of BrotliStream to decide what an acceptable min compression ratio is... its job is to provide a Stream facade on top of the BrotliEncoder/Decoder. And as @svick outlined, changing the compression level improves things, as does using BufferedStream, whose entire reason to exist is to make it easy to layer on buffering to improve performance, whether that performance be about throughput or compression ratios.</comment_new>
<comment_new>@stephentoub

I would have assumed that a compression stream buffers internally as much data as is required to achieve optimum compression in accordance with the level set.

Achieving the optimal level of compression might require buff...
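
As a concrete illustration of the "layer buffering on top" approach described in the comments above, the issue's line-by-line repro can wrap the BrotliStream in a BufferedStream so the encoder sees fewer, larger writes. This is a sketch only: the 64 KB buffer size is an arbitrary choice for the example, not a value recommended by this PR.

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

var line = Encoding.UTF8.GetBytes("hello this should compress well\n");

using var outputMemory = new MemoryStream();
using (var brotli = new BrotliStream(outputMemory, CompressionLevel.Fastest, leaveOpen: true))
using (var buffered = new BufferedStream(brotli, bufferSize: 64 * 1024))
{
    for (var i = 0; i < 10_000; i++)
    {
        buffered.Write(line);   // small writes accumulate in BufferedStream's buffer
    }
}                               // disposing flushes buffered -> brotli -> outputMemory

Console.WriteLine($"Output size (buffered, line by line): {outputMemory.Length} bytes");
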

Fixes #36245


@Copilot Copilot AI changed the title from "[WIP] Brotli very inefficient with smaller writes" to "Change Brotli CompressionLevel.Fastest to use quality 2 instead of 1" Oct 6, 2025
@Copilot Copilot AI requested a review from rzikm October 6, 2025 10:17
Copilot finished work on behalf of rzikm October 6, 2025 10:17
Member

@MihaZupan MihaZupan left a comment

Seems reasonable to me.
I'd just add a comment for why we don't want to use 0 or 1 in this case.

@rzikm
Member

rzikm commented Oct 7, 2025

@copilot address review comments

Contributor Author

Copilot AI commented Oct 7, 2025

Seems reasonable to me.
I'd just add a comment for why we don't want to use 0 or 1 in this case.

Added a comment explaining why quality 2 is used instead of 0 or 1 for CompressionLevel.Fastest. The comment explains that qualities 0 and 1 produce poor compression for incremental writes, often resulting in output larger than the uncompressed input. (48e9b7d)

Copilot finished work on behalf of rzikm October 7, 2025 06:20
@rzikm rzikm marked this pull request as ready for review October 7, 2025 06:30
@Copilot Copilot AI review requested due to automatic review settings October 7, 2025 06:30
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Adjusts the internal Brotli quality mapping for CompressionLevel.Fastest to improve compression effectiveness for incremental (small chunk) writes without requiring external buffering.

  • Map CompressionLevel.Fastest from quality 1 to quality 2
  • Add explanatory inline comment documenting the rationale

@rzikm rzikm enabled auto-merge (squash) October 7, 2025 06:31
@rzikm rzikm disabled auto-merge October 7, 2025 06:31
@rzikm rzikm added the breaking-change label (Issue or PR that represents a breaking API or functional change over a prerelease) Oct 7, 2025
@dotnet-policy-service dotnet-policy-service bot added the needs-breaking-change-doc-created label (Breaking changes need an issue opened with https://github.com/dotnet/docs/issues/new?template=dotnet) Oct 7, 2025
@rzikm rzikm removed the breaking-change and needs-breaking-change-doc-created labels Oct 10, 2025
@rzikm
Member

rzikm commented Oct 10, 2025

/ba-g Test failures are unrelated

@rzikm rzikm merged commit 9fb66db into main Oct 10, 2025
90 of 92 checks passed
@akoeplinger akoeplinger deleted the copilot/fix-af986064-2b8a-446b-8c11-7a5b8f412b6b branch October 11, 2025 14:06