Description
There are some issues with our current pacing mechanism that lead to request starvation on slow networks or with a low requested bandwidth. First, requests get flighted based upon the number of currently live threads. If a bandwidth limit is applied, request bodies are wrapped so that they pull their bytes from a frequently refreshed "token bucket": an atomic int64 with a worker routine constantly feeding it bytes. Because this works on a first-come-first-serve basis, if there are too many requests, some requests will never be allocated bandwidth in a reasonable amount of time, which results in those requests getting dropped.
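The old mechanism described above can be sketched roughly as follows. This is a minimal illustration, not AzCopy's actual code; the type and method names are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// tokenBucket sketches the old pacing primitive: an atomic int64 that a
// worker goroutine tops up on a fixed interval, with request bodies pulling
// bytes from it first-come-first-serve.
type tokenBucket struct {
	available int64 // bytes currently available this tick
	capacity  int64 // bytes fed in per refill
}

// refill runs as the worker goroutine, constantly feeding the bucket bytes.
func (b *tokenBucket) refill(interval time.Duration, done <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			atomic.StoreInt64(&b.available, b.capacity)
		case <-done:
			return
		}
	}
}

// take attempts to claim n bytes. Whichever caller gets here first wins,
// which is the root of the starvation problem: with too many requests,
// latecomers can find the bucket drained on every single tick.
func (b *tokenBucket) take(n int64) bool {
	for {
		cur := atomic.LoadInt64(&b.available)
		if cur < n {
			return false
		}
		if atomic.CompareAndSwapInt64(&b.available, cur, cur-n) {
			return true
		}
	}
}

func main() {
	b := &tokenBucket{available: 1024, capacity: 1024}
	fmt.Println(b.take(512))  // succeeds: bytes were available
	fmt.Println(b.take(1024)) // fails: bucket is drained until the next refill
}
```

Nothing here prevents one set of requests from winning the compare-and-swap race every tick, which is exactly the starvation this PR addresses.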
To that end, the pacing mechanism has been reworked slightly. Another vector of gating is applied: we prevent requests from going out the door until we're certain we can allocate them a reasonable minimum amount of bandwidth. Atop this, bandwidth allocation is no longer handled first-come-first-serve but instead in a structured manner: each request that has been given the go-live receives a portion of the requested maximum bandwidth.
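A rough sketch of that gating plus structured allocation, under the assumption of an equal split and a fixed per-request floor (both the names and the equal-split policy are illustrative, not the PR's exact scheme):

```go
package main

import "fmt"

// minBytesPerSec is an assumed per-request bandwidth floor: a request is
// only admitted if it (and everyone already live) can still be guaranteed
// at least this much.
const minBytesPerSec = 1_000_000

// canAdmit reports whether one more request can go live while still leaving
// every live request a reasonable minimum amount of bandwidth.
func canAdmit(maxBytesPerSec int64, live int) bool {
	return maxBytesPerSec/int64(live+1) >= minBytesPerSec
}

// shareFor returns each admitted request's portion of the requested maximum,
// assuming an equal split among live requests.
func shareFor(maxBytesPerSec int64, live int) int64 {
	if live == 0 {
		return 0
	}
	return maxBytesPerSec / int64(live)
}

func main() {
	max := int64(8_000_000) // e.g. an 8 MB/s cap
	fmt.Println(canAdmit(max, 7))  // true: an 8th request still gets the floor
	fmt.Println(canAdmit(max, 8))  // false: a 9th would push shares below it
	fmt.Println(shareFor(max, 8))  // 1000000 bytes/sec each
}
```

The key difference from the old bucket is that admission is decided up front, so a request either never goes out the door or is guaranteed its share, rather than racing for bytes each tick.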
So far, this does seem to be working to correctly limit bandwidth, but there are some issues with my implementation I'm still working through. Hence, this lives in draft status to demonstrate the work done.
Still, too many requests can be let through the gates. While I don't get service timeouts anymore under the new code, the requests do live for... quite a while, and sometimes wind up having their network connections closed (a property of network equipment in between?). Either way, less than optimal. The mechanism for gating requests is very immature: it was a naive implementation where we'd try to divvy things up close to Azure's minimum allowed throughput. My plan for this is to graft the concurrency tuner onto this and combine the efforts somewhat, given it also acts on a mechanism of scaling against concurrent requests. Combining these efforts also comes with the benefit of not having a bunch of effectively dead threads waiting around.
DONE, as of 5fda4d7
- Retries: because the new pacer was grafted directly over the old code, a request body continues to live through multiple retries, resulting in the stream getting seeked back to 0 and bypassing the request gating mechanism. This is a simple enough fix, fortunately; I'm inclined to believe I can apply this gating via a post-retry policy, and it should work fine.
- Hiccups in throughput: because of the mechanism this acts on, many requests wind up completing at the same time, and the ticker takes another second to accept new requests. The main thing I'm thinking here is that I should add some randomness to the allocations, so that requests complete at different times and sort of blend together for the sake of throughput consistency. There's another case I've observed where a possible bug in the early-clear mechanism can lead to an overage of the requested bandwidth limit.
Debating whether or not to keep the bandwidth detector. The implementation *works*, but it's kind of unnecessary compared to just using AzCopy's native bandwidth detection. I'd expected that something a bit more real-time would be useful for this, but I'm not so sure at this point.
Related Links
Type of Change
How Has This Been Tested?
Currently, very manual testing. The bucket rotator / bandwidth detector has some formal unit tests, but I'm not sure whether that component is staying.