@lucix-aws lucix-aws commented Nov 13, 2025

[DRAFT]

Relates aws/aws-sdk-go-v2#3197

Middleware building today requires that we allocate two structures per step: a slice of IDs that maintains order, and a mapping of IDs to middlewares. Then at execution time, we build a third structure using those two - we synthesize an ordered list of middlewares, and wrap them in reverse to create the final decorated step.
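As a rough illustration of the shape described above, here is a minimal sketch of a step that keeps an ordered ID slice plus an ID-to-middleware map, then builds the ordered list and wraps it in reverse at execution time. The `Handler`, `Middleware`, `step`, `decorated`, `tag`, and `terminal` types are simplified, hypothetical stand-ins, not the SDK's real interfaces.

```go
package main

import "fmt"

// Handler and Middleware are simplified, hypothetical stand-ins for the
// per-step interfaces; the real ones also carry a context and typed I/O.
type Handler interface {
	Handle(in string) string
}

type Middleware interface {
	ID() string
	Handle(in string, next Handler) string
}

// step holds the two per-step structures: ordering and lookup.
type step struct {
	order []string              // IDs in execution order
	items map[string]Middleware // ID -> middleware
}

func newStep() *step { return &step{items: map[string]Middleware{}} }

func (s *step) Add(m Middleware) {
	s.order = append(s.order, m.ID())
	s.items[m.ID()] = m
}

// decorated binds one middleware to the handler that follows it.
type decorated struct {
	next Handler
	with Middleware
}

func (d decorated) Handle(in string) string { return d.with.Handle(in, d.next) }

// build synthesizes the third structure (the ordered list) and wraps it
// in reverse so the first ID ends up outermost.
func (s *step) build(final Handler) Handler {
	ordered := make([]Middleware, 0, len(s.order))
	for _, id := range s.order {
		ordered = append(ordered, s.items[id])
	}
	h := final
	for i := len(ordered) - 1; i >= 0; i-- {
		h = decorated{next: h, with: ordered[i]}
	}
	return h
}

// tag is a toy middleware that appends its ID to the input.
type tag string

func (t tag) ID() string { return string(t) }
func (t tag) Handle(in string, next Handler) string {
	return next.Handle(in + string(t) + ">")
}

// terminal is the innermost handler.
type terminal struct{}

func (terminal) Handle(in string) string { return in + "done" }

func run() string {
	s := newStep()
	s.Add(tag("a"))
	s.Add(tag("b"))
	s.Add(tag("c"))
	return s.build(terminal{}).Handle("")
}

func main() {
	fmt.Println(run()) // a>b>c>done
}
```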

The act of doing that across all 5 steps results in a flat block of allocation/memory use that accounts for roughly 10% of the space and object allocations in a typical operation:

image

Instead of doing all that, we can skip those structures entirely and "decorate" the middleware stack in real time as Add/Swap/etc. are called, treating the phase as a linked list of sorts.
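A minimal sketch of the linked-list idea, under some simplifying assumptions: each node is already a decorated handler, so there is no ID slice, no map, and no build pass. The `node`, `step`, `InsertBefore`, `tag`, and `terminal` names are hypothetical, and the final handler is fixed up front here purely to keep the example short.

```go
package main

import "fmt"

type Handler interface {
	Handle(in string) string
}

type Middleware interface {
	ID() string
	Handle(in string, next Handler) string
}

// node is one link: a middleware already bound to whatever follows it.
type node struct {
	Next Handler
	With Middleware
}

func (n *node) Handle(in string) string { return n.With.Handle(in, n.Next) }

// step is the list itself. final is the innermost handler (an assumption
// of this sketch; a real stack would receive it at execution time).
type step struct {
	head, tail *node
	final      Handler
}

// Add appends m by linking it in behind the current tail.
func (s *step) Add(m Middleware) {
	n := &node{Next: s.final, With: m}
	if s.tail == nil {
		s.head, s.tail = n, n
		return
	}
	s.tail.Next = n
	s.tail = n
}

// InsertBefore splices m in front of the node whose ID is relativeTo.
func (s *step) InsertBefore(m Middleware, relativeTo string) error {
	var prev *node
	for h := s.head; h != nil; {
		if h.With.ID() == relativeTo {
			n := &node{Next: h, With: m}
			if prev == nil {
				s.head = n
			} else {
				prev.Next = n
			}
			return nil
		}
		prev = h
		next, ok := h.Next.(*node)
		if !ok {
			break // reached the final handler without a match
		}
		h = next
	}
	return fmt.Errorf("middleware %q not found", relativeTo)
}

func (s *step) Handle(in string) string {
	if s.head == nil {
		return s.final.Handle(in)
	}
	return s.head.Handle(in)
}

type tag string

func (t tag) ID() string { return string(t) }
func (t tag) Handle(in string, next Handler) string {
	return next.Handle(in + string(t) + ">")
}

type terminal struct{}

func (terminal) Handle(in string) string { return in + "done" }

func run() (string, error) {
	s := &step{final: terminal{}}
	s.Add(tag("a"))
	s.Add(tag("c"))
	if err := s.InsertBefore(tag("b"), "c"); err != nil {
		return "", err
	}
	return s.Handle(""), s.InsertBefore(tag("x"), "missing")
}

func main() {
	out, err := run()
	fmt.Println(out, err)
}
```

In this sketch the only per-middleware cost is the node itself, which is the same wrapper the build pass would otherwise create later.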

If you re-profile after making this change, the benefit is immediate: notice how we've gone from 5 sub-blocks in NewStack to 4 (well, 5, but one is very small) because the entire slice/map overhead is just gone.

image

So in theory, if we apply this change across all steps, we're dropping 10% of allocations/memory use that we've carried since the SDK went GA.

TODO:

  • Profile the speed of this.
  • Try and make this work with generics, otherwise we'll have to copy this code for each of the 5 phases. It may not be possible though.
    • It is in fact not possible. With generics, each phase has to be adapted at least twice, which deepens the call stack even further and nets more allocations overall. The potential improvement is too large to pass up, so we're now code-generating each of these phases instead.
  • Write unit tests against this new code, then port those tests back to the release and use them to eventually verify this behavior.

@lucix-aws lucix-aws requested review from a team as code owners November 13, 2025 18:32
@lucix-aws lucix-aws marked this pull request as draft November 13, 2025 18:32
return s.ids.Insert(m, relativeTo, pos)
var prev, found *decoratedInitializeHandler
for h := s.head; h != nil; {
if h.With.ID() == relativeTo {
Contributor

Good catch changing that check from m.ID to relativeTo.

@lucix-aws
Contributor Author

Re-profiled using https://github.com/lucix-aws/aws-sdk-go-v2-allocations post-codegen.

Before: (typically ~340-345)
image

After (this is the best run I could get, typically ~305-310):
image

Amusingly, the addOperationMiddlewares portion grows back up to constitute the same 20% of the overhead, even though the overall total has dropped.

Cumulative space also drops from ~32 GiB to ~27 GiB, which makes sense since we're getting rid of 2 slices and a map.
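For a quick local sanity check of this kind of difference, one option is the standard library's `testing.AllocsPerRun`. The sketch below compares two toy builders, one with a slice plus a map and one with linked nodes. Both builders and the `sink` variable are hypothetical and far simpler than the SDK's stack, so the numbers they produce say nothing about the real profile above.

```go
package main

import (
	"fmt"
	"testing"
)

// legacyStep mimics the slice-of-IDs plus map layout (toy version).
type legacyStep struct {
	order []string
	items map[string]func()
}

// node mimics the linked layout (toy version).
type node struct {
	next *node
	fn   func()
}

// sink forces the built values to escape to the heap so that the
// compiler cannot optimize the allocations away.
var sink any

func buildLegacy(ids []string) {
	s := &legacyStep{items: make(map[string]func(), len(ids))}
	for _, id := range ids {
		s.order = append(s.order, id)
		s.items[id] = func() {}
	}
	sink = s
}

func buildLinked(ids []string) {
	var head *node
	for range ids {
		head = &node{next: head, fn: func() {}}
	}
	sink = head
}

// measure reports the average number of allocations per build.
func measure() (legacy, linked float64) {
	ids := []string{"a", "b", "c", "d"}
	legacy = testing.AllocsPerRun(100, func() { buildLegacy(ids) })
	linked = testing.AllocsPerRun(100, func() { buildLinked(ids) })
	return legacy, linked
}

func main() {
	legacy, linked := measure()
	fmt.Printf("legacy: %.0f allocs/run, linked: %.0f allocs/run\n", legacy, linked)
}
```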

// *initializeWrapHandler, make sure to check for that
if hnext, ok := h.Next.(*decoratedInitializeHandler); ok {
h = hnext
}
Contributor

Do we need an else block here to break out after the tail? It seems like line 252 doesn't break if tail.Next is an *initializeWrapHandler.
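One possible alternative to an explicit else/break, sketched under assumptions: move the advance into the loop's post statement through a helper that returns nil once the next handler is no longer a decorated node. The `decoratedHandler`, `wrapHandler`, `next`, and `find` names below are reduced, hypothetical stand-ins for the types in this diff, not the PR's actual code.

```go
package main

import "fmt"

type handler interface{ Handle() }

type middleware interface{ ID() string }

// decoratedHandler is a reduced stand-in for a list node.
type decoratedHandler struct {
	Next handler
	With middleware
}

func (d *decoratedHandler) Handle() {}

// wrapHandler is a stand-in for the non-node handler that can follow the tail.
type wrapHandler struct{}

func (wrapHandler) Handle() {}

type id string

func (i id) ID() string { return string(i) }

// next returns the following node, or nil when h.Next is not a node.
// A failed type assertion yields the zero value, which is a nil pointer.
func next(h *decoratedHandler) *decoratedHandler {
	n, _ := h.Next.(*decoratedHandler)
	return n
}

// find walks the list and always terminates, because the loop variable
// becomes nil as soon as the chain leaves the decorated nodes.
func find(head *decoratedHandler, want string) *decoratedHandler {
	for h := head; h != nil; h = next(h) {
		if h.With.ID() == want {
			return h
		}
	}
	return nil
}

func main() {
	tail := &decoratedHandler{Next: wrapHandler{}, With: id("b")}
	head := &decoratedHandler{Next: tail, With: id("a")}
	fmt.Println(find(head, "b") == tail, find(head, "missing") == nil)
}
```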
