PoC: convert middleware steps to linked lists #617
+1,232
−223


[DRAFT]
Relates aws/aws-sdk-go-v2#3197
Middleware building today requires that we allocate two structures per step: a slice of IDs that maintains order, and a mapping of IDs to middlewares. Then at execution time, we build a third structure from those two: we synthesize an ordered list of middlewares and wrap them in reverse to create the final decorated step.
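For reference, here is a rough sketch of the layout described above. It is not the actual smithy-go code: `Handler`, `Middleware`, `orderedStep`, and `build` are simplified, hypothetical stand-ins that only show where the three structures come from.

```go
package main

// Handler and Middleware are simplified stand-ins (assumptions), not the
// real step-specific interfaces.
type Handler func(in string) string

type Middleware struct {
	ID     string
	Handle func(in string, next Handler) string
}

// orderedStep mirrors the two structures allocated per step today.
type orderedStep struct {
	order []string              // IDs, in execution order
	byID  map[string]Middleware // ID -> middleware
}

func newOrderedStep() *orderedStep {
	return &orderedStep{byID: map[string]Middleware{}}
}

// Add appends a middleware to the end of the step.
func (s *orderedStep) Add(m Middleware) {
	s.order = append(s.order, m.ID)
	s.byID[m.ID] = m
}

// build synthesizes the third structure (an ordered list) and then wraps
// the final handler in reverse so the first middleware ends up outermost.
func (s *orderedStep) build(final Handler) Handler {
	ordered := make([]Middleware, 0, len(s.order))
	for _, id := range s.order {
		ordered = append(ordered, s.byID[id])
	}
	h := final
	for i := len(ordered) - 1; i >= 0; i-- {
		m, next := ordered[i], h
		h = func(in string) string { return m.Handle(in, next) }
	}
	return h
}
```

Every call to `build` allocates the ordered slice plus one closure per middleware, on top of the slice and map held by the step itself.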
Doing that across all 5 steps results in a flat block of allocation/memory use that accounts for roughly 10% of the space and object allocations in a typical operation.
Instead of doing all that, we can "decorate" the middleware stack in real time as Add/Swap/etc. are called, by treating each phase as a linked list of sorts.
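A minimal sketch of the linked-list idea, under the same simplifying assumptions. `node`, `linkedStep`, and `run` are hypothetical names for illustration and are not taken from this PR's diff. The point is that ordering lives in the `next` pointers, so there is no ID slice, no map, and no build pass.

```go
package main

// Handler and Middleware are simplified stand-ins (assumptions), not the
// real step-specific interfaces.
type Handler func(in string) string

type Middleware struct {
	ID     string
	Handle func(in string, next Handler) string
}

// node is one link in the step; the list order is the execution order.
type node struct {
	mw   Middleware
	next *node
}

// linkedStep's zero value is an empty, ready-to-use step.
type linkedStep struct {
	head, tail *node
}

// Add links a middleware onto the end of the step.
func (s *linkedStep) Add(m Middleware) {
	n := &node{mw: m}
	if s.tail == nil {
		s.head, s.tail = n, n
		return
	}
	s.tail.next = n
	s.tail = n
}

// Swap replaces the middleware with the given ID in place.
func (s *linkedStep) Swap(id string, m Middleware) bool {
	for n := s.head; n != nil; n = n.next {
		if n.mw.ID == id {
			n.mw = m
			return true
		}
	}
	return false
}

// Handle runs the step by walking the list; nothing is built up front.
func (s *linkedStep) Handle(in string, final Handler) string {
	return run(s.head, in, final)
}

func run(n *node, in string, final Handler) string {
	if n == nil {
		return final(in)
	}
	return n.mw.Handle(in, func(in string) string {
		return run(n.next, in, final)
	})
}
```

This sketch still creates a closure per middleware while the step runs, so it only illustrates removing the slice/map bookkeeping, not every allocation. Lookup by ID becomes a linear walk, which is a reasonable trade for steps that hold a handful of middlewares.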
If you re-profile after making this change the benefit is immediate: notice how we've gone from 5 sub-blocks in NewStack to 4 (well 5 but one is very small) because the entire slice/map overhead is just gone.
So in theory, if we apply this change across all steps, we're dropping 10% of allocations/memory use that we've carried since the SDK went GA.
TODO:
- Profile the speed of this.
- Try to make this work with generics, otherwise we'll have to copy this code for each of the 5 phases. It may not be possible though.
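One possible starting point for the generics item is an ID-keyed list that is generic over the middleware type. This is only a sketch: `identified`, `list`, `listNode`, and `named` are hypothetical names, and it covers the shared ordering logic only. The per-phase handler signatures differ, which is the part that may not generalize.

```go
package main

// identified is an assumed constraint: anything exposing an ID.
type identified interface {
	ID() string
}

type listNode[M identified] struct {
	val  M
	next *listNode[M]
}

// list holds middlewares of one phase type in execution order.
type list[M identified] struct {
	head, tail *listNode[M]
}

// Add links m onto the end of the list.
func (l *list[M]) Add(m M) {
	n := &listNode[M]{val: m}
	if l.tail == nil {
		l.head, l.tail = n, n
		return
	}
	l.tail.next = n
	l.tail = n
}

// Get returns the middleware with the given ID, if present.
func (l *list[M]) Get(id string) (M, bool) {
	for n := l.head; n != nil; n = n.next {
		if n.val.ID() == id {
			return n.val, true
		}
	}
	var zero M
	return zero, false
}

// IDs lists the IDs in execution order.
func (l *list[M]) IDs() []string {
	var ids []string
	for n := l.head; n != nil; n = n.next {
		ids = append(ids, n.val.ID())
	}
	return ids
}

// named is a toy middleware type used only to exercise the list.
type named string

func (n named) ID() string { return string(n) }
```

Each phase would instantiate `list` with its own middleware interface and supply its own execution walk.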