- 
                Notifications
    You must be signed in to change notification settings 
- Fork 143
rework on poller auto scaler #1411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rework on poller auto scaler #1411
Conversation
82e6bb6    to
    636c433      
    Compare
  
    | Codecov ReportAttention: Patch coverage is  
 
 Continue to review full report in Codecov by Sentry. 
 🚀 New features to boost your workflow:
 | 
| to stick it in here too: overall looks pretty good. simpler and the overall goal (and why it achieves it) is clearer too. seems like just minor tweaks (many optional) and it's probably good to go | 
8438696    to
    69f7e3f      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, but I left some nits
        
          
                internal/internal_worker_base.go
              
                Outdated
          
        
      |  | ||
| if bw.pollerAutoScaler != nil { | ||
| if bw.concurrencyAutoScaler != nil { | ||
| if pErr := bw.concurrency.PollerPermit.Acquire(bw.limiterContext); pErr == nil { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this looks like a leaking abstraction. This should be handled inside concurrencyAutoScaler.
I suggest moving all
concurrencyAutoScaler != nil checks inside methods where it is required.
This code should be simpler. Just calling methods on autoscaler. If it is nil, do nothing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are guarding calls againts bw.concurrency based on nilness of bw.concurrencyAutoScaler which indicates that these two should be abstracted behind a single interface to avoid additional complexity in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've removed this check in all places. Regarding the comment hese two should be abstracted behind a single interface to avoid additional complexity in this file, I still think this is two separate entities. Client still needs concurrency whether autoscaler is enabled or not.
a9d3781    to
    52dd229      
    Compare
  
    1d733a5    to
    e7776be      
    Compare
  
            
          
                internal/internal_worker_base.go
              
                Outdated
          
        
      |  | ||
| if bw.pollerAutoScaler != nil { | ||
| if bw.concurrencyAutoScaler != nil { | ||
| if pErr := bw.concurrency.PollerPermit.Acquire(bw.limiterContext); pErr == nil { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are guarding calls againts bw.concurrency based on nilness of bw.concurrencyAutoScaler which indicates that these two should be abstracted behind a single interface to avoid additional complexity in this file
        
          
                internal/internal_worker_base.go
              
                Outdated
          
        
      | return t.autoConfigHint | ||
| default: | ||
| return nil | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of this switch case (which is not future proof), we can cast the task to autoConfigHintAwareTask interface and get the auto config hint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed this to use autoConfigHintAwareTask
| lowerPollerWaitTime = 16 * time.Millisecond | ||
| upperPollerWaitTime = 256 * time.Millisecond | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like we would want to iterate on these to adjust sensitivity. consider exposing these to worker config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The poller wait time is an invariant. User doesn't need to tune it. The sensitivity control (time-to-react) is actually controlled by the Cooldown which is already in the parameter
| }, | ||
| }, | ||
| { | ||
| "idl pollers waiting for tasks", | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: typo idle. same in other cases below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
| coverage failed due to deprecation changes | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dropping notes for now, while reading tests carefully 👍
overall looks pretty good I think - fairly easy to follow, behavior looks good (e.g. up to 4x growth when "instant", 0.5x shrink when slow, one scale change every 10 seconds sounds reasonable), everything's pretty close.
so just a small pile of minor stuff, some nits some not.
| autoScalerEventStart autoScalerEvent = "auto-scaler-start" | ||
| autoScalerEventStop autoScalerEvent = "auto-scaler-stop" | ||
| autoScalerEventLogMsg string = "concurrency auto scaler event" | ||
| testTimeFormat string = "15:04:05" | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| testTimeFormat string = "15:04:05" | 
| "busy pollers, scale up to maximum", | ||
| []*shared.AutoConfigHint{ | ||
| {common.PtrOf(true), common.PtrOf(int64(0))}, // <- tick, in cool down | ||
| {common.PtrOf(true), common.PtrOf(int64(0))}, // <- tick, scale down to minimum | ||
| }, | ||
| []eventLog{ | ||
| {autoScalerEventStart, false, 100, "00:00:00"}, | ||
| {autoScalerEventEnable, true, 100, "00:00:00"}, | ||
| {autoScalerEventPollerSkipUpdateCooldown, true, 100, "00:00:01"}, | ||
| {autoScalerEventPollerScaleUp, true, 200, "00:00:02"}, | ||
| {autoScalerEventStop, true, 200, "00:00:02"}, | ||
| }, | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be easier to follow the actual behavior of this one with a less-than-1/2-maximum set of values, e.g. start with 10 rather than 100 -> it won't scale to maximum, it'll scale to 42.
kinda similar for others below, e.g. pollers, scale up and down multiple times becomes:
{autoScalerEventStart, false, 10, "00:00:00"},
{autoScalerEventEnable, true, 10, "00:00:00"},
{autoScalerEventPollerSkipUpdateCooldown, true, 10, "00:00:01"},
{autoScalerEventPollerScaleUp, true, 42, "00:00:02"},
{autoScalerEventPollerSkipUpdateCooldown, true, 42, "00:00:03"},
{autoScalerEventPollerScaleDown, true, 25, "00:00:04"},
{autoScalerEventPollerSkipUpdateCooldown, true, 25, "00:00:05"},
{autoScalerEventPollerScaleUp, true, 104, "00:00:06"},
{autoScalerEventPollerSkipUpdateCooldown, true, 104, "00:00:07"},
{autoScalerEventPollerScaleDown, true, 63, "00:00:08"},
{autoScalerEventStop, true, 63, "00:00:08"},
which seems a bit more informative than "to max, down, back to max, back to same down value"
…ep backward compatibility and fix test cases and other comments
eda2abf    to
    7895fd4      
    Compare
  
    
Detailed Description
Improve performance of poller auto scaler by using more accurate scaling signals and several implementation changes.
Changes
AutoScalerOptionsis introduced.Impact Analysis
Testing Plan
Rollout Plan
What is the rollout plan?
For Uber services, standard client release steps
For OSS users, turn off autoscaler feature first before the client upgrade.
Does the order of deployment matter? No
Is it safe to rollback? Does the order of rollback matter? Yes
Is there a kill switch to mitigate the impact immediately? Yes, the new autoscaler feature is an opt-in feature.