
Missing scale due to possible miscalculation of metrics #1052

@KevinJCross


While looking at the failing tests ... it seems there is another issue besides dropped metric values and dropped metrics: the aggregated values can also be calculated incorrectly.
For an app with a single instance:

  name:              autoscaler-1-dynamic-policy-64a6916ee2a9c283
  requested state:   started
  routes:            autoscaler-1-dynamic-policy-64a6916ee2a9c283.autoscaler.app-runtime-interfaces.ci.cloudfoundry.org
  last uploaded:     Tue 15 Nov 09:44:17 GMT 2022
  stack:             cflinuxfs3
  buildpacks:        
  	name               version   detect output   buildpack name
  	nodejs_buildpack   1.8.1     nodejs          nodejs

  type:           web
  sidecars:       
  instances:      1/1
  memory usage:   128M
       state     since                  cpu    memory          disk             logging            details
  #0   running   2022-11-15T09:44:33Z   0.5%!M(MISSING) of 128M   166.6M of 200M   0/s of unlimited   
We get the correct metric for the first result, but after that the autoscaler seems to think there are 2 instances and halves the same result.
I've seen this happen intermittently on the failing acceptance tests.
  cf autoscaling-metrics autoscaler-1-dynamic-policy-64a6916ee2a9c283 responsetime: 
  Retrieving aggregated responsetime metrics for app autoscaler-1-dynamic-policy-64a6916ee2a9c283...
  Metrics Name     	Value      	Timestamp     	
  responsetime     	1507ms     	2022-11-15T09:49:16Z     	
  responsetime     	1507ms     	2022-11-15T09:48:36Z     	
  responsetime     	1506ms     	2022-11-15T09:47:56Z     	
  responsetime     	3011ms     	2022-11-15T09:47:16Z     	
  responsetime     	           	2022-11-15T09:46:36Z     	
  responsetime     	           	2022-11-15T09:45:56Z   

NOTE:

  • The bottom 2 rows have missing metric values.
  • Then we have one correct metric at 2022-11-15T09:47:16Z (3011ms).
  • Then we have 3 metrics at almost exactly half of the app's 3000ms response time (1506ms to 1507ms), unless the app really did respond that quickly. This seems to suggest that the autoscaler thinks there are 2 instances.
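
To make the arithmetic behind the last point concrete, here is a minimal Python sketch. It is not the autoscaler's actual aggregation code: `average_per_instance` and `classify` are hypothetical helpers, the 2ms tolerance is an arbitrary choice, and the sketch assumes the aggregated value is the sum of per-instance samples divided by an instance count.

```python
from typing import Optional, Sequence


def average_per_instance(samples_ms: Sequence[float], instance_count: int) -> Optional[float]:
    """Hypothetical helper: average samples over an assumed instance count."""
    if instance_count <= 0 or not samples_ms:
        return None  # no data, like the empty rows in the table above
    return sum(samples_ms) / instance_count


def classify(value_ms: float, single_ms: float, tolerance_ms: float = 2.0) -> str:
    """Label a value as matching the single-instance reading or half of it."""
    if abs(value_ms - single_ms) <= tolerance_ms:
        return "correct"
    if abs(value_ms - single_ms / 2) <= tolerance_ms:
        return "halved"
    return "other"


# Values copied from the `cf autoscaling-metrics` output above, newest first.
observed_ms = [1507, 1507, 1506, 3011]

# One running instance reporting 3011ms, divided by the real count of 1.
correct = average_per_instance([3011], instance_count=1)
# The same single sample divided by a wrongly assumed count of 2.
halved = average_per_instance([3011], instance_count=2)

labels = [classify(v, single_ms=3011) for v in observed_ms]

print(correct)  # 3011.0
print(halved)   # 1505.5
print(labels)   # ['halved', 'halved', 'halved', 'correct']
```

Under these assumptions, the three newest values are what a single sample of roughly 3000ms would look like if it were divided by 2. This is consistent with the suspicion above but does not prove where the wrong count comes from.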

This was noticed in an acceptance test run where the test app, configured for a 3s response time, did not scale.

https://concourse.app-runtime-interfaces.ci.cloudfoundry.org/teams/app-autoscaler/pipelines/app-autoscaler-release/jobs/acceptance/builds/45#L6[…]78
