Estimate scalability coefficient from past scaling history using linear regression #1

pchoudhury22 · 2025-04-30T15:27:04Z

What is the purpose of the change

Currently, target parallelism computation assumes perfect linear scaling. However, real-time workloads often exhibit nonlinear scalability due to factors like network overhead and coordination costs.

This change introduces an observed scalability coefficient, estimated using linear regression on past (parallelism, processing rate) data, to improve the accuracy of scaling decisions.

Brief change log

Implemented a dynamic scaling coefficient to compute target parallelism based on observed scalability. The system estimates the scalability coefficient using a least squares linear regression approach, leveraging historical (parallelism, processing rate) data.
The regression model minimises the sum of squared errors. The baseline processing rate is computed using the smallest observed parallelism in the history. Model details:

The Linear Model

We define a linear relationship between parallelism (P) and processing rate (R):

$$R_i = β α P_i$$

where:

R_i = actual processing rate for the i-th data point
P_i = parallelism for the i-th data point
β = base factor (constant scale factor)
α = scaling coefficient to optimize
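
To make the symbols concrete, here is a minimal Python sketch of the model. It is not the PR's code. It assumes β is the per-subtask rate at the smallest observed parallelism (rate divided by parallelism), which is one possible reading of "baseline processing rate"; the function names and sample history are hypothetical.

```python
def baseline_rate_per_subtask(history):
    """Return beta, assumed here to be rate / parallelism at the smallest
    observed parallelism. `history` is a list of (parallelism, rate) pairs."""
    p_min, r_min = min(history, key=lambda point: point[0])
    return r_min / p_min


def predicted_rate(parallelism, beta, alpha):
    """Model prediction: R_hat = beta * alpha * P."""
    return beta * alpha * parallelism


# Made-up history: throughput grows more slowly than parallelism.
history = [(2, 200.0), (4, 360.0), (8, 640.0)]
beta = baseline_rate_per_subtask(history)

# alpha = 1.0 corresponds to perfect linear scaling; alpha < 1.0 discounts it.
linear_prediction = predicted_rate(8, beta, alpha=1.0)
discounted_prediction = predicted_rate(8, beta, alpha=0.8)
```

With this sample history, the perfectly linear prediction at parallelism 8 overshoots the observed rate, which is the gap the coefficient α is meant to capture.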

Squared Error

The loss function to minimise is the sum of squared errors (SSE):

$$Loss = Σ (R_i - R̂_i)^2$$

Substituting ( R̂_i = (β α) P_i ):

$$Loss = Σ (R_i - β α P_i)^2$$
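
The loss above can be evaluated directly for any candidate α. The following Python sketch is illustrative only: `sse_loss`, the sample history, and the value of β are assumptions, not names or data from the PR.

```python
def sse_loss(history, beta, alpha):
    """Sum of squared errors between observed rates and beta * alpha * P."""
    return sum((rate - beta * alpha * p) ** 2 for p, rate in history)


# Made-up (parallelism, processing rate) observations.
history = [(2, 200.0), (4, 360.0), (8, 640.0)]
beta = 100.0  # assumed baseline rate per unit of parallelism

loss_linear = sse_loss(history, beta, alpha=1.0)
loss_discounted = sse_loss(history, beta, alpha=0.8)
```

For this sample, assuming perfect linear scaling (α = 1.0) produces a much larger loss than α = 0.8, so the regression would favour a coefficient below 1.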

Minimising the Error

Expanding ( (R_i - β α P_i)^2 ):

$$(R_i - β α P_i)^2 = R_i^2 - 2β α P_i R_i + (β α P_i)^2$$

Summing over all data points:

$$Loss = Σ (R_i^2 - 2β α P_i R_i + β^2 α^2 P_i^2)$$

Solving for α

To minimise the loss with respect to α, take the derivative and set it to zero:

$$dLoss/dα = Σ (-2β P_i R_i + 2β^2 α P_i^2) = 0$$

Solving for α gives:

$$α = (Σ P_i R_i) / (β Σ P_i^2)$$
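
The closed-form solution can be sketched in a few lines of Python. This is a hedged illustration, not the implementation in this PR: `estimate_alpha`, `sse_loss`, the sample history, and β are hypothetical, and the thresholding and minimum-coefficient handling mentioned in the commits are deliberately left out.

```python
def sse_loss(history, beta, alpha):
    """Sum of squared errors between observed rates and beta * alpha * P."""
    return sum((rate - beta * alpha * p) ** 2 for p, rate in history)


def estimate_alpha(history, beta):
    """Least-squares estimate: alpha = sum(P * R) / (beta * sum(P^2))."""
    sum_pr = sum(p * rate for p, rate in history)
    sum_pp = sum(p * p for p, _ in history)
    if beta <= 0 or sum_pp == 0:
        raise ValueError("need a positive beta and at least one non-zero parallelism")
    return sum_pr / (beta * sum_pp)


# Made-up (parallelism, processing rate) observations.
history = [(2, 200.0), (4, 360.0), (8, 640.0)]
beta = 100.0  # assumed baseline rate per unit of parallelism

alpha = estimate_alpha(history, beta)

# Sanity check: nudging alpha in either direction should not reduce the loss.
best_loss = sse_loss(history, beta, alpha)
loss_below = sse_loss(history, beta, alpha - 0.01)
loss_above = sse_loss(history, beta, alpha + 0.01)
```

Because the loss is a quadratic in α with a positive leading term, the stationary point found this way is the unique minimum, which the two perturbed losses confirm numerically.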

Verifying this change

New unit tests added to cover this

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., are there any changes to the CustomResourceDescriptors: no
Core observer or reconciler logic that is regularly executed: no

You removed so much material that I had to do it myself. Can you confirm that I have accurately removed all traces of weighting above?


pchoudhury22 · 2025-04-30T15:34:22Z

[test] run all


pchoudhury22 added 3 commits April 2, 2025 17:46

[FLINK-30571] Estimate scalability coefficient from past scaling hist…

6f01454

…ory using linear regression

1. Updated scaling coefficient compute logic to remove the weighting.…

ee9a953

… 2. Check scaling coefficient with threshold before returning. 3. Refactored tests for point [1] and [2].

1. Clamped lowerBound for scaling coefficient to 0.5

6385389

pchoudhury22 added 2 commits May 1, 2025 11:58

1. Adding config for min value of Scaling coefficient. 2. Updated val…

3adc6b3

…idator to validate the min scaling coefficient config.

Updating auto scaler configuration doc

dcd815b
