Commit 7afa840

Snapshot with tracking code.
1 parent 05e1758

File tree

1 file changed: +19 -23 lines changed

cas_client/src/adaptive_concurrency/controller.rs

Lines changed: 19 additions & 23 deletions
@@ -44,32 +44,28 @@ impl ConcurrencyControllerState {
     }
 }
 
-/// A controller for robustly adjusting the amount of concurrency on upload and download paths.
+/// A controller for dynamically adjusting the amount of concurrency on upload and download paths.
 ///
-/// By default, the controller dynamically adjusts the concurrency within bounds so that between 70%
-/// and 90% of the transfers are completed within 20 seconds; it increases the concurrency as long as
-/// this criterion is met. When more than 20% of the transfers begin taking longer than that, concurrency
-/// is reduced. Concurrency adjustments are throttled so that increasing the concurrency happens only every
-/// 500ms, and decreasing it happens at most once every 250ms. (These values are all defaults; see
-/// the constants and their definitions in constants.rs.)
-///
-/// More formally:
-///
-/// A "success" is a transfer that completed successfully within a specified amount of time that
-/// is determined by the size of the transfer.
-/// - For 64MB uploads and downloads, this is defined as completion within
-///   CONCURRENCY_CONTROL_TARGET_TIME_LARGE_TRANSFER_MS.
-/// - For 0B transfers, this is defined as CONCURRENCY_CONTROL_TARGET_TIME_SMALL_TRANSFER_MS.
-/// - The expected time is scaled linearly between these two endpoints based on size.
+/// This controller uses two statistical models that adapt over time using exponentially weighted
+/// moving averages. The first is a model that predicts the overall current bandwidth, and the second is
+/// a model of how many transfers complete within the predicted time.
+///
+/// The key insight is this:
+/// 1. When a network connection is underutilized, the latency scales sublinearly with the number of parallel
+///    connections. In other words, adding another transfer does not affect the speed of the other transfers
+///    significantly.
+/// 2. When a network connection is fully utilized, then the latency scales linearly with the concurrency. In other
+///    words, increasing the concurrency from N to N+1 would cause the latency of all the other transfers to
+///    increase by a factor of (N+1) / N.
+/// 3. When a network connection is oversaturated, then the latency scales superlinearly -- in other words, adding an
+///    additional connection causes the overall throughput to decrease.
+///
+/// Now, because latency is a noisy observation, we track a running clipped average of the deviance between
+/// the predicted time and the actual time, and increase the concurrency when this is reliably sublinear and
+/// decrease it when it is superlinear. The average is clipped to avoid having a single observation weight it
+/// too heavily; failures and retries max out the deviance.
 ///
-/// The last CONCURRENCY_CONTROL_TRACKING_SIZE successes or failures are tracked to estimate the
-/// success_ratio, and only events within CONCURRENCY_CONTROL_TRACKING_WINDOW_MS are considered. A
-/// retry attempt is counted as a failure.
 ///
-/// When a transfer is completed, the concurrency is updated based on the recent success_ratio and
-/// whether that transfer was a success. However, increases are made at most once every
-/// CONCURRENCY_CONTROL_MIN_INCREASE_WINDOW_MS and decreases at most once every
-/// CONCURRENCY_CONTROL_MIN_DECREASE_WINDOW_MS.
 pub struct AdaptiveConcurrencyController {
     // The current state, including tracking information and when previous adjustments were made.
     // Also holds related constants
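The linear-scaling point in the new doc comment (on a saturated link shared equally, going from N to N+1 transfers stretches per-transfer latency by (N+1)/N) can be illustrated numerically. This sketch is not code from the commit; `transfer_time_secs` is a hypothetical helper:

```rust
/// Expected per-transfer completion time when `n` equal-sized transfers
/// share a saturated link of `bandwidth` bytes per second.
fn transfer_time_secs(size_bytes: f64, n: u32, bandwidth: f64) -> f64 {
    // On a fully utilized link, each transfer effectively gets bandwidth / n.
    size_bytes / (bandwidth / n as f64)
}

fn main() {
    let size = 64.0 * 1024.0 * 1024.0; // a 64MB transfer
    let bw = 128.0 * 1024.0 * 1024.0; // a fully utilized 128MB/s link
    let t8 = transfer_time_secs(size, 8, bw); // 4.0s
    let t9 = transfer_time_secs(size, 9, bw); // 4.5s
    // Raising concurrency from N to N+1 scales per-transfer latency by (N+1)/N.
    assert!((t9 / t8 - 9.0 / 8.0).abs() < 1e-9);
    println!("t8 = {t8:.2}s, t9 = {t9:.2}s");
}
```

In the sublinear (underutilized) regime this model does not apply, which is exactly the asymmetry the controller exploits when deciding whether to grow or shrink concurrency.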

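The clipped-EWMA deviance tracking that the new doc comment describes could be sketched roughly as below. All names and constants here (`DevianceTracker`, the smoothing factor, the 0.9/1.1 thresholds) are hypothetical illustrations, not taken from controller.rs:

```rust
/// Exponentially weighted moving average of the ratio between actual and
/// predicted transfer time ("deviance"); 1.0 means "exactly as predicted".
struct DevianceTracker {
    ewma: f64,  // running deviance estimate
    alpha: f64, // EWMA smoothing factor
    clip: f64,  // per-observation cap, so one outlier cannot dominate
}

impl DevianceTracker {
    fn new(alpha: f64, clip: f64) -> Self {
        Self { ewma: 1.0, alpha, clip }
    }

    /// Fold one observation in; failures and retries count as max deviance.
    fn record(&mut self, actual_ms: f64, predicted_ms: f64, failed: bool) {
        let deviance = if failed {
            self.clip
        } else {
            (actual_ms / predicted_ms).min(self.clip)
        };
        self.ewma = self.alpha * deviance + (1.0 - self.alpha) * self.ewma;
    }

    /// Reliably sublinear (fast) => grow concurrency;
    /// superlinear (slow) => shrink, but never below 1.
    fn adjust(&self, concurrency: usize) -> usize {
        if self.ewma < 0.9 {
            concurrency + 1
        } else if self.ewma > 1.1 {
            concurrency.saturating_sub(1).max(1)
        } else {
            concurrency
        }
    }
}

fn main() {
    let mut t = DevianceTracker::new(0.2, 4.0);
    // Transfers finishing well under the predicted time pull the EWMA down...
    for _ in 0..20 {
        t.record(500.0, 1000.0, false);
    }
    assert!(t.ewma < 0.9);
    assert_eq!(t.adjust(8), 9);
    // ...while failures max out the deviance and push it back up.
    for _ in 0..20 {
        t.record(0.0, 1000.0, true);
    }
    assert!(t.ewma > 1.1);
    assert_eq!(t.adjust(8), 7);
}
```

The clipping is what keeps a single pathological observation (a stalled retry, a cold connection) from swinging the estimate, matching the comment's note that failures and retries simply max out the deviance.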