@@ -1133,7 +1133,7 @@ arrive to that queue and crowd out other queues for an arbitrarily
long time. To mitigate this problem, the implementation has a special
step that effectively prevents `t_dispatch_virtual` of the next
request to dispatch from dropping below the current time. But that
- solves only half of the problem. Other queueus may accumulate a
+ solves only half of the problem. Other queues may accumulate a
corresponding deficit (inappropriately large values for
`t_dispatch_virtual` and `t_finish_virtual`). Such a queue can have
an arbitrarily long burst of inappropriate lossage to other queues.
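
To make the clamping step above concrete, here is a minimal Go sketch under assumed names: the `queue` type, its fields, and `clampToNow` are hypothetical stand-ins for illustration, not the actual APF implementation.

```
// Minimal sketch of the mitigation described above. The `queue` type and
// field names are hypothetical, not the real APF data structures.
package sketch

type queue struct {
	tDispatchVirtual float64 // virtual time at which the next request may dispatch
	tFinishVirtual   float64 // virtual time at which that request would finish
}

// clampToNow prevents a queue that sat idle from dispatching with a virtual
// time far in the past, which would let it crowd out other queues.
func clampToNow(q *queue, virtualNow float64) {
	if q.tDispatchVirtual < virtualNow {
		q.tDispatchVirtual = virtualNow
	}
	// Note: this does not address the symmetric case described above, where
	// other queues accumulated inappropriately large virtual times (a deficit).
}
```
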
@@ -1201,7 +1201,7 @@ of available resources), a single request should consume no more than A
concurrency units. Fortunately that all compiles together because the
`processing latency` of the LIST request is actually proportional to the
number of processed objects, so the cost of the request (defined above as
- `<width> x <processing latency>` really is proportaional to the number of
+ `<width> x <processing latency>` really is proportional to the number of
processed objects as expected.

For RAM the situation is actually different. In order to process a LIST
@@ -1220,7 +1220,7 @@ where N is the number of items a given LIST request is processing.

The question is how to combine them to a single number. While the main goal
is to stay on the safe side and protect from the overload, we also want to
- maxiumize the utilization of the available concurrency units.
+ maximize the utilization of the available concurrency units.
Fortunately, when we normalize CPU and RAM to percentage of available capacity,
it appears that almost all requests are much more cpu-intensive. Assuming
4GB:1CPU ratio and 10kB average object and the fact that processing larger
@@ -1234,7 +1234,7 @@ independently, which translates to the following function:
```
We're going to better tune the function based on experiments, but based on the
above back-of-envelope calculations showing that memory should almost never be
- a limiting factor we will apprximate the width simply with:
+ a limiting factor we will approximate the width simply with:
```
width_approx(n) = min(A, ceil(N / E)), where E = 1 / B
```
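
For concreteness, the approximation above translates directly into a small Go sketch. The concrete values chosen below for A (the per-request cap on concurrency units) and E = 1/B (objects one concurrency unit can handle) are illustrative placeholders only, not values fixed by this proposal.

```
// Sketch of width_approx(n) = min(A, ceil(N / E)). The constants are
// illustrative placeholders, not values mandated by the proposal.
package sketch

const (
	maxWidthA       = 10  // A: cap on concurrency units a single request may take
	objectsPerSeatE = 100 // E = 1/B: objects one concurrency unit can handle
)

func widthApprox(n int) int {
	// ceil(n / E) without floating point.
	w := (n + objectsPerSeatE - 1) / objectsPerSeatE
	if w < 1 {
		w = 1 // assumption: even an empty LIST occupies at least one seat
	}
	if w > maxWidthA {
		w = maxWidthA
	}
	return w
}
```
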
@@ -1267,7 +1267,7 @@ the virtual world for `additional latency`.
Adjusting virtual time of a queue to do that is trivial. The other thing
to tweak is to ensure that the concurrency units will not get available
for other requests for that time (because currently all actions are
- triggerred by starting or finishing some request). We will maintain that
+ triggered by starting or finishing some request). We will maintain that
possibility by wrapping the handler into another one that will be sleeping
for `additional latency` after the request is processed.

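The wrapping described above could look roughly like the sketch below; `withAdditionalLatency` and the way the extra latency is passed in are assumptions for illustration, and the real filter would hook into the APF machinery rather than a bare `http.Handler`.

```
// Rough sketch of keeping the request's seats busy for `additional latency`
// after the actual work is done. Names are illustrative only.
package sketch

import (
	"net/http"
	"time"
)

func withAdditionalLatency(h http.Handler, additionalLatency time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h.ServeHTTP(w, r)
		// Keep occupying the concurrency units for the extra virtual duration,
		// so other requests cannot yet be dispatched into those seats.
		time.Sleep(additionalLatency)
	})
}
```
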
@@ -1284,7 +1284,7 @@ requests. Now in order to start processing a request, it has to accumulate
The important requirement to recast now is fairness. As soon as a single
request can consume more units of concurrency, the fairness is
no longer about the number of requests from a given queue, but rather
- about number of consumed concurrency units. This justifes the above
+ about number of consumed concurrency units. This justifies the above
definition of adjusting the cost of the request to now be equal to
`<width> x <processing latency>` (instead of just `<processing latency>`).

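As a made-up numerical illustration of this accounting (the widths and latencies below are arbitrary), a single wide LIST and a burst of narrow requests can represent the same amount of consumed work:

```
cost(1 LIST, width=3, latency=2s)       = 3 x 2s       = 6 seat-seconds
cost(6 GETs, width=1, latency=1s each)  = 6 x (1 x 1s) = 6 seat-seconds
```

Under the recast fairness both queues are charged equally, even though one issued a single request and the other issued six.
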
@@ -1310,7 +1310,7 @@ modification to the current dispatching algorithm:
semantics of virtual time tracked by the queues to correspond to work,
instead of just wall time. That means when we estimate a request's
virtual duration, we will use `estimated width x estimated latency` instead
- of just estimated latecy. And when a request finishes, we will update
+ of just estimated latency. And when a request finishes, we will update
the virtual time for it with `seats x actual latency` (note that seats
will always equal the estimated width, since we have no way to figure out
if a request used less concurrency than we granted it).
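
A compact sketch of this bookkeeping is shown below; the `workQueue` type and the `onDispatch`/`onFinish` helpers are hypothetical names, and the real dispatcher tracks considerably more state than this.

```
// Hypothetical bookkeeping for work-based virtual time: charge the estimate
// on dispatch and correct it with the observed latency on completion.
package sketch

import "time"

type workQueue struct {
	virtualTime float64 // accumulated work, in seat-seconds
}

func (q *workQueue) onDispatch(estimatedWidth int, estimatedLatency time.Duration) {
	q.virtualTime += float64(estimatedWidth) * estimatedLatency.Seconds()
}

func (q *workQueue) onFinish(seats, estimatedWidth int, estimatedLatency, actualLatency time.Duration) {
	// Replace the estimate with seats x actual latency; seats always equals
	// the estimated width, since unused concurrency cannot be observed.
	q.virtualTime += float64(seats)*actualLatency.Seconds() -
		float64(estimatedWidth)*estimatedLatency.Seconds()
}
```
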
@@ -1348,7 +1348,7 @@ We will solve this problem by also handling watch requests by our priority
and fairness kube-apiserver filter. The queueing and admitting of watch
requests will be happening exactly the same as for all non-longrunning
requests. However, as soon as watch is initialized, it will be sending
- an articial `finished` signal to the APF dispatcher - after receiving this
+ an artificial `finished` signal to the APF dispatcher - after receiving this
signal the dispatcher will be treating the request as already finished (i.e.
the concurrency units it was occupying will be released and new requests
may potentially be immediately admitted), even though the request itself
@@ -1362,7 +1362,7 @@ The first question to answer is how we will know that watch initialization
has actually been done. However, the answer for this question is different
depending on whether the watchcache is on or off.

- In watchcache, the initialization phase is clearly separated - we explicily
+ In watchcache, the initialization phase is clearly separated - we explicitly
compute `init events` and process them. What we don't control at this level
is the process of serialization and sending out the events.
In the initial version we will ignore this and simply send the `initialization
@@ -1395,7 +1395,7 @@ LIST requests) adjust the `width` of the request. However, in the initial
version, we will just use `width=1` for all watch requests. In the future,
we are going to evolve it towards a function that will be better estimating
the actual cost (potentially somewhat similarly to how LIST requests are done)
- but we first need to a machanism to allow us experiment and tune it better.
+ but we first need to a mechanism to allow us experiment and tune it better.

#### Keeping the watch up-to-date

@@ -1425,7 +1425,7 @@ Let's start with an assumption that sending every watch event is equally
expensive. We will discuss how to generalize it below.

With the above assumption, a cost of a mutating request associated with
- sending watch events triggerred by it is proportional to the number of
+ sending watch events triggered by it is proportional to the number of
watchers that have to process that event. So let's describe how we can
estimate this number.

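One plausible, purely illustrative way to count those watchers is sketched below; the bucketing by resource and by individual object, and all type names, are assumptions rather than anything this design commits to.

```
// Hypothetical sketch of estimating how many watchers must process the events
// triggered by a single mutating request on a given object.
package sketch

type objectRef struct {
	resource, namespace, name string
}

// watcherCounts tracks, per bucket, how many watchers are interested in it.
type watcherCounts struct {
	byResource         map[string]int    // e.g. watchers of "all Endpoints"
	byNamespacedObject map[objectRef]int // e.g. watchers of a single object
}

func (w *watcherCounts) eventCost(obj objectRef) int {
	// Every watcher of the whole resource plus every watcher of this
	// particular object has to process the resulting watch event.
	return w.byResource[obj.resource] + w.byNamespacedObject[obj]
}
```
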
@@ -1468,7 +1468,7 @@ structure to avoid loosing too much information.
If we would have a hashing function that can combine only similar buckets
(e.g. it won't combine "all Endpoints" bucket with "pods from node X") then
we can simply write maximum from all entries that are hashed to the same value.
- This means that some costs may be overestimated, but if we resaonably hash
+ This means that some costs may be overestimated, but if we reasonably hash
requests originating by system components, that seems acceptable.
The above can be achieved by hashing each resource type to a separate set of
buckets, and within a resource type hashing (namespace, name) as simple as:
@@ -1489,7 +1489,7 @@ as whenever something quickly grows we report it, but we don't immediately
downscale which is a way to somehow incorporate a history.

However, we will treat the above as a feasibility proof. We will just start
- with the simplest apprach of treating each kube-apiserver independently.
+ with the simplest approach of treating each kube-apiserver independently.
We will implement the above (i.e. knowledge sharing between kube-apiservers),
if the independence assumption will not work well enough.
The above description shows that it won't result in almost any wasted work
0 commit comments