MB-29928: Implement auto controller logic for the defragmenter
With changes in 7.0 to memory tracking, we now have visibility of
an individual bucket's fragmentation, whereas pre 7.0 we only had
visibility of the entire process.
This commit makes use of the bucket fragmentation to calculate
the sleep interval of the defragger, the overall idea being that
as a bucket's fragmentation gets worse, the sleep time reduces.
The defragger is then running more frequently, visiting more items
and bringing the fragmentation down.
The commit introduces two new modes of automatic calculation. Two modes
are provided because the second, PID mode, is more experimental. Once it
has had some soak time, only one mode need remain in the code.
The two modes are as follows and can be selected in the bucket config
(a future patch makes them runtime switchable via cbepctl).
1) auto - Use a 'static' and predictable calculation for converting
fragmentation into a reduction in sleep time. (This mode is named
"auto_linear" in configuration.json.)
2) auto_pid - Use a PID controller to calculate reductions in
sleep time. This is less predictable as real time is a factor
in the calculation; scheduling delays etc. result in unpredictable
outputs.
The existing mode (just use defragmenter_interval) is named "static".
Both modes of auto controller work by taking the bucket fragmentation
as a percentage and then using the bucket's low-water mark to create
a 'score', which is then used to determine how the sleep interval
may be calculated. The result is that when fragmentation may be high
but rss is actually small (lots of headroom before the low-water mark)
the score is low, whilst as we approach the low-water mark the score
increases.
E.g.
fragmentation 23% (allocated:500, rss:650), then with a low-water
mark of n the value used in calculations (score):
n | score
600 | 23 (rss > low-water)
1000 | 14.95
2000 | 7.4
3000 | 4.98
5000 | 2.99
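
The scoring above can be sketched as follows. This is a minimal illustration, not the kv_engine code: the function name and the formula (fragmentation percentage scaled by rss / low-water mark, capped at 1) are assumptions inferred from the table. The table appears to round fragmentation to 23% before scaling, so the unrounded results here (roughly 15.0, 7.5, 5.0 and 3.0) differ slightly from the listed scores.

```python
def scored_fragmentation(allocated, rss, low_water_mark):
    """Return fragmentation (as a percentage) scaled by how close rss is
    to the low-water mark. Hypothetical helper; formula inferred from the
    example table, not taken from the kv_engine source."""
    if rss <= 0:
        return 0.0
    fragmentation_pct = 100.0 * (rss - allocated) / rss
    # Once rss reaches the low-water mark the raw fragmentation is used as-is.
    headroom_ratio = min(1.0, rss / low_water_mark)
    return fragmentation_pct * headroom_ratio


if __name__ == "__main__":
    for lwm in (600, 1000, 2000, 3000, 5000):
        print(lwm, round(scored_fragmentation(500, 650, lwm), 2))
```
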
A spreadsheet with numerous scenarios and the score can be found here:
https://docs.google.com/spreadsheets/d/1W72N2vbrfa5xOVFmS0e3tpFCcEyd8kPk8fqMNmuM1k8/edit#gid=0
auto: This mode takes the score and a range. Below the range the
maximum sleep is used; above the range the minimum sleep is used.
When the score is within the range we find how far through the range
the score is, e.g. 20%, and map that to the sleep time 20% of the way
from max sleep down to min sleep.
Here the following configuration parameters are being used:
* defragmenter_auto_min_sleep 0.0
* defragmenter_auto_max_sleep 10.0
* defragmenter_auto_lower_threshold 0.07
* defragmenter_auto_upper_threshold 0.25
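
The mapping described for this mode can be sketched as a linear interpolation. This is an illustration under stated assumptions rather than the committed implementation: the function name is hypothetical, and the score is assumed to be expressed as a fraction so that it is comparable with the 0.07 and 0.25 thresholds.

```python
def auto_linear_sleep(score, lower=0.07, upper=0.25,
                      min_sleep=0.0, max_sleep=10.0):
    """Map a scored fragmentation onto a sleep time in seconds.
    Defaults mirror the configuration values listed above; the function
    itself is a hypothetical sketch."""
    if score <= lower:
        return max_sleep          # below the range: sleep as long as allowed
    if score >= upper:
        return min_sleep          # above the range: sleep as little as allowed
    # Fraction of the way through [lower, upper], e.g. 0.2 for 20%.
    position = (score - lower) / (upper - lower)
    # Higher score means a shorter sleep, so move from max towards min.
    return max_sleep - position * (max_sleep - min_sleep)


if __name__ == "__main__":
    for s in (0.05, 0.106, 0.16, 0.30):
        print(s, auto_linear_sleep(s))
```

With the defaults, a score midway between the thresholds (0.16) gives a sleep of about 5 seconds.
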
auto_pid: This mode uses a single configurable threshold and when the
score exceeds that threshold the PID calculates an output. The returned
sleep time is the maximum - output, but capped at the configuration
minimum. The PID itself is configured at runtime and the commit uses
values for P, I, D and dt based on examination of the "pathogen"
performance test and use of the `pid_runner` program which allows for
some examination of P, I and D. The assumption is that fragmentation
doesn't increase quickly, hence the I and dt terms force the PID to
only recalculate every 30 seconds (the configured dt) with a 'slow'
output.
Here the following configuration parameters are being used:
* defragmenter_auto_min_sleep 0.0
* defragmenter_auto_max_sleep 10.0
* defragmenter_auto_lower_threshold 0.07
* defragmenter_auto_pid_p 0.3
* defragmenter_auto_pid_i 0.0000197
* defragmenter_auto_pid_d 0.0
* defragmenter_auto_pid_dt 30000
These values have been used in the pid_runner test and were chosen based
on the observation that fragmentation in real workloads increases
slowly. The pathogen test is useful for testing defragmentation, but
may not be truly representative of real fragmentation growth. For
example, that test achieves fragmentation greater than 35% in a very
short time, but is operating on a small amount of data: mem_used
ranges from ~200MB to ~600MB.
First dt: Given the observation that fragmentation generally increases
slowly, the dt term controls the rate at which the PID reads the Process
Variable (PV, or in our case scored fragmentation) and reacts. Thus 30
seconds will elapse before the PID computes a new output value. If the
PV were changing at faster rates, the dt term would be reduced.
P I D values:
Using pid_runner (in its committed state) a number of scenarios were
compared where the PV is at a fixed percentage above the SP. These
scenarios guided the current values of P I and D.
For example when the PV is 1.1x of SP it would take the PID ~20 hours to
reduce the sleep interval to min (0.0).
When the PV is 2.6x of SP it would take the PID 75 minutes to reduce the
sleep interval to min (0.0).
PV x | time to min sleep
1.1 | 20h:8m:31s
1.2 | 10h:4m:31s
1.5 | 4h:1m:31s
1.8 | 2h:31m:1s
2.0 | 2h:1m:1s
2.3 | 1h:33m:1s
2.6 | 1h:15m:31s
2.9 | 1h:3m:31s
3.0 | 1h:0m:31s
3.3 | 0h:52m:31s
3.5 | 0h:48m:31s
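
The times in the table above can be approximated with a textbook PID loop. This is a sketch under stated assumptions, not the pid_runner or kv_engine code: the class and function names are hypothetical, the error is taken as PV - SP with SP = 0.07 (defragmenter_auto_lower_threshold), and the integral is accumulated as error * dt with dt in milliseconds. Under those assumptions the simulated times land within a second or so of the listed values.

```python
from dataclasses import dataclass


@dataclass
class PidSketch:
    """Hypothetical PID using the configuration defaults listed earlier."""
    set_point: float = 0.07      # defragmenter_auto_lower_threshold
    kp: float = 0.3              # defragmenter_auto_pid_p
    ki: float = 0.0000197        # defragmenter_auto_pid_i
    kd: float = 0.0              # defragmenter_auto_pid_d
    dt_ms: float = 30000.0       # defragmenter_auto_pid_dt
    min_sleep: float = 0.0
    max_sleep: float = 10.0
    integral: float = 0.0
    previous_error: float = 0.0

    def step(self, score):
        """Advance one dt interval and return the sleep time in seconds."""
        if score <= self.set_point:
            # Score is not above the threshold: reset and use max sleep.
            self.integral = 0.0
            self.previous_error = 0.0
            return self.max_sleep
        error = score - self.set_point
        self.integral += error * self.dt_ms
        derivative = (error - self.previous_error) / self.dt_ms
        self.previous_error = error
        output = (self.kp * error + self.ki * self.integral
                  + self.kd * derivative)
        # Sleep is max - output, capped at the configured minimum.
        return max(self.min_sleep, self.max_sleep - output)


def seconds_to_min_sleep(pv_multiple):
    """Simulated seconds until sleep reaches min for a PV fixed at
    pv_multiple * SP (pv_multiple must be greater than 1)."""
    pid = PidSketch()
    score = pv_multiple * pid.set_point
    steps = 1
    while pid.step(score) > pid.min_sleep:
        steps += 1
    return steps * pid.dt_ms / 1000.0


if __name__ == "__main__":
    for multiple in (1.1, 2.0, 3.5):
        print(multiple, seconds_to_min_sleep(multiple))
```

Because P is small and D is zero, the integral term dominates, which is why the time to reach min sleep is roughly inversely proportional to how far the PV sits above the SP.
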
A final note on the use of a PID. Typical use of a PID would be in
systems where the 'process variable' can be influenced in positive and
negative ways. E.g. a temperature could be controlled by heating or not
heating (or forced cooling). In our use-case we can influence
fragmentation down (by running the defragger), but we cannot raise
fragmentation to the set-point. i.e. our use of a PID cannot maintain
a level of fragmentation. This is why in the code, once the
fragmentation (score) drops below the lower threshold, the PID just
resets and the max sleep is used.
Change-Id: Ia67d789dc38e0c649d2e7cf8cea945f8f67b711e
Reviewed-on: http://review.couchbase.org/c/kv_engine/+/155961
Tested-by: Build Bot <[email protected]>
Reviewed-by: Dave Rigby <[email protected]>
engines/ep/configuration.json: 64 additions & 4 deletions
@@ -315,10 +315,9 @@
     "dcp_noop_mandatory_for_v5_features": {
         "default": "true",
         "descr": "Forces clients to enable noop for v5 features",
-        "dynamic": true,
+        "dynamic": true,
         "type": "bool"
     },
-
     "defragmenter_enabled": {
         "default": "true",
         "descr": "True if defragmenter task is enabled",
@@ -328,7 +327,7 @@
     "defragmenter_interval": {
         "default": "10.0",
         "descr": "How often defragmenter task should be run (in seconds).",
-        "dynamic": true,
+        "dynamic": true,
         "type": "float"
     },
     "defragmenter_age_threshold": {
@@ -346,14 +345,75 @@
     "defragmenter_chunk_duration": {
         "default": "20",
         "descr": "Maximum time (in ms) defragmentation task will run for before being paused (and resumed at the next defragmenter_interval).",
-        "dynamic": true,
+        "dynamic": true,
         "type": "size_t",
         "validator": {
             "range": {
                 "min": 1
             }
         }
     },
+    "defragmenter_mode" : {
+        "default": "auto_pid",
+        "descr": "Determines how the defragmenter controls its sleep interval. When static defragmenter_interval is used. When auto_linear, scale the sleep time using a scored defragmentation when it falls between defragmenter_auto_lower_trigger and defragmenter_auto_upper_trigger. When auto_pid use a PID controller to computer reductions in the sleep interval when scored fragmentation is above defragmenter_auto_lower_trigger.",
+        "dynamic": false,
+        "type": "std::string",
+        "validator": {
+            "enum": [
+                "static",
+                "auto_linear",
+                "auto_pid"
+            ]
+        }
+    },
+    "defragmenter_auto_lower_threshold" : {
+        "default": "0.07",
+        "descr": "When mode is not static and scored fragmentation is above this value, a sleep time between defragmenter_auto_min_sleep and defragmenter_auto_max_sleep will be used",
+        "dynamic": false,
+        "type": "float"
+    },
+    "defragmenter_auto_upper_threshold" : {
+        "default": "0.25",
+        "descr": "When mode is auto_linear and scored fragmentation is above this value, the defragmenter will use defragmenter_auto_min_sleep",
+        "dynamic": false,
+        "type": "float"
+    },
+    "defragmenter_auto_max_sleep" : {
+        "default": "10.0",
+        "descr": "The maximum sleep that the auto controller can set",
+        "dynamic": false,
+        "type": "float"
+    },
+    "defragmenter_auto_min_sleep" : {
+        "default": "0.0",
+        "descr": "The minimum sleep that the auto controller can set",
+        "dynamic": false,
+        "type": "float"
+    },
+    "defragmenter_auto_pid_p" : {
+        "default": "0.3",
+        "descr": "The p term for the PID controller",
+        "dynamic": false,
+        "type": "float"
+    },
+    "defragmenter_auto_pid_i" : {
+        "default": "0.0000197",
+        "descr": "The i term for the PID controller",
+        "dynamic": false,
+        "type": "float"
+    },
+    "defragmenter_auto_pid_d" : {
+        "default": "0.0",
+        "descr": "The d term for the PID controller",
+        "dynamic": false,
+        "type": "float"
+    },
+    "defragmenter_auto_pid_dt" : {
+        "default": "30000",
+        "descr": "The dt (interval) term for the PID controller. Value represents milliseconds",
+        "dynamic": false,
+        "type": "size_t"
+    },
     "durability_timeout_task_interval": {
         "default": "25",
         "descr": "Interval (in ms) between subsequent runs of the DurabilityTimeoutTask",