Commit 824bdef
storeliveness: smear storeliveness heartbeats
Previously, heartbeat messages were sent immediately when enqueued via
`EnqueueMessage`. In large clusters, many stores could send heartbeats
simultaneously on the same tick interval, producing a spike in runnable
goroutines that caused issues elsewhere in the database process.
This patch introduces heartbeat smearing logic that batches and smears
heartbeat sends over a configurable duration using a taskpacer. Smearing
is controlled by the `kv.store_liveness.heartbeat_smearing.enabled`
cluster setting (default: true) and can be tuned via the
`kv.store_liveness.heartbeat_smearing.refresh` (default: 10ms) and
`kv.store_liveness.heartbeat_smearing.smear` (default: 1ms) settings.
When enabled, messages are enqueued but not sent immediately. Instead,
they wait in per-node queues until `SendAllEnqueuedMessages()` is called,
which signals the coordinator. The coordinator waits briefly
(`batchDuration`) to collect messages from multiple stores, then paces the
signaling to each queue's `processQueue` goroutine using the pacer to
spread sends over time.
Fixes: #148210
Release note: None
1 parent 2b759bf commit 824bdef
6 files changed (+628, -88 lines):
- pkg
  - kv/kvserver
    - storeliveness
  - server
  - testutils/localtestcluster