Commit c67cbbe
Optimize seqlock and utilization watcher to prevent random 256MB allocation slowdowns
Root cause: 256MB allocations randomly ran about 20x slower (12.734ms vs 0.586ms)
when all 8 processes allocated simultaneously. Two issues:
1. Seqlock retry storm: When all 8 processes write to their slots, readers see
writers active (seqlock odd) and spin in a tight loop, causing CPU contention.
2. Utilization watcher contention: The utilization_watcher thread held lock_shrreg()
during slow NVML queries (nvmlDeviceGetComputeRunningProcesses,
nvmlDeviceGetProcessUtilization), blocking shared memory operations.
Fixes:
1. Seqlock exponential backoff:
- Removed stale data fallback (memory checks require accurate data)
- Progressive delays: CPU pause → 1μs → 10μs → 100μs
- Prevents tight spinning while ensuring accurate reads
2. Utilization watcher optimization:
- Moved NVML queries OUTSIDE lock_shrreg()
- Lock now only held briefly to update shared memory
- Reduces lock hold time from milliseconds to microseconds
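
The seqlock backoff in fix 1 can be sketched roughly as below. This is an illustrative reconstruction, not the code from this commit: the slot layout (slot_t), the helper names, and the attempt thresholds at which the delay steps up are all assumptions. Only the pause → 1μs → 10μs → 100μs progression and the absence of a stale-data fallback come from the message above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-process slot; the real shared-region layout is not shown here. */
typedef struct {
    atomic_uint seq;                  /* odd while a writer is mid-update */
    atomic_uint_fast64_t used_bytes;  /* payload protected by the seqlock */
} slot_t;

static void cpu_relax(void) {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();           /* x86 PAUSE hint for spin loops */
#else
    atomic_signal_fence(memory_order_seq_cst);  /* portable no-op-ish fallback */
#endif
}

static void sleep_us(long us) {
    assert(us > 0 && us < 1000000);
    struct timespec ts = { 0, us * 1000L };
    nanosleep(&ts, NULL);
}

/* Delay schedule: 0 means "CPU pause only". Thresholds are assumed values. */
static long backoff_delay_us(unsigned attempt) {
    if (attempt < 10) return 0;
    if (attempt < 20) return 1;
    if (attempt < 30) return 10;
    return 100;
}

static void slot_write(slot_t *s, uint64_t v) {
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);  /* mark busy (odd) */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->used_bytes, v, memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);  /* publish (even) */
}

/* Retries until a consistent snapshot is read; never returns stale data. */
static uint64_t slot_read(slot_t *s) {
    for (unsigned attempt = 0;; attempt++) {
        unsigned s0 = atomic_load_explicit(&s->seq, memory_order_acquire);
        if ((s0 & 1u) == 0) {
            uint64_t v = atomic_load_explicit(&s->used_bytes, memory_order_relaxed);
            atomic_thread_fence(memory_order_acquire);
            unsigned s1 = atomic_load_explicit(&s->seq, memory_order_relaxed);
            if (s0 == s1) return v;   /* no writer intervened */
        }
        long d = backoff_delay_us(attempt);
        if (d == 0) cpu_relax(); else sleep_us(d);
    }
}
```

The reader loops until it gets a clean snapshot, so accuracy is preserved; the backoff only changes how much CPU it burns while a writer is active.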
Impact: Should eliminate random 256MB allocation slowdowns by reducing
seqlock contention and utilization watcher blocking.
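
The lock-scope change in fix 2 follows a common pattern: gather slow results into a local buffer first, then take the lock only for the copy. A minimal sketch, assuming a pthread mutex as a stand-in for lock_shrreg()/unlock_shrreg() and a stub (query_utilization) in place of the real NVML calls; the struct and function names are hypothetical and do not come from this commit.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define MAX_PROCS 8

typedef struct { uint32_t pid; uint32_t sm_util; } proc_sample_t;

/* Hypothetical shared region; the mutex stands in for lock_shrreg(). */
typedef struct {
    pthread_mutex_t lock;
    proc_sample_t procs[MAX_PROCS];
    int count;
} shared_region_t;

/* Stub for the slow NVML queries; fills at most `cap` fake samples. */
static int query_utilization(proc_sample_t *out, int cap) {
    int n = cap < 2 ? cap : 2;
    for (int i = 0; i < n; i++) {
        out[i].pid = 1000u + (uint32_t)i;
        out[i].sm_util = 10u * (uint32_t)(i + 1);
    }
    return n;
}

static void watcher_tick(shared_region_t *r) {
    proc_sample_t local[MAX_PROCS];

    /* Slow part runs with no lock held, so allocators are not blocked. */
    int n = query_utilization(local, MAX_PROCS);
    assert(n >= 0 && n <= MAX_PROCS);

    /* Lock is held only for a short memcpy into shared memory. */
    pthread_mutex_lock(&r->lock);
    memcpy(r->procs, local, (size_t)n * sizeof local[0]);
    r->count = n;
    pthread_mutex_unlock(&r->lock);
}
```

The trade-off is that the published samples can be slightly older than the moment the lock is taken, which is acceptable for periodic utilization reporting.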
Signed-off-by: Nishit Shah <nish511@gmail.com>
1 parent 46399f2 commit c67cbbe
2 files changed in src/multiprocess (+45, -31 lines)