Commit 8d555e8

mattmattox and claude committed
fix(network): raise overlay-test CPU limit and latency thresholds
The overlay-test pod's 10m CPU limit caused CFS throttling (88% of scheduling periods), inflating HTTP probe latency from ~3ms to 300-500ms on ARM/Pi nodes. This triggered false-positive NetworkDegraded alerts.

- Bump overlay-test CPU: 1m/10m -> 5m/100m (eliminates CFS throttling)
- Raise warning latency: 50ms -> 200ms (HTTP probing baseline on ARM)
- Raise critical latency: 200ms -> 500ms

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 7dc491f commit 8d555e8
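
The throttling described in the commit message follows from how a CPU limit becomes a CFS budget. The sketch below works through that arithmetic, assuming the default 100ms CFS period and a linear mapping from millicores to quota. The helper `quotaMicros` is hypothetical and is not part of node-doctor.

```go
package main

import "fmt"

// Assumption: the default CFS period of 100ms (100,000 microseconds).
const cfsPeriodMicros = 100_000

// quotaMicros is a hypothetical helper: it converts a CPU limit in
// millicores into the CPU time (in microseconds) a container may use
// per CFS period before it is throttled until the next period.
func quotaMicros(milliCPU int64) int64 {
	return milliCPU * cfsPeriodMicros / 1000
}

func main() {
	// Old limit (10m) and new limit (100m) from this commit.
	for _, limit := range []int64{10, 100} {
		fmt.Printf("%dm limit -> %dus of CPU per %dus period\n",
			limit, quotaMicros(limit), cfsPeriodMicros)
	}
}
```

Under these assumptions the old 10m limit allows about 1ms of CPU time per 100ms period, so a probe that needs more than that waits for the next period. The new 100m limit allows about 10ms per period.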

3 files changed, +6 -6 lines changed

helm/node-doctor/templates/configmap.yaml

Lines changed: 2 additions & 2 deletions
@@ -101,8 +101,8 @@ data:
 probePath: /healthz
 pingCount: 3
 pingTimeout: 5s
-warningLatency: 50ms
-criticalLatency: 200ms
+warningLatency: 200ms
+criticalLatency: 500ms
 failureThreshold: 3
 minReachablePeers: 80
 cniHealth:

helm/node-doctor/values.yaml

Lines changed: 2 additions & 2 deletions
@@ -400,10 +400,10 @@ overlayTest:
   # Resource limits for overlay test pods
   resources:
     requests:
-      cpu: 1m
+      cpu: 5m
       memory: 8Mi
     limits:
-      cpu: 10m
+      cpu: 100m
       memory: 16Mi
 
   # Update strategy

pkg/monitors/network/cni.go

Lines changed: 2 additions & 2 deletions
@@ -16,8 +16,8 @@ const (
 // Default configuration values for CNI monitor
 defaultCNIPingCount = 3
 defaultCNIPingTimeout = 5 * time.Second
-defaultCNIWarningLatency = 50 * time.Millisecond
-defaultCNICriticalLatency = 200 * time.Millisecond
+defaultCNIWarningLatency = 200 * time.Millisecond
+defaultCNICriticalLatency = 500 * time.Millisecond
 defaultCNIFailureThreshold = 3
 defaultCNIMinReachablePeers = 80 // percentage
 defaultCNIRefreshInterval = 5 * time.Minute
