You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: doc/command-line-flags.md
+10Lines changed: 10 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -43,6 +43,16 @@ password=123456
43
43
44
44
See `exact-rowcount`
45
45
46
+
### critical-load-interval-millis
47
+
48
+
`--critical-load` defines a threshold that, when met, `gh-ost` panics and bails out. The default behavior is to bail out immediately when meeting this threshold.
49
+
50
+
This may sometimes lead to migrations bailing out on a very short spike, that, while in itself is impacting production and is worth investigating, isn't reason enough to kill a 10 hour migration.
51
+
52
+
When `--critical-load-interval-millis` is specified (e.g. `--critical-load-interval-millis=2500`), `gh-ost` gives a second chance: when it meets `critical-load` threshold, it doesn't bail out. Instead, it starts a timer (in this example: `2.5` seconds) and re-checks `critical-load` when the timer expires. If `critical-load` is met again, `gh-ost` panics and bails out. If not, execution continues.
53
+
54
+
This is somewhat similar to a Nagios `n`-times test, where `n` in our case is always `2`.
55
+
46
56
### cut-over
47
57
48
58
Optional. Default is `safe`. See more discussion in [cut-over](cut-over.md)
Copy file name to clipboardExpand all lines: go/cmd/gh-ost/main.go
+1Lines changed: 1 addition & 0 deletions
Original file line number
Diff line number
Diff line change
@@ -99,6 +99,7 @@ func main() {
99
99
100
100
maxLoad:=flag.String("max-load", "", "Comma delimited status-name=threshold. e.g: 'Threads_running=100,Threads_connected=500'. When status exceeds threshold, app throttles writes")
101
101
criticalLoad:=flag.String("critical-load", "", "Comma delimited status-name=threshold, same format as `--max-load`. When status exceeds threshold, app panics and quits")
102
+
flag.Int64Var(&migrationContext.CriticalLoadIntervalMilliseconds, "critical-load-interval-millis", 0, "When 0, migration bails out upon meeting critical-load immediately. When non-zero, a second check is done after given interval, and migration only bails out if 2nd check still meets critical load")
log.Errorf("critical-load met once: %s=%d, >=%d. Will check again in %d millis", variableName, value, threshold, this.migrationContext.CriticalLoadIntervalMilliseconds)
this.migrationContext.PanicAbort<-fmt.Errorf("critical-load met again after %d millis: %s=%d, >=%d", this.migrationContext.CriticalLoadIntervalMilliseconds, variableName, value, threshold)
0 commit comments