Skip to content

Commit 21cf1ff

Browse files
author
Shlomi Noach
committed
updated documentation to match --replication-lag-query deprecation
1 parent c66356b commit 21cf1ff

File tree

3 files changed

+7
-30
lines changed

3 files changed

+7
-30
lines changed

doc/command-line-flags.md

Lines changed: 1 addition & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -100,10 +100,7 @@ On a replication topology, this is perhaps the most important migration throttli
100100

101101
When using [Connect to replica, migrate on master](cheatsheet.md), this lag is primarily tested on the very replica `gh-ost` operates on. Lag is measured by checking the heartbeat events injected by `gh-ost` itself on the utility changelog table. That is, to measure this replica's lag, `gh-ost` doesn't need to issue `show slave status` nor have any external heartbeat mechanism.
102102

103-
When `--throttle-control-replicas` is provided, throttling also considers lag on specified hosts. Measuring lag on these hosts works as follows:
104-
105-
- If `--replication-lag-query` is provided, use the query, trust its result to indicate lag seconds (fraction, i.e. float, allowed)
106-
- Otherwise, issue `show slave status` and read `Seconds_behind_master` (`1sec` granularity)
103+
When `--throttle-control-replicas` is provided, throttling also considers lag on specified hosts. Lag measurements on listed hosts is done by querying `gh-ost`'s _changelog_ table, where `gh-ost` injects a heartbeat.
107104

108105
See also: [Sub-second replication lag throttling](subsecond-lag.md)
109106

doc/subsecond-lag.md

Lines changed: 5 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -2,32 +2,20 @@
22

33
`gh-ost` is able to utilize sub-second replication lag measurements.
44

5-
At GitHub, small replication lag is crucial, and we like to keep it below `1s` at all times. If you have similar concern, we strongly urge you to proceed to implement sub-second lag throttling.
5+
At GitHub, small replication lag is crucial, and we like to keep it below `1s` at all times.
66

77
`gh-ost` will do sub-second throttling when `--max-lag-millis` is smaller than `1000`, i.e. smaller than `1sec`.
88
Replication lag is measured on:
99

1010
- The "inspected" server (the server `gh-ost` connects to; replica is desired but not mandatory)
1111
- The `throttle-control-replicas` list
1212

13-
For the inspected server, `gh-ost` uses an internal heartbeat mechanism. It injects heartbeat events onto the utility changelog table, then reads those events in the binary log, and compares times. This measurement is by default and by definition sub-second enabled.
13+
`gh-ost` uses an internal heartbeat mechanism. It injects heartbeat events onto the utility changelog table, then reads those events in the binary log, and compares times. This measurement is on by default and by definition supports sub-second resolution.
1414

1515
You can explicitly define how frequently will `gh-ost` inject heartbeat events, via `heartbeat-interval-millis`. You should set `heartbeat-interval-millis <= max-lag-millis`. It still works if not, but loses granularity and effect.
1616

17-
On the `throttle-control-replicas`, `gh-ost` only issues SQL queries, and does not attempt to read the binary log stream. Perhaps those other replicas don't have binary logs in the first place.
17+
`gh-ost` will use its own internal heartbeat to measure lag both on the inspected replica as well as on the `--throttle-control-replicas` list, if provided.
1818

19-
The standard way of getting replication lag on a replica is to issue `SHOW SLAVE STATUS`, then reading `Seconds_behind_master` value. But that value has a `1sec` granularity.
19+
In earlier versions, the `--throttle-control-replicas` list was subjected to `1` second resolution or to 3rd party heartbeat injections such as `pt-heartbeat`. This is no longer the case. The argument `--replication-lag-query` has been deprecated and no longer needed.
2020

21-
To be able to throttle on your production replicas fleet when replication lag exceeds a sub-second threshold, you must provide with a `replication-lag-query` that returns a sub-second resolution lag.
22-
23-
As a common example, many use [pt-heartbeat](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html) to inject heartbeat events on the master. You would issue something like:
24-
25-
/usr/bin/pt-heartbeat -- -D your_schema --create-table --update --replace --interval=0.1 --daemonize --pid ...
26-
27-
Note `--interval=0.1` to indicate `10` heartbeats per second.
28-
29-
You would then provide
30-
31-
gh-ost ... --replication-lag-query="select unix_timestamp(now(6)) - unix_timestamp(ts) as ghost_lag_check from your_schema.heartbeat order by ts desc limit 1"
32-
33-
Our production migrations use sub-second lag throttling and are able to keep our entire fleet of replicas well below `1sec` lag.
21+
Our production migrations use sub-second lag throttling and are able to keep our entire fleet of replicas well below `1sec` lag. We use `--heartbeat-interval-millis=100` on our production migrations with a `--mas-lag-millis` value of between `300` and `500`.

doc/throttle.md

Lines changed: 1 addition & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -28,15 +28,7 @@ Otherwise you may specify your own list of replica servers you wish it to observ
2828

2929
- `--max-lag-millis`: maximum allowed lag; any controlled replica lagging more than this value will cause throttling to kick in. When all control replicas have smaller lag than indicated, operation resumes.
3030

31-
- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single decimal (floating point) value indicating replication lag.
32-
33-
Example: `--replication-lag-query="SELECT UNIX_TIMESTAMP() - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"`
34-
35-
We encourage you to use [sub-second replication lag throttling](subsecond-lag.md). Your query may then look like:
36-
37-
`--replication-lag-query="SELECT UNIX_TIMESTAMP(6) - MAX(UNIX_TIMESTAMP(ts)) AS lag FROM mydb.heartbeat"`
38-
39-
Note that you may dynamically change both `replication-lag-query` and the `throttle-control-replicas` list via [interactive commands](interactive-commands.md)
31+
Note that you may dynamically change both `--max-lag-millis` and the `throttle-control-replicas` list via [interactive commands](interactive-commands.md)
4032

4133
#### Status thresholds
4234

0 commit comments

Comments
 (0)