Commit 3cb8061

Author: Shlomi Noach
Merge pull request #54 from github/doc-throttle
documentation additions; cut-over flag
2 parents 7573c12 + a6c21dc commit 3cb8061

7 files changed: +113 -8 lines changed

README.md

Lines changed: 27 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,10 +12,22 @@
1212

1313
WORK IN PROGRESS
1414

15-
Please meanwhile refer to the [docs](doc) for more information. No, really, go to the [docs](doc).
15+
Please meanwhile refer to the [docs](doc) for more information. No, really, go to the [docs](doc).
1616

1717
## Usage
1818

19+
#### Where to execute
20+
21+
The recommended way of executing `gh-ost` is to have it connect to a _replica_, as opposed to having it connect to the master. `gh-ost` will crawl its way up the replication chain to figure out who the master is.
22+
23+
By connecting to a replica, `gh-ost` sets up a self-throttling mechanism, is more comfortable querying `information_schema` tables, and more. Connecting `gh-ost` to a replica is also the trick that makes it work even if your master is configured with `statement based replication`, as `gh-ost` is able to manipulate the replica to rewrite logs in `row based replication`. See [Migrating with Statement Based Replication](migrating-with-sbr.md).
24+
25+
The replica would have to use binary logs and be configured with `log_slave_updates`.
26+
27+
It is still OK to connect `gh-ost` directly to the master; you will need to confirm this by providing `--allow-on-master`. The master would have to be using `row based replication`.
28+
29+
`gh-ost` itself may be executed from anywhere. It connects via `tcp` and it does not have to be executed from a `MySQL` box. However, do note it generates a lot of traffic, as it connects as a replica and pulls binary log data.
30+
1931
#### Testing on replica
2032

2133
```
@@ -29,6 +41,20 @@ Please read more on [testing on replica](testing-on-replica.md)
2941
gh-ost --conf=.my.cnf --database=mydb --table=mytable --verbose --alter="engine=innodb" --execute --initially-drop-ghost-table --initially-drop-old-table -max-load=Threads_connected=30 --switch-to-rbr --chunk-size=2500 --cut-over=two-step --exact-rowcount --verbose
3042
```
3143

44+
#### Recommended parameters
45+
46+
Run `gh-ost --help` to get the full list of parameters. We like the following:
47+
48+
- `--exact-rowcount`: actually run `select count(*)` on your table prior to migration, and heuristically maintain the updated table size while migrating. This makes for a quite accurate estimate of progress. When `gh-ost` says it's `99.8%` done, it really is there or very close to it.
49+
50+
- `--execute`: without this parameter, migration is a _noop_: testing table creation and validity of migration, but not touching data.
51+
52+
- `--initially-drop-ghost-table`, `--initially-drop-old-table`: `gh-ost` maintains two tables while migrating: the _ghost_ table (which is synced from your original table and finally replaces it) and a changelog table, which is used internally for bookkeeping. By default, it panics and aborts if it sees those tables upon startup. Provide these two params to let `gh-ost` know it's OK to drop them beforehand.
53+
54+
We think `gh-ost` should not take chances or make assumptions about the user's tables. Dropping tables can be a dangerous, locking operation. We let the user explicitly approve such operations.
55+
56+
- `--test-on-replica`: `gh-ost` can be tested on a replica, without actually modifying master data. We use this for testing, and we suspect new users of this tool would enjoy checking it out, building trust in this tool, before actually applying it on production masters. Read more on [testing on replica](testing-on-replica.md).
57+
3258
## What's in a name?
3359

3460
Originally this was named `gh-osc`: GitHub Online Schema Change, in the likes of [Facebook online schema change](https://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932/) and [pt-online-schema-change](https://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html).

build.sh

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
11
#!/bin/bash
22
#
33
#
4-
RELEASE_VERSION="0.8.4"
4+
RELEASE_VERSION="0.8.5"
55

66
buildpath=/tmp/gh-ost
77
target=gh-ost

doc/testing-on-replica.md

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ $ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample
5252

5353
Elaborate:
5454
```shell
55-
$ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample_table --alter="engine=innodb" --chunk-size=2000 --max-load=Threads_connected=20 --switch-to-rbr --initially-drop-ghost-table --initially-drop-old-table --cut-over=voluntary-lock --test-on-replica --postpone-swap-tables-flag-file=/tmp/ghost-postpone.flag --exact-rowcount --allow-nullable-unique-key --verbose --execute
55+
$ gh-osc --host=myhost.com --conf=/etc/gh-ost.cnf --database=test --table=sample_table --alter="engine=innodb" --chunk-size=2000 --max-load=Threads_connected=20 --switch-to-rbr --initially-drop-ghost-table --initially-drop-old-table --cut-over=voluntary-lock --test-on-replica --postpone-cut-over-flag-file=/tmp/ghost-postpone.flag --exact-rowcount --allow-nullable-unique-key --verbose --execute
5656
```
5757
- Count exact number of rows (makes ETA estimation very good). This goes at the expense of paying the time for issuing a `SELECT COUNT(*)` on your table. We use this lovingly.
5858
- Automatically switch to `RBR` if replica is configured as `SBR`. See also: [migrating with SBR](migrating-with-sbr.md)

doc/throttle.md

Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
1+
# Throttle
2+
3+
Throughout a migration operation, `gh-ost` is either actively copying and applying data, or is _throttling_.
4+
5+
When _throttled_, `gh-ost` ceases to write row data and ceases to inspect binary log entries. It pauses all writes except for the low-volume changelog status writes and the heartbeat writes.
6+
7+
As compared with trigger-based solutions, when `gh-ost` is throttled, the write load on the master is truly removed.
8+
9+
Typically, throttling is based on replication lag or on master load. At such times, you wish to reduce load on the master and on replication by pausing the _ghost_ writes. However, with a trigger-based solution this is impossible to achieve: the triggers must remain in place and they continue to generate excess writes while the table is being used.
10+
11+
Since `gh-ost` is not based on triggers, but on reading binary logs, it controls its own writes. Each and every write on the master comes from the `gh-ost` app, which means `gh-ost` is able to reduce writes to a bare minimum when it wishes to.
12+
13+
`gh-ost` supports various means for controlling throttling behavior; it is operations friendly in that it allows the user greater, dynamic control of throttler behavior.
14+
15+
### Throttling parameters and factors
16+
17+
Throttling is controlled via the following explicit and implicit factors:
18+
19+
#### Replication-lag
20+
21+
The recommended way of running `gh-ost` is by connecting it to a replica. It will figure out the master by traversing the topology. It is by design that `gh-ost` is throttle aware: it generates its own _heartbeat_ mechanism; while it is running the migration, it is self-checking the replica to which it is connected for replication lag.
22+
23+
Otherwise you may specify your own list of replica servers you wish it to observe.
24+
25+
- `--throttle-control-replicas`: list of replicas you explicitly wish `gh-ost` to check for replication lag.
26+
27+
Example: `--throttle-control-replicas=myhost1.com:3306,myhost2.com,myhost3.com:3307`
28+
29+
- `--max-lag-millis`: maximum allowed lag; any controlled replica lagging more than this value will cause throttling to kick in. When all control replicas have smaller lag than indicated, operation resumes.
30+
31+
- `--replication-lag-query`: `gh-ost` will, by default, issue a `show slave status` query to find replication lag. However, this is a notoriously flaky value. If you're using your own `heartbeat` mechanism, e.g. via [`pt-heartbeat`](https://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html), you may provide your own custom query to return a single `int` value indicating replication lag.
32+
33+
Example: `--replication-lag-query="SELECT ROUND(UNIX_TIMESTAMP(NOW()) - MAX(UNIX_TIMESTAMP(ts))) AS lag FROM mydb.heartbeat"`
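
The replication-lag factors above can be sketched in Go as follows. This is a minimal illustration, not `gh-ost`'s actual code: the `replicaKey` type, both function names, and the fallback to port 3306 when a host is given without a port are assumptions made for the sketch. It parses a `--throttle-control-replicas` style list and reports whether any observed lag exceeds the `--max-lag-millis` value.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// replicaKey identifies one control replica (hypothetical type for this sketch).
type replicaKey struct {
	Host string
	Port int
}

// defaultPort is an assumption: the standard MySQL port is used when none is given.
const defaultPort = 3306

// parseControlReplicas splits a comma-delimited "host[:port]" list.
func parseControlReplicas(list string) ([]replicaKey, error) {
	var keys []replicaKey
	for _, token := range strings.Split(list, ",") {
		token = strings.TrimSpace(token)
		if token == "" {
			continue
		}
		host, portText, hasPort := strings.Cut(token, ":")
		port := defaultPort
		if hasPort {
			parsed, err := strconv.Atoi(portText)
			if err != nil {
				return nil, fmt.Errorf("invalid port in %q: %w", token, err)
			}
			port = parsed
		}
		keys = append(keys, replicaKey{Host: host, Port: port})
	}
	return keys, nil
}

// shouldThrottleOnLag reports whether any observed lag exceeds maxLagMillis.
// Operation resumes only when every replica is at or below the limit.
func shouldThrottleOnLag(lagMillis map[replicaKey]int64, maxLagMillis int64) bool {
	for _, lag := range lagMillis {
		if lag > maxLagMillis {
			return true
		}
	}
	return false
}

func main() {
	keys, err := parseControlReplicas("myhost1.com:3306,myhost2.com,myhost3.com:3307")
	if err != nil {
		panic(err)
	}
	// Lag values below are made up for illustration.
	lags := map[replicaKey]int64{keys[0]: 200, keys[1]: 1800, keys[2]: 90}
	fmt.Println("throttle:", shouldThrottleOnLag(lags, 1500))
}
```

A single lagging replica is enough to throttle, which matches the "any controlled replica" wording above.
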
34+
35+
#### Status thresholds
36+
37+
- `--max-load`: list of metrics and threshold values; topping the threshold of any will cause the throttler to kick in.
38+
39+
Example:
40+
41+
`--max-load='Threads_running=100,Threads_connected=500'`
42+
43+
Metrics must be valid, numeric [status variables](http://dev.mysql.com/doc/refman/5.6/en/server-status-variables.html).
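
As a rough illustration of how a `--max-load` value maps onto the `MaxLoad map[string]int64` field seen in `go/base/context.go`, the following Go sketch parses the comma-delimited `metric=threshold` list and compares it against current status values. The function names are hypothetical, and treating only values strictly above the threshold as "topping" it is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseMaxLoad turns "Threads_running=100,Threads_connected=500" into a map
// of metric name to threshold.
func parseMaxLoad(spec string) (map[string]int64, error) {
	thresholds := make(map[string]int64)
	for _, pair := range strings.Split(spec, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		name, valueText, found := strings.Cut(pair, "=")
		if !found || name == "" {
			return nil, fmt.Errorf("expected metric=threshold, got %q", pair)
		}
		value, err := strconv.ParseInt(valueText, 10, 64)
		if err != nil {
			return nil, fmt.Errorf("non-numeric threshold in %q: %w", pair, err)
		}
		thresholds[name] = value
	}
	return thresholds, nil
}

// exceededMetric returns the first metric, in sorted name order, whose
// current value is strictly above its threshold (boundary is an assumption).
func exceededMetric(thresholds, current map[string]int64) (string, bool) {
	names := make([]string, 0, len(thresholds))
	for name := range thresholds {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if value, ok := current[name]; ok && value > thresholds[name] {
			return name, true
		}
	}
	return "", false
}

func main() {
	thresholds, err := parseMaxLoad("Threads_running=100,Threads_connected=500")
	if err != nil {
		panic(err)
	}
	// Current values are made up for illustration.
	current := map[string]int64{"Threads_running": 150, "Threads_connected": 20}
	if name, exceeded := exceededMetric(thresholds, current); exceeded {
		fmt.Println("throttle on", name)
	}
}
```
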
44+
45+
#### Manual control
46+
47+
In addition to the above, you are able to take control and throttle the operation any time you like.
48+
49+
- `--throttle-flag-file`: when this file exists, throttling kicks in. Just `touch` the file to begin throttling.
50+
51+
- `--throttle-additional-flag-file`: similar to the above. When this file exists, throttling kicks in.
52+
53+
Default: `/tmp/gh-ost.throttle`
54+
55+
The reason for having two files has to do with the intent of being able to run multiple migrations concurrently.
56+
The setup we wish to use is that each migration would have its own, specific `throttle-flag-file`, but all would use the same `throttle-additional-flag-file`. Thus, we are able to throttle specific migrations by touching their specific files, or we are able to throttle all migrations at once, by touching the shared file.
57+
58+
- `throttle` command via [interactive interface](interactive-commands.md).
59+
60+
Example:
```
echo throttle | nc -U /tmp/gh-ost.test.sample_data_0.sock
echo no-throttle | nc -U /tmp/gh-ost.test.sample_data_0.sock
```
64+
65+
### Throttle precedence
66+
67+
Any single factor above that suggests the migration should throttle causes throttling. That is, once some component decides to throttle, you cannot override it; you cannot force continued execution of the migration.
68+
69+
`gh-ost` will first check the low-hanging fruit: user commands and throttling files. It will then proceed to check replication lag, and lastly it will check for status thresholds.
70+
71+
The first check to suggest throttling stops the search; the status message will note the reason for throttling as the first satisfied check.
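
The precedence described above can be sketched in Go as an ordered list of checks where the first positive one wins. This is an assumption-laden illustration, not the actual throttler: `throttleCheck` and `firstThrottleReason` are hypothetical names, and the `lag` and `max-load` reason labels are placeholders (only `flag-file` and `commanded by user` appear in the status output shown in this document).

```go
package main

import "fmt"

// throttleCheck pairs a reason label with a predicate (hypothetical type).
type throttleCheck struct {
	reason string
	active func() bool
}

// firstThrottleReason walks the checks in order and stops at the first one
// that asks for throttling; later checks are not evaluated.
func firstThrottleReason(checks []throttleCheck) (string, bool) {
	for _, check := range checks {
		if check.active() {
			return check.reason, true
		}
	}
	return "", false
}

func main() {
	// Order follows the text: user command, flag files, replication lag,
	// status thresholds. The predicate results are made up for illustration.
	checks := []throttleCheck{
		{"commanded by user", func() bool { return false }},
		{"flag-file", func() bool { return true }},
		{"lag", func() bool { return true }},
		{"max-load", func() bool { return false }},
	}
	if reason, throttled := firstThrottleReason(checks); throttled {
		fmt.Printf("throttled, %s\n", reason) // prints "throttled, flag-file"
	}
}
```

Because the search stops at the first hit, the reported reason is the cheapest check that fired, even when lag or load would also justify throttling.
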
72+
73+
### Throttle status
74+
75+
The throttle status is printed as part of the periodic [status message](understanding-output.md):
76+
```
Copy: 0/2915 0.0%; Applied: 0; Backlog: 0/100; Elapsed: 41s(copy), 41s(total); streamer: mysql-bin.000551:47983; ETA: throttled, flag-file
Copy: 0/2915 0.0%; Applied: 0; Backlog: 0/100; Elapsed: 42s(copy), 42s(total); streamer: mysql-bin.000551:49370; ETA: throttled, commanded by user
```

go/base/context.go

Lines changed: 1 addition & 1 deletion
@@ -66,7 +66,7 @@ type MigrationContext struct {
6666
ThrottleAdditionalFlagFile string
6767
ThrottleCommandedByUser int64
6868
MaxLoad map[string]int64
69-
PostponeSwapTablesFlagFile string
69+
PostponeCutOverFlagFile string
7070
SwapTablesTimeoutSeconds int64
7171

7272
ServeSocketFile string

go/cmd/gh-ost/main.go

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ func main() {
7171
throttleControlReplicas := flag.String("throttle-control-replicas", "", "List of replicas on which to check for lag; comma delimited. Example: myhost1.com:3306,myhost2.com,myhost3.com:3307")
7272
flag.StringVar(&migrationContext.ThrottleFlagFile, "throttle-flag-file", "", "operation pauses when this file exists; hint: use a file that is specific to the table being altered")
7373
flag.StringVar(&migrationContext.ThrottleAdditionalFlagFile, "throttle-additional-flag-file", "/tmp/gh-ost.throttle", "operation pauses when this file exists; hint: keep default, use for throttling multiple gh-ost operations")
74-
flag.StringVar(&migrationContext.PostponeSwapTablesFlagFile, "postpone-swap-tables-flag-file", "", "while this file exists, migration will postpone the final stage of swapping tables, and will keep on syncing the ghost table. Swapping would be ready to perform the moment the file is deleted.")
74+
flag.StringVar(&migrationContext.PostponeCutOverFlagFile, "postpone-cut-over-flag-file", "", "while this file exists, migration will postpone the final stage of swapping tables, and will keep on syncing the ghost table. Cut-over/swapping would be ready to perform the moment the file is deleted.")
7575

7676
flag.StringVar(&migrationContext.ServeSocketFile, "serve-socket-file", "", "Unix socket file to serve on. Default: auto-determined and advertised upon startup")
7777
flag.Int64Var(&migrationContext.ServeTCPPort, "serve-tcp-port", 0, "TCP port to serve on. Default: disabled")

go/logic/migrator.go

Lines changed: 3 additions & 3 deletions
@@ -375,12 +375,12 @@ func (this *Migrator) stopWritesAndCompleteMigration() (err error) {
375375

376376
this.sleepWhileTrue(
377377
func() (bool, error) {
378-
if this.migrationContext.PostponeSwapTablesFlagFile == "" {
378+
if this.migrationContext.PostponeCutOverFlagFile == "" {
379379
return false, nil
380380
}
381-
if base.FileExists(this.migrationContext.PostponeSwapTablesFlagFile) {
381+
if base.FileExists(this.migrationContext.PostponeCutOverFlagFile) {
382382
// Throttle file defined and exists!
383-
log.Debugf("Postponing final table swap as flag file exists: %+v", this.migrationContext.PostponeSwapTablesFlagFile)
383+
log.Debugf("Postponing final table swap as flag file exists: %+v", this.migrationContext.PostponeCutOverFlagFile)
384384
return true, nil
385385
}
386386
return false, nil
