We are looking to improve the performance of gh-ost so that we can safely operate with larger database sizes. Our use case differs from GitHub's in that we do not use read-replicas. Some of our DBs also have very cyclic usage (i.e. only busy 9-5 M-F) and may have windows of free capacity.

I have a few ideas I wanted to run by you, since I'm sure some have come up before:
Feature | Context | Status |
---|---|---|
Use 8.0 ALGORITHM=INSTANT when applicable | MySQL 8.0 does not require gh-ost in some cases (sketch below). | Merged, thank you kindly for the review! |
Dynamic Chunk Size | gh-ost can observe the execution time of each chunk and dynamically increase the chunk size while it stays below a threshold (sketch below). For our environment (because we have a lot of replica tolerance) we typically run larger batch sizes, but have varying DB instance sizes. Being able to have this auto-tune is a win for us. | See PR here (and issue comment) |
Multi-threaded applier | Parallel replication apply is much better in MySQL 8.0, and combined with the fact that we don't use read-replicas, we can probably push more changes through the binlog than gh-ost currently does (sketch below). We think we can tolerate a few minutes of replica lag. Our limit is that Aurora restricts the relay log to ~1000M; if we exceed that, we reduce our DR capabilities. (Note: there's an earlier issue on this. It lacks the 8.0 parallel context, and the issue @shlomi-noach probably hit when he said it is slower is possibly this one? In any case, I've verified I can bulk-parallel insert with improved performance.) | Not started |
Defer Binary Log Apply | Currently gh-ost prioritizes applying the binary log ahead of copying rows. I actually think it's possible to track in memory only the primary keys discovered in the binary log, plus whether the last modification was a delete (a bool). If this is kept in a map, it can be applied after the copy is done (sketch below). The benefit of this change is most evident in workloads that tend to update the same rows. Edit: this optimization requires mem-comparable primary keys, so it won't work on varchar primary keys with collations. | Not started |
Resume from failure | I know there is a stale PR for this. This doesn't improve performance, but it's semi-related since some of our long-running DDLs fail. We also like to use daily pod-cycling on our k8s clusters, so having single processes that run for two weeks complicates our infra. | See branch here. |
Better ETA estimates | The current ETA estimator is based on estimatedTime - elapsedTime from the start of the copy. This skews poorly for larger tables, which become slower to insert into. As dynamic chunk sizing and throttling are introduced, it also doesn't respond well to changes. Ideally the estimate compares how many rows are left to copy against how many rows were copied in the last few minutes (sketch below). | See PR here |
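For context on the first row, here is a minimal sketch of the try-INSTANT-then-fall-back shape (not gh-ost's actual implementation). It assumes a go-sql-driver/mysql connection; error 1845 (ER_ALTER_OPERATION_NOT_SUPPORTED) indicates the requested algorithm can't be used, in which case a full row-copy migration is still needed:

```go
package sketch

import (
	"database/sql"
	"errors"
	"fmt"

	"github.com/go-sql-driver/mysql"
)

// tryInstantDDL attempts the ALTER with ALGORITHM=INSTANT and reports whether
// it succeeded, so the caller can fall back to a row-copy migration.
func tryInstantDDL(db *sql.DB, table, alterClause string) (bool, error) {
	stmt := fmt.Sprintf("ALTER TABLE %s %s, ALGORITHM=INSTANT", table, alterClause)
	if _, err := db.Exec(stmt); err != nil {
		var myErr *mysql.MySQLError
		// 1845 = ER_ALTER_OPERATION_NOT_SUPPORTED: this ALTER cannot run
		// with the requested algorithm; signal the caller to fall back.
		if errors.As(err, &myErr) && myErr.Number == 1845 {
			return false, nil
		}
		return false, err
	}
	return true, nil
}
```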
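For the dynamic chunk size, a sketch of the feedback loop described above: grow the chunk size while chunks complete under a target latency, back off when they exceed it. The function name, growth factor, and bounds are illustrative, not what the PR uses:

```go
package sketch

import "time"

const (
	minChunkSize  = 100
	maxChunkSize  = 100000
	targetLatency = 500 * time.Millisecond
)

// adjustChunkSize returns the next chunk size given how long the last chunk took.
func adjustChunkSize(current int64, lastChunkTook time.Duration) int64 {
	var next int64
	if lastChunkTook < targetLatency {
		next = current * 3 / 2 // grow by 50% while under target
	} else {
		next = current / 2 // back off quickly when over target
	}
	if next < minChunkSize {
		next = minChunkSize
	}
	if next > maxChunkSize {
		next = maxChunkSize
	}
	return next
}
```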
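For the multi-threaded applier, the main correctness constraint is that events touching the same row must stay ordered. One hedged sketch, assuming events are routed to a fixed worker by a hash of the primary key (all type and function names here are stand-ins):

```go
package sketch

import (
	"hash/fnv"
	"sync"
)

// BinlogEvent is a stand-in for a parsed row event (insert/update/delete).
type BinlogEvent struct {
	PrimaryKey string
	Apply      func() error
}

// parallelApplier fans events out to N workers, routing each event by a hash
// of its primary key so that events for the same row are applied in order.
type parallelApplier struct {
	queues []chan BinlogEvent
	wg     sync.WaitGroup
}

func newParallelApplier(workers int) *parallelApplier {
	a := &parallelApplier{queues: make([]chan BinlogEvent, workers)}
	for i := range a.queues {
		a.queues[i] = make(chan BinlogEvent, 1024)
		a.wg.Add(1)
		go func(q chan BinlogEvent) {
			defer a.wg.Done()
			for ev := range q {
				_ = ev.Apply() // error handling elided in this sketch
			}
		}(a.queues[i])
	}
	return a
}

// Submit routes an event to the worker owning its primary key.
func (a *parallelApplier) Submit(ev BinlogEvent) {
	h := fnv.New32a()
	h.Write([]byte(ev.PrimaryKey))
	a.queues[int(h.Sum32())%len(a.queues)] <- ev
}

// Close drains all queues and waits for the workers to finish.
func (a *parallelApplier) Close() {
	for _, q := range a.queues {
		close(q)
	}
	a.wg.Wait()
}
```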
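For the deferred binary log apply, a sketch of the in-memory structure described above; deleteFn and upsertFn are hypothetical stand-ins for the statements against the ghost table:

```go
package sketch

// deferredKeys tracks, per primary key seen in the binlog, only whether the
// most recent modification was a DELETE. After the row copy finishes, each
// key is resolved exactly once: deleted keys are removed from the ghost
// table, the rest are re-read from the source and upserted. Requires
// mem-comparable primary keys, as noted above.
type deferredKeys struct {
	lastWasDelete map[string]bool // key: encoded primary key
}

func newDeferredKeys() *deferredKeys {
	return &deferredKeys{lastWasDelete: make(map[string]bool)}
}

// observe records a binlog row event; later events overwrite earlier ones,
// so repeated updates to a hot row collapse into a single entry.
func (d *deferredKeys) observe(pk string, isDelete bool) {
	d.lastWasDelete[pk] = isDelete
}

// drain is called once the copy completes.
func (d *deferredKeys) drain(deleteFn, upsertFn func(pk string) error) error {
	for pk, wasDelete := range d.lastWasDelete {
		var err error
		if wasDelete {
			err = deleteFn(pk)
		} else {
			err = upsertFn(pk)
		}
		if err != nil {
			return err
		}
	}
	return nil
}
```

This is also why the win is largest for workloads that update the same rows: a row updated a thousand times during the copy costs one map write per event but only one statement at the end.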
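And for the ETA estimate, a sketch of the windowed-rate calculation; the function and its parameters are illustrative:

```go
package sketch

import "time"

// etaFromRecentRate estimates time remaining from the copy rate over a recent
// window rather than from total elapsed time, so it adapts to throttling and
// chunk-size changes instead of skewing on large tables.
func etaFromRecentRate(rowsCopiedInWindow int64, window time.Duration, rowsRemaining int64) time.Duration {
	if rowsCopiedInWindow <= 0 {
		return 0 // rate unknown; caller can display "N/A"
	}
	rowsPerSecond := float64(rowsCopiedInWindow) / window.Seconds()
	return time.Duration(float64(rowsRemaining) / rowsPerSecond * float64(time.Second))
}
```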
That's the raw idea list - there is a good chance we will be able to provide patches for some of these too, but I wanted to check in first so we can discuss. Maybe you have a few ideas of your own too? :-)