
Commit 1e5eb61

Merge pull request #102 from genestack/feature/ODM-12343-rebalancing-doc
ODM-12343 Add rebalancing doc
2 parents 47af585 + 426ddaa commit 1e5eb61

3 files changed: +111 -0 lines changed

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
core:
  files:
    "/var/lib/genestack/properties/application.yaml":
      backend:
        clickhouse:
          main:
            url: "jdbc:clickhouse://{{ include \"odm.clickhouseHosts\" (dict \"port\" 8123 \"global\" $) }}/genestack_new?socket_timeout=1800000&dataTransferTimeout=1800000&maxQuerySize=20971520&createDatabaseIfNotExist=true&load_balancing_policy=roundRobin"
applications:
  files:
    "/var/lib/genestack/properties/application.yaml":
      frontend:
        clickhouse:
          main:
            url: "jdbc:clickhouse://{{ include \"odm.clickhouseHosts\" (dict \"port\" 8123 \"global\" $) }}/genestack_new?socket_timeout=1800000&dataTransferTimeout=1800000&maxQuerySize=20971520&createDatabaseIfNotExist=true&load_balancing_policy=roundRobin"

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# ClickHouse Rebalancing

Rebalancing shards in ClickHouse is largely a manual process because of inherent [limitations](https://clickhouse.com/docs/en/guides/sre/scaling-clusters) in ClickHouse. To simplify it, we have developed the `clickhouse-helper` tool, which assists with shard rebalancing.

## Prerequisites

- Ensure there are no running ODM tasks. Wait for all tasks to complete before proceeding. This step is crucial for maintaining data consistency in ClickHouse.
- Ensure that the ClickHouse cluster has enough free space. The data is cloned into a new database, so both copies exist until the old database is deleted. After rebalancing, the data should be distributed evenly across the nodes.
- Make sure the ODM version is 1.60 or higher.
- Make sure the `clickhouse-helper` version is higher than 0.30.0.

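The two version prerequisites above can be checked with a small comparison helper. This is a minimal sketch, assuming GNU `sort` with the `-V` (version sort) option is available. The function names and the sample version values are illustrative only, and how you look up the deployed versions is not covered here.

```shell
# Hypothetical helpers; they rely on the -V (version sort) option of GNU sort.

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeeds when version $1 is strictly greater than version $2.
version_gt() {
  [ "$1" != "$2" ] && version_ge "$1" "$2"
}

# Sample values; replace them with the versions of your deployment.
ODM_VERSION=1.60
HELPER_VERSION=0.31.0

version_ge "$ODM_VERSION" 1.60 || echo "ODM must be 1.60 or higher" >&2
version_gt "$HELPER_VERSION" 0.30.0 || echo "clickhouse-helper must be higher than 0.30.0" >&2
```

With the sample values both checks pass silently. A failed check prints a message to standard error.
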
## Just to be sure

Optionally, run the [sanity check](../troubleshooting/sanity-check.md) to double-check that the data in ODM is consistent.

## Steps for Rebalancing

### 1. Enable ClickHouse Read-Only Mode in ODM

Set ODM to read-only mode to prevent any write operations during the rebalancing process. This does not affect schema migrations.

```shell
export ODM_CORE_URL=http://<ODM_CORE_HOST>:<ODM_CORE_PORT>
docker run \
  --env ODM_CORE_URL=${ODM_CORE_URL} \
  091468197733.dkr.ecr.us-east-1.amazonaws.com/genestack/clickhouse-helper \
  odm readonly --set-value=true
```

### 2. Redeploy Services with the New ClickHouse Database

Update your Helm values to point to the new ClickHouse database and redeploy the `core` and `applications` services.

#### a) Update Helm Values

Refer to the example values file patch for guidance: [clickhouse-new-database.yaml](files/clickhouse-new-database.yaml). Use the `genestack_new` database name.

#### b) Perform Helm Upgrade

Run the following command to apply the changes:

```shell
helm upgrade <release-name> <chart-name> -f values.yaml
```

### 3. Clone Data to the New Database

Use the `clickhouse-helper` tool to copy data from the old database to the new one. Both `CH_SOURCE_URL` and `CH_DESTINATION_URL` accept multiple nodes separated by commas (`,`), for example, `localhost:9000,localhost:19000`. **It is recommended to include all nodes in the cluster**.

Follow these steps:

1. Set the source and destination ClickHouse server URLs:

```shell
export CH_SOURCE_URL=<SOURCE_CLICKHOUSE_HOST>:<SOURCE_CLICKHOUSE_PORT>
export CH_DESTINATION_URL=<DESTINATION_CLICKHOUSE_HOST>:<DESTINATION_CLICKHOUSE_PORT>
```

2. Set the source and destination database names:

```shell
export CH_SOURCE_DATABASE=genestack
export CH_DESTINATION_DATABASE=genestack_new
```

3. Run `clickhouse-helper` to clone the data:

```shell
docker run \
  --env CH_SOURCE_URL=${CH_SOURCE_URL} \
  --env CH_DESTINATION_URL=${CH_DESTINATION_URL} \
  --env CH_SOURCE_DATABASE=${CH_SOURCE_DATABASE} \
  --env CH_DESTINATION_DATABASE=${CH_DESTINATION_DATABASE} \
  091468197733.dkr.ecr.us-east-1.amazonaws.com/genestack/clickhouse-helper \
  ch clone
```

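When a cluster has many nodes, the comma-separated value for `CH_SOURCE_URL` or `CH_DESTINATION_URL` can be assembled from a list of addresses. The following is a minimal sketch: `join_nodes` is a hypothetical helper, not part of `clickhouse-helper`, and the addresses are the sample values from the text above.

```shell
# Hypothetical helper: joins its arguments with commas.
# The function body runs in a subshell so the IFS change does not leak.
join_nodes() (
  IFS=,
  printf '%s\n' "$*"
)

# Example node addresses; replace them with the nodes of your cluster.
CH_SOURCE_URL=$(join_nodes localhost:9000 localhost:19000)
export CH_SOURCE_URL
echo "$CH_SOURCE_URL"
```

This prints `localhost:9000,localhost:19000`, which matches the format described above.
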
### 4. Disable ClickHouse Read-Only Mode in ODM

Once the data cloning is complete, re-enable write operations in ODM.

```shell
export ODM_CORE_URL=http://<ODM_CORE_HOST>:<ODM_CORE_PORT>
docker run \
  --env ODM_CORE_URL=${ODM_CORE_URL} \
  091468197733.dkr.ecr.us-east-1.amazonaws.com/genestack/clickhouse-helper \
  odm readonly --set-value=false
```

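The commands in steps 1 and 4 differ only in the value passed to `--set-value`, so they could be wrapped in one function that rejects anything other than `true` or `false`. This is a sketch only: the `set_odm_readonly` function and the `DOCKER` override variable are hypothetical conveniences, while the image name and the `odm readonly --set-value` arguments are copied from the commands above.

```shell
# Image name copied from the commands in steps 1 and 4.
HELPER_IMAGE=091468197733.dkr.ecr.us-east-1.amazonaws.com/genestack/clickhouse-helper

# Hypothetical wrapper: toggles ODM read-only mode.
# Setting DOCKER=echo previews the command instead of running it.
set_odm_readonly() {
  case "$1" in
    true|false) ;;
    *) echo "usage: set_odm_readonly true|false" >&2; return 2 ;;
  esac
  ${DOCKER:-docker} run \
    --env ODM_CORE_URL="${ODM_CORE_URL}" \
    "${HELPER_IMAGE}" \
    odm readonly --set-value="$1"
}
```

For example, `set_odm_readonly true` before step 2 and `set_odm_readonly false` after the clone completes, with `ODM_CORE_URL` exported as shown above.
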
## Notes

- Ensure all steps are followed in sequence to avoid data inconsistencies.
- The `clickhouse-helper` tool is essential for simplifying the rebalancing process.
- Remember to delete the old database from ClickHouse after the rebalancing process is complete.
  This can be done with the `clickhouse-client` command-line tool.
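
As a cautious way to prepare that cleanup, the sketch below only prints one `clickhouse-client` command per node so the commands can be reviewed before anything is dropped. The `print_drop_commands` function and the host names are hypothetical, and authentication options are omitted. Whether the database must be dropped on every node, or once with an `ON CLUSTER` clause, depends on your cluster setup and is an assumption to verify.

```shell
# Hypothetical helper: prints one DROP DATABASE command per host for review.
# Nothing is executed here; run the printed commands only after verifying
# that the new database contains all of the data.
print_drop_commands() (
  database=$1
  shift
  for host in "$@"; do
    printf 'clickhouse-client --host %s --query "DROP DATABASE IF EXISTS %s"\n' "$host" "$database"
  done
)

# Old database name from step 3 and placeholder host names.
print_drop_commands genestack ch-node-1 ch-node-2
```
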

mkdocs.yml

Lines changed: 2 additions & 0 deletions
@@ -21,6 +21,8 @@ nav:
           - Microsoft Azure: home/single-sign-on/scim/azure.md
       - Helm:
         - How to deploy: home/helm/how-to-deploy.md
+      - Clickhouse:
+        - Rebalancing: home/clickhouse/rebalancing.md
       - Troubleshooting:
         - AWS S3: home/troubleshooting/aws-s3.md
         - Azure SSO: home/troubleshooting/azure-sso.md
