Commit 6bae5e5

Merge pull request #2818 from ilianiliev-redis/RDSC-4633-ha-failover-setup
RDSC-4633 How to perform HA failover
2 parents b1f3d4d + 48e096a commit 6bae5e5

2 files changed

+84
-0
lines changed

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
---
Title: Test HA failover
alwaysopen: false
categories:
- docs
- integrate
- rs
- rdi
description: Learn how to perform HA failover testing for Redis Data Integration (RDI) to ensure high availability and reliability of your data integration setup.
group: di
hideListLinks: false
linkTitle: Test HA failover
summary: How to perform HA failover testing
type: integration
weight: 100
---

## Setup
1. Ensure that RDI is up and running on both the primary and secondary nodes.
Run the following command and verify that each instance shows healthy, running `rdi-api` and `rdi-operator` pods.
```
kubectl -n rdi get pods

# Example output:
NAME                                  READY   STATUS      RESTARTS   AGE
collector-api-577d95bfd8-5wbg6        1/1     Running     0          12m
collector-source-95f45bcf7-vwn5l      1/1     Running     0          12m
fluentd-zq2lc                         1/1     Running     0          72m
logrotate-29530445-j729x              0/1     Completed   0          14m
logrotate-29530450-dprr2              0/1     Completed   0          9m40s
logrotate-29530455-mfmzw              0/1     Completed   0          4m40s
processor-f66655469-h7nw2             1/1     Running     0          12m
rdi-api-f75df6796-qwqjw               1/1     Running     0          72m
rdi-metrics-exporter-d57cdf8c8-wjzb5  1/1     Running     0          72m
rdi-operator-7f7f6c7dfd-5qmjd         1/1     Running     0          71m
rdi-reloader-77df5f7854-lwmvz         1/1     Running     0          71m
```

2. Identify the leader node, which is the one with a running `collector-source` pod.
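
The leader check in step 2 can be scripted. The following is a minimal sketch, assuming `bash` and `awk` are available on the node; `is_leader` is a hypothetical helper name, not part of RDI. It reads the output of `kubectl -n rdi get pods` on standard input and succeeds only if a `collector-source` pod is in the `Running` state.

```shell
# Hypothetical helper (not part of RDI): reads `kubectl -n rdi get pods`
# output on stdin and exits 0 only if a collector-source pod is Running.
is_leader() {
  awk '$1 ~ /^collector-source-/ && $3 == "Running" { found = 1 }
       END { exit !found }'
}

# Usage on each node (requires kubectl access to the rdi namespace):
#   kubectl -n rdi get pods | is_leader && echo "leader" || echo "not leader"
```

Run the usage line on both nodes; only one of them should report `leader`.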

## Performing the HA failover test

To test HA failover, simulate a connection failure between the leader and the RDI database by blocking the network traffic between them. Run the following commands on the leader node:

1. Identify the RDI database IP address (replace `<hostname>` with the hostname of your RDI database):
```
dig +short <hostname>

# Example:
# dig +short my.redis.hostname.com

# Example output:
54.78.220.161
```

2. For each IP address returned by the previous command, run the following command to block traffic to it:

```
sudo iptables -I FORWARD -d <database_ip> -j DROP

# With the IP address from the example above, the command is:
sudo iptables -I FORWARD -d 54.78.220.161 -j DROP
```
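
When the hostname resolves to several addresses, the per-IP step can be generated rather than typed by hand. The sketch below assumes `bash`; `print_iptables_rules` is a hypothetical helper name. It only prints the commands so that you can review them before running anything, and it skips lines that are not IPv4 addresses (for example, CNAME targets that `dig +short` may print).

```shell
# Hypothetical helper: reads one address per line on stdin and prints the
# matching iptables command for each, without running anything.
# $1 is the iptables action: -I to insert the DROP rule, -D to delete it.
print_iptables_rules() {
  local action="$1" ip
  while IFS= read -r ip; do
    # Skip blank lines and anything that is not a dotted IPv4 address.
    [[ "$ip" =~ ^[0-9]+(\.[0-9]+){3}$ ]] || continue
    printf 'sudo iptables %s FORWARD -d %s -j DROP\n' "$action" "$ip"
  done
}

# Usage (my.redis.hostname.com is the example hostname from above):
#   dig +short my.redis.hostname.com | print_iptables_rules -I
# Review the printed commands, then pipe them to `sh` to apply them.
```

Passing `-D` instead of `-I` prints the matching delete commands for use during cleanup.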


The leader lock is set to 60 seconds by default, so it may take up to 2 minutes for the failover to occur.
Meanwhile, you can follow the operator logs to watch the failover process:

```
# Replace the pod name with the name of your own rdi-operator pod.
kubectl -n rdi logs rdi-operator-7f7f6c7dfd-5qmjd -f
```

After about 10 seconds, the leader starts logging entries saying that it could not acquire the leadership.
When the leader lock expires, the secondary node acquires the leadership and its logs show that it has become the leader.
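
As an alternative to reading logs, you can poll the secondary node until it shows the leader indicator from the setup section. This is a sketch under two assumptions: that a `Running` `collector-source` pod appears on the secondary node once it takes over, and that `bash` is available. `wait_for_leadership` is a hypothetical helper name, and the 120-second default only mirrors the "up to 2 minutes" estimate above.

```shell
# Hypothetical helper: run on the secondary node. Polls until a
# collector-source pod is Running or the timeout (in seconds) passes.
# $1 = timeout (default 120), $2 = polling interval (default 5).
wait_for_leadership() {
  local timeout="${1:-120}" interval="${2:-5}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if kubectl -n rdi get pods --no-headers 2>/dev/null |
        awk '$1 ~ /^collector-source-/ && $3 == "Running" { found = 1 }
             END { exit !found }'; then
      echo "collector-source is running after about ${waited}s"
      return 0
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "no running collector-source pod after ${timeout}s" >&2
  return 1
}

# Usage on the secondary node:
#   wait_for_leadership 120 5
```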

## Cleanup

To clean up after the test, remove each `iptables` rule that you added to block the traffic:

```sudo iptables -D FORWARD -d <database_ip> -j DROP```

Use `sudo iptables -S | grep <database_ip>` to verify that the rule has been removed. The command prints nothing if no matching rule remains.
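
The verification can also be turned into a pass/fail check. The following sketch assumes `bash` and a `grep` that supports `-w` (whole-word matching); `rule_removed` is a hypothetical helper name. Whole-word matching stops a shorter address such as `54.78.220.16` from matching a rule for `54.78.220.161`.

```shell
# Hypothetical helper: reads `sudo iptables -S` output on stdin and
# succeeds only if no rule mentions the given IP address ($1).
rule_removed() {
  local ip="$1"
  # -F: treat the IP as a fixed string, -w: match it as a whole word.
  ! grep -qFw -- "$ip"
}

# Usage with the example IP address from above:
#   sudo iptables -S | rule_removed 54.78.220.161 && echo "rule removed"
```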

content/integrate/redis-data-integration/installation/install-vm.md

Lines changed: 2 additions & 0 deletions
@@ -269,6 +269,8 @@ to renew the lease in the RDI database, it will lose the leadership and a failov
will take place. After the failover, the secondary instance will become the primary one,
and the RDI pipeline will be active on that VM.

+You may find it useful to trigger a failover deliberately to check that RDI is correctly configured to handle it. See [Test HA failover]({{< relref "/integrate/redis-data-integration/installation/ha-test" >}}) to learn how to do this.
+
## Prepare your source database

Before deploying a pipeline, you must configure your source database to enable CDC. See the
