From 0e79c72715ca597e3cdc3d7afa8eb97a2f5d73be Mon Sep 17 00:00:00 2001 From: xiang Date: Thu, 7 Apr 2022 16:16:20 +0800 Subject: [PATCH 1/5] add network loss example Signed-off-by: xiang --- README.md | 2 +- network/loss.md | 102 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 103 insertions(+), 1 deletion(-) create mode 100644 network/loss.md diff --git a/README.md b/README.md index 2c3ce6e..6c84cb8 100644 --- a/README.md +++ b/README.md @@ -32,7 +32,7 @@ See [fotmat of record](./format.md) for details. - [Network Delay](./network/delay.md) - [Network Bandwidth](./network/bandwidth.md) - [Network Partition](./network/partition.md) -- Network Packet Loss +- [Network Packet Loss](./network/loss.md) - Network Packet Reorder - Network Packet Duplicate - Network Packet Corrupt diff --git a/network/loss.md b/network/loss.md new file mode 100644 index 0000000..0525755 --- /dev/null +++ b/network/loss.md @@ -0,0 +1,102 @@ +# Test Network Chaos on TiDB + +## Network Loss + +### Description + +Test the availability of TiDB cluster in network package loss scenarios. + +### Hypothesis + +When the network package loss occurs between TiKV nodes, the QPS/TPS of TiDB will drop significantly, but it can still provide services normally. + +### Preparation + +1. Install chaos-mesh. +2. A TiDB cluster on k8s with at least 3 TiKV Pods, deploy with [TiDB operator](https://docs.pingcap.com/tidb-in-kubernetes/stable/tidb-operator-overview). +3. Running payload on TiDB cluster. + +### Quick start + +1. Chaos Mesh fault YAML configuration: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: chaos-testing + name: network-loss +spec: + selector: + pods: + tidb-cluster: + - basic-tikv-0 + mode: all + action: loss + loss: + loss: '10' + correlation: '50' + direction: to + target: + selector: + pods: + tidb-cluster: + - basic-tikv-1 + - basic-tikv-2 + mode: all +``` + +Saving the YAML configuration above into file network-loss.yaml. 
+ +2. Using Kubectl to create the experiment: + +``` +kubectl create -f network-loss.yaml +``` + +3. Verifying TiDB's status: + + Check QPS & TPS of TiDB in Grafana. + + +4. Result: + +Judge whether the hypothesis is correct or not based on the results of the test process. + +### More example + +You can test more scenarios by using Chaos Mesh. For example: + +- Network package loss occurs between TiDB and TiKV. +- Network package loss occurs between TiKV and PD. +- Network package loss occurs between TiDB nodes. +- Network package loss occurs between PD nodes. + +All you need to do is adjust the `selector` and `target` in the YAML configuration. For injecting network package loss between TiDB and TiKV, the YAML configuration looks like below: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: chaos-testing + name: network-loss +spec: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tidb + mode: all + action: loss + loss: + loss: '10' + correlation: '50' + direction: to + target: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tikv + mode: all +``` From 5d66f37a46beef40ec10577176e7efffbc89f140 Mon Sep 17 00:00:00 2001 From: xiang Date: Thu, 7 Apr 2022 16:42:36 +0800 Subject: [PATCH 2/5] add network reorder example Signed-off-by: xiang --- README.md | 2 +- network/reorder.md | 114 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 115 insertions(+), 1 deletion(-) create mode 100644 network/reorder.md diff --git a/README.md b/README.md index 6c84cb8..ad06c8f 100644 --- a/README.md +++ b/README.md @@ -33,7 +33,7 @@ See [fotmat of record](./format.md) for details. 
- [Network Bandwidth](./network/bandwidth.md) - [Network Partition](./network/partition.md) - [Network Packet Loss](./network/loss.md) -- Network Packet Reorder +- [Network Packet Reorder](./network/reorder.md) - Network Packet Duplicate - Network Packet Corrupt diff --git a/network/reorder.md b/network/reorder.md new file mode 100644 index 0000000..54ba7d4 --- /dev/null +++ b/network/reorder.md @@ -0,0 +1,114 @@ +# Test Network Chaos on TiDB + +## Network Reorder + +### Description + +Test the availability of TiDB cluster in network package reorder scenarios. + +### Hypothesis + +When the network package reorder occurs between TiDB and TiKV nodes, the QPS/TPS of TiDB will drop significantly, but it can still provide services normally. + +### Preparation + +1. Install chaos-mesh. +2. A TiDB cluster on k8s with at least 3 TiKV Pods, deploy with [TiDB operator](https://docs.pingcap.com/tidb-in-kubernetes/stable/tidb-operator-overview). +3. Running payload on TiDB cluster. + +### Quick start + +1. Chaos Mesh fault YAML configuration: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: tidb-cluster + name: network-reorder +spec: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tidb + mode: all + action: delay + duration: 10m + delay: + latency: 50ms + correlation: '0' + jitter: 0ms + reorder: + reorder: 50 + correlation: 50 + gap: 5 + direction: both + target: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tikv + mode: all +``` + +Saving the YAML configuration above into file network-reorder.yaml. + +2. Using Kubectl to create the experiment: + +``` +kubectl create -f network-reorder.yaml +``` + +3. Verifying TiDB's status: + + Check QPS & TPS of TiDB in Grafana. + + +4. Result: + +Judge whether the hypothesis is correct or not based on the results of the test process. + +### More example + +You can test more scenarios by using Chaos Mesh. 
For example: + +- Network package reorder occurs between TiDB and TiKV. +- Network package reorder occurs between TiKV and PD. +- Network package reorder occurs between TiKV nodes. +- Network package reorder occurs between PD nodes. + +All you need to do is adjust the `selector` and `target` in the YAML configuration. For inject network package reorder between TiKV nodes, the YAML configuration looks like below: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: tidb-cluster + name: network-reorder +spec: + selector: + pods: + tidb-cluster: + - basic-tikv-0 + mode: all + action: delay + duration: 10m + delay: + latency: 50ms + correlation: '0' + jitter: 0ms + reorder: + reorder: 50 + correlation: 50 + gap: 5 + direction: both + target: + selector: + pods: + tidb-cluster: + - basic-tikv-1 + - basic-tikv-2 + mode: all +``` From ea5ea8e6833531936b73dcd3e1e6a54595f195c6 Mon Sep 17 00:00:00 2001 From: xiang Date: Thu, 7 Apr 2022 16:52:36 +0800 Subject: [PATCH 3/5] add network duplicate example Signed-off-by: xiang --- README.md | 2 +- network/duplicate.md | 102 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 103 insertions(+), 1 deletion(-) create mode 100644 network/duplicate.md diff --git a/README.md b/README.md index ad06c8f..6ce1dbd 100644 --- a/README.md +++ b/README.md @@ -34,7 +34,7 @@ See [fotmat of record](./format.md) for details. - [Network Partition](./network/partition.md) - [Network Packet Loss](./network/loss.md) - [Network Packet Reorder](./network/reorder.md) -- Network Packet Duplicate +- [Network Packet Duplicate](./network/duplicate.md) - Network Packet Corrupt ### IO diff --git a/network/duplicate.md b/network/duplicate.md new file mode 100644 index 0000000..05ed60f --- /dev/null +++ b/network/duplicate.md @@ -0,0 +1,102 @@ +# Test Network Chaos on TiDB + +## Network Duplicate + +### Description + +Test the availability of TiDB cluster in network package duplicate scenarios. 
+ +### Hypothesis + +When the network package duplicate occurs between TiDB and TiKV nodes, the QPS/TPS of TiDB will drop significantly, but it can still provide services normally. + +### Preparation + +1. Install chaos-mesh. +2. A TiDB cluster on k8s with at least 3 TiKV Pods, deploy with [TiDB operator](https://docs.pingcap.com/tidb-in-kubernetes/stable/tidb-operator-overview). +3. Running payload on TiDB cluster. + +### Quick start + +1. Chaos Mesh fault YAML configuration: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: tidb-cluster + name: network-duplicate +spec: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tidb + mode: all + action: duplicate + duplicate: + duplicate: '50' + correlation: '50' + direction: both + target: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tikv + mode: all +``` + +Saving the YAML configuration above into file network-duplicate.yaml. + +2. Using Kubectl to create the experiment: + +``` +kubectl create -f network-duplicate.yaml +``` + +3. Verifying TiDB's status: + + Check QPS & TPS of TiDB in Grafana. + + +4. Result: + +Judge whether the hypothesis is correct or not based on the results of the test process. + +### More example + +You can test more scenarios by using Chaos Mesh. For example: + +- Network package duplicate occurs between TiDB and TiKV. +- Network package duplicate occurs between TiKV and PD. +- Network package duplicate occurs between TiKV nodes. +- Network package duplicate occurs between PD nodes. + +All you need to do is adjust the `selector` and `target` in the YAML configuration. 
For inject network package duplicate between TiKV nodes, the YAML configuration looks like below: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: tidb-cluster + name: network-duplicate +spec: + selector: + pods: + tidb-cluster: + - basic-tikv-0 + mode: all + action: duplicate + duplicate: + duplicate: '50' + correlation: '50' + direction: both + target: + selector: + pods: + tidb-cluster: + - basic-tikv-1 + - basic-tikv-2 + mode: all +``` From 69a5967e53783ec23c3cadcff714498f743b49d5 Mon Sep 17 00:00:00 2001 From: xiang Date: Thu, 7 Apr 2022 17:25:34 +0800 Subject: [PATCH 4/5] add network reorder example Signed-off-by: xiang --- README.md | 2 +- network/bandwidth.md | 4 +- network/corrupt.md | 102 +++++++++++++++++++++++++++++++++++++++++++ network/delay.md | 4 +- network/duplicate.md | 4 +- network/partition.md | 6 +-- network/reorder.md | 4 +- 7 files changed, 114 insertions(+), 12 deletions(-) create mode 100644 network/corrupt.md diff --git a/README.md b/README.md index 6ce1dbd..4a4da38 100644 --- a/README.md +++ b/README.md @@ -35,7 +35,7 @@ See [fotmat of record](./format.md) for details. - [Network Packet Loss](./network/loss.md) - [Network Packet Reorder](./network/reorder.md) - [Network Packet Duplicate](./network/duplicate.md) -- Network Packet Corrupt +- [Network Packet Corrupt](./network/corrupt.md) ### IO diff --git a/network/bandwidth.md b/network/bandwidth.md index 1facb34..b23bc96 100644 --- a/network/bandwidth.md +++ b/network/bandwidth.md @@ -4,7 +4,7 @@ ### Description -Test the availability of TiDB cluster in network bandwidth scenarios. +Test the availability of the TiDB cluster in network bandwidth scenarios. ### Hypothesis @@ -75,7 +75,7 @@ You can test more scenarios by using Chaos Mesh. For example: - Limiting the network bandwidth between TiDB nodes. - Limiting the network bandwidth between TiKV nodes. -All you need to do is adjust the `selector` and `target` in the YAML configuration. 
For limiting the network bandwidth between TiKV nodes, the YAML configuration looks like below: +All you need to do is adjust the `selector` and `target` in the YAML configuration. For limiting the network bandwidth between TiKV nodes, the YAML configuration looks like the below: ```YAML kind: NetworkChaos diff --git a/network/corrupt.md b/network/corrupt.md new file mode 100644 index 0000000..4f6d07b --- /dev/null +++ b/network/corrupt.md @@ -0,0 +1,102 @@ +# Test Network Chaos on TiDB + +## Network Corrupt + +### Description + +Test the availability of the TiDB cluster in network package corrupt scenarios. + +### Hypothesis + +When the network package corruption occurs between TiDB and TiKV nodes, the QPS/TPS of TiDB will drop significantly, but it can still provide services normally. + +### Preparation + +1. Install chaos-mesh. +2. A TiDB cluster on k8s with at least 3 TiKV Pods, deploy with [TiDB operator](https://docs.pingcap.com/tidb-in-kubernetes/stable/tidb-operator-overview). +3. Running payload on TiDB cluster. + +### Quick start + +1. Chaos Mesh fault YAML configuration: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: tidb-cluster + name: network-corrupt +spec: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tidb + mode: all + action: corrupt + corrupt: + corrupt: '50' + correlation: '50' + direction: both + target: + selector: + namespaces: + - tidb-cluster + labelSelectors: + app.kubernetes.io/component: tikv + mode: all +``` + +Saving the YAML configuration above into file network-corrupt.yaml. + +2. Using Kubectl to create the experiment: + +``` +kubectl create -f network-corrupt.yaml +``` + +3. Verifying TiDB's status: + + Check QPS & TPS of TiDB in Grafana. + + +4. Result: + +Judge whether the hypothesis is correct or not based on the results of the test process. + +### More example + +You can test more scenarios by using Chaos Mesh. 
For example: + +- Network package corruption occurs between TiDB and TiKV. +- Network package corruption occurs between TiKV and PD. +- Network package corruption occurs between TiKV nodes. +- Network package corruption occurs between PD nodes. + +All you need to do is adjust the `selector` and `target` in the YAML configuration. To inject network package corrupt between TiKV nodes, the YAML configuration looks like the below: + +```YAML +kind: NetworkChaos +apiVersion: chaos-mesh.org/v1alpha1 +metadata: + namespace: tidb-cluster + name: network-corrupt +spec: + selector: + pods: + tidb-cluster: + - basic-tikv-0 + mode: all + action: corrupt + corrupt: + corrupt: '50' + correlation: '50' + direction: both + target: + selector: + pods: + tidb-cluster: + - basic-tikv-1 + - basic-tikv-2 + mode: all +``` diff --git a/network/delay.md b/network/delay.md index 992b873..d40a83c 100644 --- a/network/delay.md +++ b/network/delay.md @@ -4,7 +4,7 @@ ### Description -Test the availability of TiDB cluster in network delay scenarios. +Test the availability of the TiDB cluster in network delay scenarios. ### Hypothesis @@ -75,7 +75,7 @@ You can test more scenarios by using Chaos Mesh. For example: - Increasing network delay between TiKV nodes. - Increasing network delay between PD nodes. -All you need to do is adjust the `selector` and `target` in the YAML configuration. For increasing network delay between TiKV nodes, the YAML configuration looks like below: +All you need to do is adjust the `selector` and `target` in the YAML configuration. For increasing network delay between TiKV nodes, the YAML configuration looks like the below: ```YAML kind: NetworkChaos diff --git a/network/duplicate.md b/network/duplicate.md index 05ed60f..4a66e7b 100644 --- a/network/duplicate.md +++ b/network/duplicate.md @@ -4,7 +4,7 @@ ### Description -Test the availability of TiDB cluster in network package duplicate scenarios. 
+Test the availability of the TiDB cluster in network package duplicate scenarios. ### Hypothesis @@ -73,7 +73,7 @@ You can test more scenarios by using Chaos Mesh. For example: - Network package duplicate occurs between TiKV nodes. - Network package duplicate occurs between PD nodes. -All you need to do is adjust the `selector` and `target` in the YAML configuration. For inject network package duplicate between TiKV nodes, the YAML configuration looks like below: +All you need to do is adjust the `selector` and `target` in the YAML configuration. To inject network package duplicate between TiKV nodes, the YAML configuration looks like the below: ```YAML kind: NetworkChaos diff --git a/network/partition.md b/network/partition.md index 18139f5..dee9e46 100644 --- a/network/partition.md +++ b/network/partition.md @@ -4,11 +4,11 @@ ### Description -Test the availability of TiDB cluster in network partition scenarios. +Test the availability of the TiDB cluster in network partition scenarios. ### Hypothesis -When a network partition occurs between a TiKV node and other TiKV nodes in the cluster, the QPS/TPS drop significantly, and then recover to normal levels, and the data is consistent. +When a network partition occurs between a TiKV node and other TiKV nodes in the cluster, the QPS/TPS drop significantly and then recover to normal levels, and the data is consistent. ### Preparation @@ -70,7 +70,7 @@ You can test more scenarios by using Chaos Mesh. For example: - Network partition occurs between TiDB and TiKV. - Network partition occurs between PD and TiKV. -All you need to do is adjust the `selector` and `target` in the YAML configuration. For Network partition occurs between TiDB and TiKV, the YAML configuration looks like below: +All you need to do is adjust the `selector` and `target` in the YAML configuration. 
For network partition that occurs between TiDB and TiKV, the YAML configuration looks like the below: ```YAML kind: NetworkChaos diff --git a/network/reorder.md b/network/reorder.md index 54ba7d4..c9cbf59 100644 --- a/network/reorder.md +++ b/network/reorder.md @@ -4,7 +4,7 @@ ### Description -Test the availability of TiDB cluster in network package reorder scenarios. +Test the availability of the TiDB cluster in network package reorder scenarios. ### Hypothesis @@ -79,7 +79,7 @@ You can test more scenarios by using Chaos Mesh. For example: - Network package reorder occurs between TiKV nodes. - Network package reorder occurs between PD nodes. -All you need to do is adjust the `selector` and `target` in the YAML configuration. For inject network package reorder between TiKV nodes, the YAML configuration looks like below: +All you need to do is adjust the `selector` and `target` in the YAML configuration. For inject network package reorder between TiKV nodes, the YAML configuration looks like the below: ```YAML kind: NetworkChaos From 3d0e43deac880eb4b0109be4160f31b1ceae2ae9 Mon Sep 17 00:00:00 2001 From: xiang Date: Fri, 22 Apr 2022 10:02:32 +0800 Subject: [PATCH 5/5] address comment Signed-off-by: xiang --- network/loss.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/network/loss.md b/network/loss.md index 0525755..728435a 100644 --- a/network/loss.md +++ b/network/loss.md @@ -4,7 +4,7 @@ ### Description -Test the availability of TiDB cluster in network package loss scenarios. +Test the availability of the TiDB cluster in network package loss scenarios. ### Hypothesis
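
The `network/loss.md` added in PATCH 1/5 lists packet loss between TiKV and PD under "More example" but does not include a manifest for it. A minimal sketch, reusing the fields of the TiDB-to-TiKV example and changing only `selector` and `target`, might look like the one below. The experiment name `network-loss-tikv-pd` is hypothetical, and the `pd` label value is an assumption that PD Pods carry the same `app.kubernetes.io/component` label the patches use for `tidb` and `tikv`; verify both against the actual cluster before relying on it.

```YAML
kind: NetworkChaos
apiVersion: chaos-mesh.org/v1alpha1
metadata:
  namespace: tidb-cluster
  # Hypothetical experiment name, not taken from the patches.
  name: network-loss-tikv-pd
spec:
  # Source side of the fault: all TiKV Pods in the cluster namespace.
  selector:
    namespaces:
      - tidb-cluster
    labelSelectors:
      app.kubernetes.io/component: tikv
  mode: all
  action: loss
  loss:
    # Same loss parameters as the examples in network/loss.md.
    loss: '10'
    correlation: '50'
  direction: to
  # Target side of the fault: all PD Pods.
  target:
    selector:
      namespaces:
        - tidb-cluster
      labelSelectors:
        # Assumed label value for PD Pods; confirm with `kubectl get pods --show-labels`.
        app.kubernetes.io/component: pd
    mode: all
```

Saved to a file, it would be created with `kubectl create -f` in the same way as the other experiments, and the effect checked against the QPS and TPS panels in Grafana.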