|
| 1 | +# Chaos Testing |
| 2 | + |
| 3 | +We offer Docker and Kubernetes boilerplates for testing the resilience of `NodeSet` and `Blockchain`. You can customize these boilerplates and integrate them into your pipeline. |
| 4 | + |
| 5 | + |
| 6 | +## Goals |
| 7 | + |
| 8 | +We recommend structuring your tests as a linear suite that applies various chaos experiments and verifies the outcomes using a [load](../../libs/wasp.md) testing suite. Focus on critical user metrics, such as: |
| 9 | + |
| 10 | +- The ratio of successful responses to failed responses |
| 11 | +- The nth percentile of response latency |
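A minimal Go sketch of how these two metrics might be computed from raw request results collected during an experiment. The `Result` type and both helper names are hypothetical and not part of the framework or of the load testing library. The sketch reports the share of successful responses among all responses (which avoids dividing by zero when nothing fails) and uses the nearest-rank method for the percentile.

```go
package main

import (
	"fmt"
	"math"
	"sort"
	"time"
)

// Result is a hypothetical record of one request made during an experiment.
type Result struct {
	Latency time.Duration
	OK      bool
}

// SuccessRatio returns the fraction of successful results, in [0, 1].
// An empty input yields 0.
func SuccessRatio(results []Result) float64 {
	if len(results) == 0 {
		return 0
	}
	ok := 0
	for _, r := range results {
		if r.OK {
			ok++
		}
	}
	return float64(ok) / float64(len(results))
}

// Percentile returns the p-th percentile (0 < p <= 100) of latency
// using the nearest-rank method. An empty input yields 0.
func Percentile(results []Result, p float64) time.Duration {
	if len(results) == 0 {
		return 0
	}
	lat := make([]time.Duration, len(results))
	for i, r := range results {
		lat[i] = r.Latency
	}
	sort.Slice(lat, func(i, j int) bool { return lat[i] < lat[j] })
	// Nearest rank: the smallest value with at least p% of samples at or below it.
	rank := int(math.Ceil(p / 100 * float64(len(lat))))
	if rank < 1 {
		rank = 1
	}
	if rank > len(lat) {
		rank = len(lat)
	}
	return lat[rank-1]
}

func main() {
	// Illustrative data only; real values would come from your load test.
	results := []Result{
		{Latency: 10 * time.Millisecond, OK: true},
		{Latency: 20 * time.Millisecond, OK: true},
		{Latency: 30 * time.Millisecond, OK: false},
		{Latency: 40 * time.Millisecond, OK: true},
	}
	fmt.Println("success ratio:", SuccessRatio(results))
	fmt.Println("p95 latency:", Percentile(results, 95))
}
```

Comparing these values between a baseline run and a run under chaos gives a simple pass/fail criterion for each experiment.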
| 12 | + |
| 13 | +Next, evaluate observability: |
| 14 | + |
| 15 | +- Ensure proper alerts are triggered during failures (manual or automated) |
| 16 | +- Verify the service recovers within the expected timeframe (manual or automated) |
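One way to automate the recovery check is to poll a health probe until it succeeds or a deadline passes. The sketch below is an assumption about how such a check could look. `WaitRecovered`, `ErrNotRecovered`, and the probe function are hypothetical names, not framework APIs.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrNotRecovered is returned when the probe stays unhealthy past the deadline.
var ErrNotRecovered = errors.New("service did not recover within the expected timeframe")

// WaitRecovered calls healthy every interval until it returns true or the
// timeout would be exceeded. It returns the time spent waiting.
func WaitRecovered(healthy func() bool, timeout, interval time.Duration) (time.Duration, error) {
	start := time.Now()
	deadline := start.Add(timeout)
	for {
		if healthy() {
			return time.Since(start), nil
		}
		// Stop if the next probe would land after the deadline.
		if time.Now().Add(interval).After(deadline) {
			return time.Since(start), ErrNotRecovered
		}
		time.Sleep(interval)
	}
}

func main() {
	calls := 0
	// Hypothetical probe: reports healthy on the third check.
	probe := func() bool {
		calls++
		return calls >= 3
	}
	took, err := WaitRecovered(probe, time.Second, 10*time.Millisecond)
	fmt.Println("recovered after:", took, "error:", err)
}
```

In a real test the probe would typically query the service or its metrics endpoint, and the timeout would match the recovery time your SLA allows.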
| 17 | + |
| 18 | +In summary, the **primary** focus is on meeting user expectations and maintaining SLAs, while the **secondary** focus is on observability and making operations smoother. |
| 19 | + |
| 20 | + |
| 21 | +## Docker |
| 22 | + |
| 23 | +For Docker, we utilize [Pumba](https://github.com/alexei-led/pumba) to conduct chaos experiments, including: |
| 24 | + |
| 25 | +- Container reboots |
| 26 | +- Network simulations (delays, packet loss, corruption, and similar faults, using the `tc` tool) |
| 27 | +- Stress testing for CPU and memory usage |
| 28 | + |
| 29 | +Additionally, we offer a [resources](../../framework/components/resources.md) API that allows you to test whether your software can operate effectively in low-resource environments. |
| 30 | + |
| 31 | +You can also use the [fake](../../framework/components/mocking.md) package to create HTTP chaos experiments. |
| 32 | + |
| 33 | +Given the complexity of `Kubernetes`, we recommend starting with `Docker`. Identifying faulty behavior in your services early, such as cascading latency, can prevent more severe issues when scaling up. Addressing these problems at a smaller scale can save significant time and effort later. |
| 34 | + |
| 35 | +Check the `NodeSet` + `Blockchain` template [here](). |
| 36 | + |
| 37 | +## Kubernetes |
| 38 | + |
| 39 | +We utilize a subset of [ChaosMesh](https://chaos-mesh.org/) experiments that can be safely executed on an isolated node group. These include: |
| 40 | + |
| 41 | +- [Pod faults](https://chaos-mesh.org/docs/simulate-pod-chaos-on-kubernetes/) |
| 42 | + |
| 43 | +- [Network faults](https://chaos-mesh.org/docs/simulate-network-chaos-on-kubernetes/) – We focus on delay and partition experiments, as others may impact pods outside the dedicated node group. |
| 44 | + |
| 45 | +- [HTTP faults](https://chaos-mesh.org/docs/simulate-http-chaos-on-kubernetes/) |
| 46 | + |
| 47 | +Check the `NodeSet` + `Blockchain` template [here](). |
| 48 | + |
| 49 | +## Blockchain |
| 50 | + |
| 51 | +We also offer a set of blockchain-specific experiments, which typically involve API calls to blockchain simulators to execute certain actions. These include: |
| 52 | + |
| 53 | +- Adjusting gas prices |
| 54 | + |
| 55 | +- Introducing chain reorganizations (setting a new head) |
| 56 | + |
| 57 | +- Utilizing developer APIs (e.g., Anvil) |
| 58 | + |
| 59 | +Check the [gas]() and [reorg]() examples. |
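To illustrate what "API calls to blockchain simulators" can look like, here is a minimal JSON-RPC sketch in Go. It is not the framework's own client: `BuildRequest` and `CallRPC` are hypothetical helpers, and the endpoint URL is an assumed local default. The method `anvil_setNextBlockBaseFeePerGas` is an Anvil developer API; `debug_setHead` is Geth's debug API and is assumed here to be the kind of call meant by "setting a new head".

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  []any  `json:"params"`
}

// rpcResponse holds the response fields this sketch inspects.
type rpcResponse struct {
	Result json.RawMessage `json:"result"`
	Error  *struct {
		Code    int    `json:"code"`
		Message string `json:"message"`
	} `json:"error"`
}

// BuildRequest encodes a JSON-RPC 2.0 request body for the given method.
func BuildRequest(method string, params ...any) ([]byte, error) {
	if params == nil {
		params = []any{} // encode as [] rather than null
	}
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: 1, Method: method, Params: params})
}

// CallRPC posts one JSON-RPC request to url and returns the raw result.
func CallRPC(url, method string, params ...any) (json.RawMessage, error) {
	body, err := BuildRequest(method, params...)
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var out rpcResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	if out.Error != nil {
		return nil, fmt.Errorf("rpc error %d: %s", out.Error.Code, out.Error.Message)
	}
	return out.Result, nil
}

func main() {
	// Assumed local simulator endpoint; adjust to your environment.
	url := "http://localhost:8545"

	// Gas experiment (Anvil): set the next block's base fee to 1 gwei (0x3b9aca00 wei).
	if _, err := CallRPC(url, "anvil_setNextBlockBaseFeePerGas", "0x3b9aca00"); err != nil {
		fmt.Println("gas adjustment failed:", err)
	}

	// Reorg experiment (Geth): rewind the chain head to block 0x10.
	if _, err := CallRPC(url, "debug_setHead", "0x10"); err != nil {
		fmt.Println("set head failed:", err)
	}
}
```

Each simulator exposes a different set of developer methods, so a given call only succeeds against a node that implements it.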