|
| 1 | +--- |
| 2 | +title: Create an experiment that uses an AKS Chaos Mesh fault using Azure Chaos Studio with the Azure CLI |
| 3 | +description: Create an experiment that uses an AKS Chaos Mesh fault with the Azure CLI |
| 4 | +author: johnkemnetz |
| 5 | +ms.topic: how-to |
| 6 | +ms.date: 11/11/2021 |
| 7 | +ms.author: johnkem |
| 8 | +ms.service: chaos-studio |
| 9 | +ms.custom: template-how-to, ignite-fall-2021 |
| 10 | +--- |
| 11 | + |
| 12 | +# Create a chaos experiment that uses a Chaos Mesh fault with the Azure CLI |
| 13 | + |
| 14 | +You can use a chaos experiment to verify that your application is resilient to failures by causing those failures in a controlled environment. In this guide, you will cause periodic Azure Kubernetes Service pod failures on a namespace using a chaos experiment and Azure Chaos Studio. Running this experiment can help you defend against service unavailability when there are sporadic failures. |
| 15 | + |
| 16 | +Azure Chaos Studio uses [Chaos Mesh](https://chaos-mesh.org/), a free, open-source chaos engineering platform for Kubernetes, to inject faults into an AKS cluster. Chaos Mesh faults are [service-direct](chaos-studio-tutorial-aks-portal.md) faults that require Chaos Mesh to be installed on the AKS cluster. You can use these same steps to set up and run an experiment for any AKS Chaos Mesh fault. |
| 17 | + |
| 18 | +## Prerequisites |
| 19 | + |
| 20 | +- An Azure subscription. [!INCLUDE [quickstarts-free-trial-note](../../includes/quickstarts-free-trial-note.md)] |
| 21 | +- An AKS cluster. If you do not have an AKS cluster, you can [follow these steps to create one](../aks/kubernetes-walkthrough-portal.md). |
| 22 | + |
| 23 | +## Launch Azure Cloud Shell |
| 24 | + |
| 25 | +The Azure Cloud Shell is a free interactive shell that you can use to run the steps in this article. It has common Azure tools preinstalled and configured to use with your account. |
| 26 | + |
| 27 | +To open Cloud Shell, select **Try it** from the upper-right corner of a code block. You can also open Cloud Shell in a separate browser tab by going to [https://shell.azure.com/bash](https://shell.azure.com/bash). Select **Copy** to copy a block of code, paste it into Cloud Shell, and press **Enter** to run it. |
| 28 | + |
| 29 | +If you prefer to install and use the CLI locally, this tutorial requires Azure CLI version 2.0.30 or later. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI](/cli/azure/install-azure-cli). |
| 30 | + |
| 31 | +## Set up Chaos Mesh on your AKS cluster |
| 32 | + |
| 33 | +Before you can run Chaos Mesh faults in Chaos Studio, you need to install Chaos Mesh on your AKS cluster. |
| 34 | + |
| 35 | +1. Run the following commands in an [Azure Cloud Shell](../cloud-shell/overview.md) window where you have the active subscription set to be the subscription where your AKS cluster is deployed. Replace `$RESOURCE_GROUP` and `$CLUSTER_NAME` with the resource group and name of your cluster resource. |
| 36 | + |
| 37 | + ```azurecli-interactive |
| 38 | + az aks get-credentials -g $RESOURCE_GROUP -n $CLUSTER_NAME |
| 39 | + helm repo add chaos-mesh https://charts.chaos-mesh.org |
| 40 | + helm repo update |
| 41 | + kubectl create ns chaos-testing |
| 42 | + helm install chaos-mesh chaos-mesh/chaos-mesh --namespace=chaos-testing --version 2.0.3 --set chaosDaemon.runtime=containerd --set chaosDaemon.socketPath=/run/containerd/containerd.sock |
| 43 | + ``` |
| 44 | + |
| 45 | +2. Verify that the Chaos Mesh pods are installed by running the following command: |
| 46 | + |
| 47 | + ```azurecli-interactive |
| 48 | + kubectl get po -n chaos-testing |
| 49 | + ``` |
| 50 | + |
| 51 | +You should see output similar to the following (a chaos-controller-manager and one or more chaos-daemons): |
| 52 | + |
| 53 | +```bash |
| 54 | +NAME READY STATUS RESTARTS AGE |
| 55 | +chaos-controller-manager-69fd5c46c8-xlqpc 1/1 Running 0 2d5h |
| 56 | +chaos-daemon-jb8xh 1/1 Running 0 2d5h |
| 57 | +chaos-dashboard-98c4c5f97-tx5ds 1/1 Running 0 2d5h |
| 58 | +``` |
| 59 | + |
| 60 | +You can also [use the installation instructions on the Chaos Mesh website](https://chaos-mesh.org/docs/production-installation-using-helm/). |
| 61 | + |
| 62 | + |
| 63 | +## Enable Chaos Studio on your AKS cluster |
| 64 | + |
| 65 | +Chaos Studio cannot inject faults against a resource unless that resource has been onboarded to Chaos Studio first. You onboard a resource to Chaos Studio by creating a [target and capabilities](chaos-studio-targets-capabilities.md) on the resource. AKS clusters have only one target type (service-direct), but other resources may have up to two target types: one for service-direct faults and one for agent-based faults. Each type of Chaos Mesh fault is represented as a capability (PodChaos, NetworkChaos, IOChaos, etc.). |
| 66 | + |
| 67 | +1. Create a target by replacing `$RESOURCE_ID` with the resource ID of the AKS cluster you are onboarding: |
| 68 | + |
| 69 | + ```azurecli-interactive |
| 70 | + az rest --method put --url "https://management.azure.com/$RESOURCE_ID/providers/Microsoft.Chaos/targets/Microsoft-AzureKubernetesServiceChaosMesh?api-version=2021-09-15-preview" --body "{\"properties\":{}}" |
| 71 | + ``` |
| 72 | +
|
| 73 | +2. Create the capabilities on the target by replacing `$RESOURCE_ID` with the resource ID of the AKS cluster you are onboarding and `$CAPABILITY` with the [name of the fault capability you are enabling](chaos-studio-fault-library.md). |
| 74 | + |
| 75 | + ```azurecli-interactive |
| 76 | + az rest --method put --url "https://management.azure.com/$RESOURCE_ID/providers/Microsoft.Chaos/targets/Microsoft-AzureKubernetesServiceChaosMesh/capabilities/$CAPABILITY?api-version=2021-09-15-preview" --body "{\"properties\":{}}" |
| 77 | + ``` |
| 78 | +
|
| 79 | + For example, if enabling the PodChaos capability: |
| 80 | +
|
| 81 | + ```azurecli-interactive |
| 82 | + az rest --method put --url "https://management.azure.com/subscriptions/b65f2fec-d6b2-4edd-817e-9339d8c01dc4/resourceGroups/myRG/providers/Microsoft.ContainerService/managedClusters/myCluster/providers/Microsoft.Chaos/targets/Microsoft-AzureKubernetesServiceChaosMesh/capabilities/PodChaos-2.1?api-version=2021-09-15-preview" --body "{\"properties\":{}}" |
| 83 | + ``` |
| 84 | +
|
| 85 | + This must be done for each capability you want to enable on the cluster. |
| 86 | +
|
| 87 | +You have now successfully onboarded your AKS cluster to Chaos Studio. |
| 88 | +
|
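Because the capability request has to be repeated for each capability, it can help to generate the URLs in a small loop. The following is a minimal sketch, not part of the official procedure: `capability_url` and `enable_capabilities` are hypothetical helper names, and any capability name other than `PodChaos-2.1` that you pass in is an assumption you should check against the fault library. The helper also drops a leading slash from the resource ID, because resource IDs returned by the Azure CLI start with `/`.

```shell
# Hypothetical helper: build the capability URL used in step 2.
# $1 = AKS cluster resource ID, $2 = capability name (for example PodChaos-2.1).
capability_url() {
  # Drop one leading slash so the URL does not contain "//" after the host name.
  printf 'https://management.azure.com/%s/providers/Microsoft.Chaos/targets/Microsoft-AzureKubernetesServiceChaosMesh/capabilities/%s?api-version=2021-09-15-preview\n' "${1#/}" "$2"
}

# Hypothetical helper: enable every capability passed after the resource ID.
enable_capabilities() {
  resource_id="$1"
  shift
  for capability in "$@"; do
    az rest --method put \
      --url "$(capability_url "$resource_id" "$capability")" \
      --body '{"properties":{}}'
  done
}

# Example call (commented out because it changes your cluster's configuration):
# enable_capabilities "$RESOURCE_ID" PodChaos-2.1
```

Keeping the URL construction in its own function makes it easy to print and inspect each URL before sending any request.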
| 89 | +## Create an experiment |
| 90 | +With your AKS cluster now onboarded, you can create your experiment. A chaos experiment defines the actions you want to take against target resources, organized into steps, which run sequentially, and branches, which run in parallel. |
| 91 | +
|
| 92 | +1. Create a Chaos Mesh jsonSpec: |
| 93 | + 1. Visit the Chaos Mesh documentation for a fault type, [for example, the PodChaos type](https://chaos-mesh.org/docs/simulate-pod-chaos-on-kubernetes/#create-experiments-using-yaml-configuration-files). |
| 94 | + 2. Formulate the YAML configuration for that fault type using the Chaos Mesh documentation. |
| 95 | +
|
| 96 | + ```yaml |
| 97 | + apiVersion: chaos-mesh.org/v1alpha1 |
| 98 | + kind: PodChaos |
| 99 | + metadata: |
| 100 | + name: pod-failure-example |
| 101 | + namespace: chaos-testing |
| 102 | + spec: |
| 103 | + action: pod-failure |
| 104 | + mode: all |
| 105 | + duration: '600s' |
| 106 | + selector: |
| 107 | + namespaces: |
| 108 | + - default |
| 109 | + ``` |
| 110 | + 3. Remove any YAML outside of the `spec` (including the spec property name), and remove the indentation of the spec details. |
| 111 | +
|
| 112 | + ```yaml |
| 113 | + action: pod-failure |
| 114 | + mode: all |
| 115 | + duration: '600s' |
| 116 | + selector: |
| 117 | + namespaces: |
| 118 | + - default |
| 119 | + ``` |
| 120 | + 4. Use a [YAML-to-JSON converter like this one](https://www.convertjson.com/yaml-to-json.htm) to convert the Chaos Mesh YAML to JSON and minimize it. |
| 121 | +
|
| 122 | + ```json |
| 123 | + {"action":"pod-failure","mode":"all","duration":"600s","selector":{"namespaces":["default"]}} |
| 124 | + ``` |
| 125 | + 5. Use a [JSON string escape tool like this one](https://www.freeformatter.com/json-escape.html) to escape the JSON spec. |
| 126 | + |
| 127 | + ```json |
| 128 | + {\"action\":\"pod-failure\",\"mode\":\"all\",\"duration\":\"600s\",\"selector\":{\"namespaces\":[\"default\"]}} |
| 129 | + ``` |
| 130 | +
|
| 131 | +2. Create your experiment JSON starting with the JSON sample below. Modify the JSON to correspond to the experiment you want to run using the [Create Experiment API](/rest/api/chaosstudio/experiments/create-or-update), the [fault library](chaos-studio-fault-library.md), and the jsonSpec created in the previous step. |
| 132 | +
|
| 133 | + ```json |
| 134 | + { |
| 135 | + "location": "centralus", |
| 136 | + "identity": { |
| 137 | + "type": "SystemAssigned" |
| 138 | + }, |
| 139 | + "properties": { |
| 140 | + "steps": [ |
| 141 | + { |
| 142 | + "name": "AKS pod kill", |
| 143 | + "branches": [ |
| 144 | + { |
| 145 | + "name": "AKS pod kill", |
| 146 | + "actions": [ |
| 147 | + { |
| 148 | + "type": "continuous", |
| 149 | + "selectorId": "Selector1", |
| 150 | + "duration": "PT10M", |
| 151 | + "parameters": [ |
| 152 | + { |
| 153 | + "key": "jsonSpec", |
| 154 | + "value": "{\"action\":\"pod-failure\",\"mode\":\"all\",\"duration\":\"600s\",\"selector\":{\"namespaces\":[\"default\"]}}" |
| 155 | + } |
| 156 | + ], |
| 157 | + "name": "urn:csci:microsoft:azureKubernetesServiceChaosMesh:podChaos/2.1" |
| 158 | + } |
| 159 | + ] |
| 160 | + } |
| 161 | + ] |
| 162 | + } |
| 163 | + ], |
| 164 | + "selectors": [ |
| 165 | + { |
| 166 | + "id": "Selector1", |
| 167 | + "type": "List", |
| 168 | + "targets": [ |
| 169 | + { |
| 170 | + "type": "ChaosTarget", |
| 171 | + "id": "/subscriptions/b65f2fec-d6b2-4edd-817e-9339d8c01dc4/resourceGroups/myRG/providers/Microsoft.ContainerService/managedClusters/myCluster/providers/Microsoft.Chaos/targets/Microsoft-AzureKubernetesServiceChaosMesh" |
| 172 | + } |
| 173 | + ] |
| 174 | + } |
| 175 | + ] |
| 176 | + } |
| 177 | + } |
| 178 | + ``` |
| 179 | + |
| 180 | +3. Create the experiment using the Azure CLI, replacing `$SUBSCRIPTION_ID`, `$RESOURCE_GROUP`, and `$EXPERIMENT_NAME` with the properties for your experiment. Make sure you have saved and uploaded your experiment JSON, and replace `experiment.json` with your JSON filename. |
| 181 | +
|
| 182 | + ```azurecli-interactive |
| 183 | + az rest --method put --uri "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Chaos/experiments/$EXPERIMENT_NAME?api-version=2021-09-15-preview" --body @experiment.json |
| 184 | + ``` |
| 185 | +
|
| 186 | + Each experiment creates a corresponding system-assigned managed identity. Take note of the `principalId` for this identity in the response; you need it in the next step. |
| 187 | +
|
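If you would rather not paste the spec into online tools, the escaping in step 1 can also be done locally. The sketch below assumes the minified JSON is on a single line and contains no control characters, so only backslashes and double quotes need escaping. `escape_json` is a hypothetical helper name, not an Azure CLI or Chaos Mesh command.

```shell
# Hypothetical helper: escape a single-line JSON document so it can be
# embedded as a string value (the jsonSpec parameter) in the experiment JSON.
escape_json() {
  # Escape backslashes first, then double quotes.
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# The minified PodChaos spec from step 1.
spec='{"action":"pod-failure","mode":"all","duration":"600s","selector":{"namespaces":["default"]}}'
escape_json "$spec"
```

Paste the printed string between the double quotes of the `value` field for the `jsonSpec` parameter.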
| 188 | +## Give experiment permission to your AKS cluster |
| 189 | +When you create a chaos experiment, Chaos Studio creates a system-assigned managed identity that executes faults against your target resources. This identity must be given [appropriate permissions](chaos-studio-fault-providers.md) to the target resource for the experiment to run successfully. |
| 190 | +
|
| 191 | +Give the experiment access to your resource(s) using the command below, replacing `$EXPERIMENT_PRINCIPAL_ID` with the principalId from the previous step and `$RESOURCE_ID` with the resource ID of the target resource (in this case, the AKS cluster resource ID). Run this command for each resource targeted in your experiment. |
| 192 | +
|
| 193 | +```azurecli-interactive |
| 194 | +az role assignment create --role "Azure Kubernetes Cluster User Role" --assignee-object-id $EXPERIMENT_PRINCIPAL_ID --scope $RESOURCE_ID |
| 195 | +``` |
| 196 | + |
| 197 | +## Run your experiment |
| 198 | +You are now ready to run your experiment. To see the impact, we recommend opening your AKS cluster overview and going to **Insights** in a separate browser tab. Live data for the **Active Pod Count** will show the impact of running your experiment. |
| 199 | + |
| 200 | +1. Start the experiment using the Azure CLI, replacing `$SUBSCRIPTION_ID`, `$RESOURCE_GROUP`, and `$EXPERIMENT_NAME` with the properties for your experiment. |
| 201 | + |
| 202 | + ```azurecli-interactive |
| 203 | + az rest --method post --uri "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.Chaos/experiments/$EXPERIMENT_NAME/start?api-version=2021-09-15-preview" |
| 204 | + ``` |
| 205 | +
|
| 206 | +2. The response includes a status URL that you can use to query experiment status as the experiment runs. |
| 207 | +
|
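The create and start requests share the same experiment URL, so building it once in a quoted helper avoids shell surprises with the `?` character. This is a sketch under assumptions: `experiment_url` and `start_experiment` are hypothetical names, and the `statusUrl` field used in the query is an assumed name for the status URL in the response. Check the actual response body before relying on it.

```shell
# Hypothetical helper: build the experiment URL, with an optional action suffix.
# $1 = subscription ID, $2 = resource group, $3 = experiment name,
# $4 = action (optional, for example "start").
experiment_url() {
  printf 'https://management.azure.com/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Chaos/experiments/%s%s?api-version=2021-09-15-preview\n' \
    "$1" "$2" "$3" "${4:+/$4}"
}

# Hypothetical helper: start the experiment and print the status URL.
# Assumes the response JSON exposes the status URL as "statusUrl".
start_experiment() {
  az rest --method post \
    --uri "$(experiment_url "$1" "$2" "$3" start)" \
    --query statusUrl --output tsv
}

# Example call (commented out because it starts injecting faults):
# start_experiment "$SUBSCRIPTION_ID" "$RESOURCE_GROUP" "$EXPERIMENT_NAME"
```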
| 208 | +## Next steps |
| 209 | +Now that you have run an AKS Chaos Mesh service-direct experiment, you are ready to: |
| 210 | +- [Create an experiment that uses agent-based faults](chaos-studio-tutorial-agent-based-portal.md) |
| 211 | +- [Manage your experiment](chaos-studio-run-experiment.md) |