| 1 | +# Adding a New Scenario Test (CI/tests_v2) |
| 2 | + |
| 3 | +This guide explains how to add a new chaos scenario test to the v2 pytest framework. The layout is **folder-per-scenario**: each scenario has its own directory under `scenarios/<scenario_name>/` containing the test file, Kubernetes resources, and the Krkn scenario base YAML. |
| 4 | + |
| 5 | +## Option 1: Scaffold script (recommended) |
| 6 | + |
| 7 | +From the **repository root**: |
| 8 | + |
| 9 | +```bash |
| 10 | +python CI/tests_v2/scaffold.py --scenario service_hijacking |
| 11 | +``` |
| 12 | + |
| 13 | +This creates: |
| 14 | + |
| 15 | +- `CI/tests_v2/scenarios/service_hijacking/test_service_hijacking.py` — A test class extending `BaseScenarioTest` with a stub `test_happy_path` and `WORKLOAD_MANIFEST` pointing to the folder’s `resource.yaml`. |
| 16 | +- `CI/tests_v2/scenarios/service_hijacking/resource.yaml` — A placeholder Deployment (namespace is patched at deploy time). |
| 17 | +- `CI/tests_v2/scenarios/service_hijacking/scenario_base.yaml` — A placeholder Krkn scenario; edit this with the structure expected by your scenario type. |
| 18 | + |
| 19 | +The script automatically registers the marker in `CI/tests_v2/pytest.ini`. For example, it adds: |
| 20 | + |
| 21 | +``` |
| 22 | +service_hijacking: marks a test as a service_hijacking scenario test |
| 23 | +``` |
| 24 | + |
| 25 | +**Next steps after scaffolding:** |
| 26 | + |
| 27 | +1. Verify the marker was added to `pytest.ini` (the scaffold does this automatically). |
| 28 | +2. Edit `scenario_base.yaml` with the structure your Krkn scenario type expects (see `scenarios/application_outage/scenario_base.yaml` and `scenarios/pod_disruption/scenario_base.yaml` for examples). The top-level key should match `SCENARIO_NAME`. |
| 29 | +3. If your scenario uses a **list** structure (like pod_disruption) instead of a **dict** with a top-level key, set `NAMESPACE_KEY_PATH` (e.g. `[0, "config", "namespace_pattern"]`) and `NAMESPACE_IS_REGEX = True` if the namespace is a regex pattern. |
| 30 | +4. The generated `test_happy_path` already uses `self.run_scenario(self.tmp_path, ns)` and assertions. Add more test methods (e.g. negative tests with `@pytest.mark.no_workload`) as needed. |
| 31 | +5. Adjust `resource.yaml` if your scenario needs a different workload (e.g. specific image or labels). |
| 32 | + |
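The two key-path shapes in step 3 can be illustrated with a small standalone sketch. The helpers `set_by_key_path` and `namespace_value` are hypothetical names made up for this example; they are not the framework's `load_and_patch_scenario`, and anchoring the regex with `^...$` is an assumption about how a per-test namespace might be matched exactly.

```python
import re
from typing import Any, Sequence, Union

Key = Union[str, int]


def set_by_key_path(doc: Any, key_path: Sequence[Key], value: Any) -> None:
    """Walk `doc` along `key_path` and overwrite the final element in place."""
    if not key_path:
        raise ValueError("key_path must not be empty")
    node = doc
    for key in key_path[:-1]:
        node = node[key]  # same syntax for dict keys and list indices
    node[key_path[-1]] = value


def namespace_value(ns: str, is_regex: bool) -> str:
    """Return the namespace as-is, or as an anchored regex matching only it (assumed behavior)."""
    return f"^{re.escape(ns)}$" if is_regex else ns


# Dict-based scenario: top-level key, plain namespace string.
dict_scenario = {"application_outage": {"namespace": "placeholder"}}
set_by_key_path(
    dict_scenario,
    ["application_outage", "namespace"],
    namespace_value("krkn-test-abc123", is_regex=False),
)

# List-based scenario: first list item, namespace given as a regex pattern.
list_scenario = [{"config": {"namespace_pattern": "placeholder"}}]
set_by_key_path(
    list_scenario,
    [0, "config", "namespace_pattern"],
    namespace_value("krkn-test-abc123", is_regex=True),
)
```

The same path-walking loop serves both shapes because Python indexes dicts and lists with the same subscript syntax, which is presumably why a single `NAMESPACE_KEY_PATH` attribute can describe either layout.
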
| 33 | +If your Kraken scenario type string is not `<scenario>_scenarios`, pass it explicitly: |
| 34 | + |
| 35 | +```bash |
| 36 | +python CI/tests_v2/scaffold.py --scenario node_disruption --scenario-type node_scenarios |
| 37 | +``` |
| 38 | + |
| 39 | +## Option 2: Manual setup |
| 40 | + |
| 41 | +1. **Create the scenario folder** |
| 42 | + `CI/tests_v2/scenarios/<scenario_name>/`. |
| 43 | + |
| 44 | +2. **Add resource.yaml** |
| 45 | + Kubernetes manifest(s) for the workload (Deployment or Pod). Use a distinct label (e.g. `app: <scenario>-target`). You can omit `metadata.namespace` or leave it set; either way the framework patches it at deploy time. |
| 46 | + |
| 47 | +3. **Add scenario_base.yaml** |
| 48 | + The canonical Krkn scenario structure. Tests will load this, patch namespace (and any overrides), write to `tmp_path`, and pass to `build_config`. See existing scenarios for the format your scenario type expects. |
| 49 | + |
| 50 | +4. **Add test_<scenario>.py** |
| 51 | + - Import `BaseScenarioTest` from `lib.base` and helpers from `lib.utils` (e.g. `assert_kraken_success`, `get_pods_list`, `scenario_dir` if needed). |
| 52 | + - Define a class extending `BaseScenarioTest` with: |
| 53 | + - `WORKLOAD_MANIFEST = "CI/tests_v2/scenarios/<scenario_name>/resource.yaml"` |
| 54 | + - `WORKLOAD_IS_PATH = True` |
| 55 | + - `LABEL_SELECTOR = "app=<label>"` |
| 56 | + - `SCENARIO_NAME = "<scenario_name>"` |
| 57 | + - `SCENARIO_TYPE = "<scenario_type>"` (e.g. `application_outages_scenarios`) |
| 58 | + - `NAMESPACE_KEY_PATH`: path to the namespace field (e.g. `["application_outage", "namespace"]` for dict-based, or `[0, "config", "namespace_pattern"]` for list-based) |
| 59 | + - `NAMESPACE_IS_REGEX = False` (or `True` for regex patterns like pod_disruption) |
| 60 | + - `OVERRIDES_KEY_PATH = ["<top-level key>"]` if the scenario supports overrides (e.g. duration, block). |
| 61 | + - Add `@pytest.mark.functional` and `@pytest.mark.<scenario>` on the class. |
| 62 | + - In at least one test, call `self.run_scenario(self.tmp_path, self.ns)` and assert with `assert_kraken_success`, `assert_pod_count_unchanged`, and `assert_all_pods_running_and_ready`. Use `self.k8s_core`, `self.tmp_path`, etc. (injected by the base class). |
| 63 | + |
| 64 | +5. **Register the marker** |
| 65 | + In `CI/tests_v2/pytest.ini`, under `markers`: |
| 66 | + ``` |
| 67 | + <scenario>: marks a test as a <scenario> scenario test |
| 68 | + ``` |
| 69 | + |
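To confirm that a marker is registered after either setup path, a quick check along these lines may help. This is a minimal sketch using only the standard library; `registered_markers` and the sample ini text (including the `functional` description) are hypothetical and not part of the framework.

```python
import configparser
from typing import Set

# Hypothetical stand-in for the contents of CI/tests_v2/pytest.ini.
SAMPLE_INI = """\
[pytest]
markers =
    functional: marks a test as a functional test
    service_hijacking: marks a test as a service_hijacking scenario test
"""


def registered_markers(ini_text: str) -> Set[str]:
    """Return the marker names listed under `markers` in a pytest.ini string."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    raw = parser.get("pytest", "markers", fallback="")
    names = set()
    for line in raw.splitlines():
        line = line.strip()
        if line:
            # Entries look like "name: description" or "name(args): description".
            names.add(line.split(":", 1)[0].split("(", 1)[0].strip())
    return names


print(sorted(registered_markers(SAMPLE_INI)))
```

To check the real file, read it with `pathlib.Path("CI/tests_v2/pytest.ini").read_text()` and pass the result to the same function.
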
| 70 | +## Conventions |
| 71 | + |
| 72 | +- **Folder-per-scenario**: One directory per scenario under `scenarios/`. All assets (test, resource.yaml, scenario_base.yaml, and any extra YAMLs) live there for easy tracking and onboarding. |
| 73 | +- **Ephemeral namespace**: Every test gets a unique `krkn-test-<uuid>` namespace. The base class deploys the workload into it before the test; no manual deploy is required. |
| 74 | +- **Negative tests**: For tests that don’t need a workload (e.g. invalid scenario, bad namespace), use `@pytest.mark.no_workload`. The test will still get a namespace but no workload will be deployed. |
| 75 | +- **Scenario type**: `SCENARIO_TYPE` must match the key in Kraken’s config (e.g. `application_outages_scenarios`, `pod_disruption_scenarios`). See `CI/tests_v2/config/common_test_config.yaml` and the scenario plugin’s `get_scenario_types()`. |
| 76 | +- **Assertions**: Use `assert_kraken_success(result, context=f"namespace={ns}", tmp_path=self.tmp_path)` so failures include stdout/stderr and optional log files. |
| 77 | +- **Timeouts**: Use constants from `lib.base` (`READINESS_TIMEOUT`, `POLICY_WAIT_TIMEOUT`, etc.) instead of magic numbers. |
| 78 | + |
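As an illustration of the ephemeral-namespace convention above, the following sketch generates a `krkn-test-<uuid>` style name and checks it against the Kubernetes DNS-label rules. The function name `make_test_namespace` and the choice of an 8-character hex suffix are assumptions for this example; the framework's actual suffix format may differ.

```python
import re
import uuid

# Kubernetes namespace names must be valid DNS labels: lowercase alphanumerics
# and '-', starting and ending with an alphanumeric, at most 63 characters.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def make_test_namespace(prefix: str = "krkn-test") -> str:
    """Build a unique namespace name; the 8-char suffix length is an assumption."""
    name = f"{prefix}-{uuid.uuid4().hex[:8]}"
    if len(name) > 63 or not DNS_LABEL.fullmatch(name):
        raise ValueError(f"invalid namespace name: {name!r}")
    return name


print(make_test_namespace())
```
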
| 79 | +## Exit Code Handling |
| 80 | + |
| 81 | +Kraken uses the following exit codes: **0** = success; **1** = scenario failure (e.g. post scenarios still failing); **2** = critical alerts fired; **3+** = health check / KubeVirt check failures; **-1** = infrastructure error (bad config, no kubeconfig). |
| 82 | + |
| 83 | +- **Happy-path tests**: Use `assert_kraken_success(result, ...)`. By default only exit code 0 is accepted. |
| 84 | +- **Alert-aware tests**: If you enable `check_critical_alerts` and expect alerts, use `assert_kraken_success(result, allowed_codes=(0, 2), ...)` so exit code 2 is treated as acceptable. |
| 85 | +- **Expected-failure tests**: Use `assert_kraken_failure(result, context=..., tmp_path=self.tmp_path)` for negative tests (invalid scenario, bad namespace, etc.). This gives the same diagnostic quality (log dump, tmp_path hint) as success assertions. Prefer this over a bare `assert result.returncode != 0`. |
| 86 | + |
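The `allowed_codes` idea can be sketched in isolation as follows. This is not the framework's `assert_kraken_success`; `check_exit_code` and `describe_exit_code` are hypothetical helpers that only mirror the exit-code table described in this section.

```python
from typing import Iterable

# Meanings taken from the exit-code list above; codes >= 3 are handled separately.
EXIT_CODE_MEANINGS = {
    0: "success",
    1: "scenario failure",
    2: "critical alerts fired",
    -1: "infrastructure error",
}


def describe_exit_code(code: int) -> str:
    """Map an exit code to the meaning documented in this guide."""
    if code >= 3:
        return "health check / KubeVirt check failure"
    return EXIT_CODE_MEANINGS.get(code, "unknown exit code")


def check_exit_code(code: int, allowed_codes: Iterable[int] = (0,)) -> None:
    """Raise AssertionError unless `code` is one of `allowed_codes`."""
    allowed = tuple(allowed_codes)
    if code not in allowed:
        raise AssertionError(
            f"exit code {code} ({describe_exit_code(code)}) "
            f"not in allowed codes {allowed}"
        )


check_exit_code(0)                        # happy path: only 0 accepted
check_exit_code(2, allowed_codes=(0, 2))  # alert-aware: 2 is tolerated
```

Keeping the default at `(0,)` means a test has to opt in explicitly before a non-zero code is treated as acceptable.
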
| 87 | +## Running your new tests |
| 88 | + |
| 89 | +```bash |
| 90 | +pytest CI/tests_v2/ -v -m <scenario> |
| 91 | +``` |
| 92 | + |
| 93 | +For debugging with logs and keeping failed namespaces: |
| 94 | + |
| 95 | +```bash |
| 96 | +pytest CI/tests_v2/ -v -m <scenario> --log-cli-level=DEBUG --keep-ns-on-fail |
| 97 | +``` |
| 98 | + |
| 99 | +--- |
| 100 | + |
| 101 | +## Naming Conventions |
| 102 | + |
| 103 | +Follow these conventions so the framework stays consistent as new scenarios are added. |
| 104 | + |
| 105 | +### Quick Reference |
| 106 | + |
| 107 | +| Element | Pattern | Example | |
| 108 | +|---|---|---| |
| 109 | +| Scenario folder | `scenarios/<snake_case>/` | `scenarios/node_disruption/` | |
| 110 | +| Test file | `test_<scenario>.py` | `test_node_disruption.py` | |
| 111 | +| Test class | `Test<CamelCase>(BaseScenarioTest)` | `TestNodeDisruption` | |
| 112 | +| Pytest marker | `@pytest.mark.<scenario>` (matches folder) | `@pytest.mark.node_disruption` | |
| 113 | +| Scenario YAML | `scenario_base.yaml` | — | |
| 114 | +| Workload YAML | `resource.yaml` | — | |
| 115 | +| Extra YAMLs | `<descriptive_name>.yaml` | `nginx_http.yaml` | |
| 116 | +| Lib modules | `lib/<concern>.py` | `lib/deploy.py` | |
| 117 | +| Public fixtures | `<verb>_<noun>` or `<noun>` | `run_kraken`, `test_namespace` | |
| 118 | +| Private/autouse fixtures | `_<descriptive>` | `_cleanup_stale_namespaces` | |
| 119 | +| Assertion helpers | `assert_<condition>` | `assert_pod_count_unchanged` | |
| 120 | +| Query helpers | `get_<resource>_list` or `find_<resource>_by_<criteria>` | `get_pods_list`, `find_network_policy_by_prefix` | |
| 121 | +| Env var overrides | `KRKN_TEST_<NAME>` | `KRKN_TEST_READINESS_TIMEOUT` | |
| 122 | + |
| 123 | +### Folders |
| 124 | + |
| 125 | +- One folder per scenario under `scenarios/`. The folder name is `snake_case` and must match the `SCENARIO_NAME` class attribute in the test. |
| 126 | +- Shared framework code lives in `lib/`. Each module covers a single concern (`k8s`, `namespace`, `deploy`, `kraken`, `utils`, `base`, `preflight`). |
| 127 | +- Do **not** add scenario-specific code to `lib/`; keep it in the scenario folder as module-level helpers. |
| 128 | + |
| 129 | +### Files |
| 130 | + |
| 131 | +- Test files: `test_<scenario>.py`. This is required for pytest discovery (`test_*.py`). |
| 132 | +- Workload manifests: always `resource.yaml`. If a scenario needs additional K8s resources (e.g. a Service for traffic testing), use a descriptive name like `nginx_http.yaml`. |
| 133 | +- Scenario config: always `scenario_base.yaml`. This is the template that `load_and_patch_scenario` loads and patches. |
| 134 | + |
| 135 | +### Classes |
| 136 | + |
| 137 | +- One test class per file: `Test<CamelCase>` extending `BaseScenarioTest`. |
| 138 | +- The class name is `Test` followed by the PascalCase form of the folder name (e.g. `pod_disruption` -> `TestPodDisruption`). |
| 139 | + |
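The folder, file, class, and marker names all derive from one snake_case scenario name. A minimal sketch of that mapping, with hypothetical helper names that are not part of `scaffold.py`:

```python
from typing import Dict


def to_pascal_case(snake: str) -> str:
    """Convert snake_case to PascalCase, e.g. pod_disruption -> PodDisruption."""
    return "".join(part.capitalize() for part in snake.split("_") if part)


def names_for_scenario(folder: str) -> Dict[str, str]:
    """Derive the conventional names for a scenario from its folder name."""
    return {
        "folder": f"scenarios/{folder}/",
        "test_file": f"test_{folder}.py",
        "test_class": f"Test{to_pascal_case(folder)}",
        "marker": f"pytest.mark.{folder}",
    }


for key, value in names_for_scenario("pod_disruption").items():
    print(f"{key}: {value}")
```
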
| 140 | +### Test Methods |
| 141 | + |
| 142 | +- Prefix: `test_` (pytest requirement). |
| 143 | +- Use descriptive names that convey **what is being verified**, not implementation details. |
| 144 | +- Good: `test_pod_crash_and_recovery`, `test_traffic_blocked_during_outage`, `test_invalid_scenario_fails`. |
| 145 | +- Avoid: `test_run_1`, `test_scenario`, `test_it_works`. |
| 146 | + |
| 147 | +### Fixtures |
| 148 | + |
| 149 | +- **Public fixtures** (intended for use in tests): use `<verb>_<noun>` or plain `<noun>`. Examples: `run_kraken`, `deploy_workload`, `test_namespace`, `kubectl`. |
| 150 | +- **Private/autouse fixtures** (framework internals): prefix with `_`. Examples: `_kube_config_loaded`, `_preflight_checks`, `_inject_common_fixtures`. |
| 151 | +- K8s client fixtures use the `k8s_` prefix: `k8s_core`, `k8s_apps`, `k8s_networking`, `k8s_client`. |
| 152 | + |
| 153 | +### Helpers and Utilities |
| 154 | + |
| 155 | +- **Assertions**: `assert_<what_is_expected>`. Always raise `AssertionError` with a message that includes the namespace. |
| 156 | +- **K8s queries**: `get_<resource>_list` for direct API calls, `find_<resource>_by_<criteria>` for filtered lookups. |
| 157 | +- **Private helpers**: prefix with `_` for module-internal functions (e.g. `_pods`, `_policies`, `_get_nested`). |
| 158 | + |
| 159 | +### Constants and Environment Variables |
| 160 | + |
| 161 | +- Timeout constants: `UPPER_CASE` in `lib/base.py`. Each is overridable via an env var prefixed `KRKN_TEST_`. |
| 162 | +- Feature flags: `KRKN_TEST_DRY_RUN`, `KRKN_TEST_COVERAGE`. Always use the `KRKN_TEST_` prefix so all tunables are discoverable with `grep KRKN_TEST_`. |
| 163 | + |
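One way an env-overridable timeout constant might be defined is sketched below. The helper `env_int` and the default of `90` are assumptions for illustration; the actual definitions and defaults in `lib/base.py` may differ.

```python
import os


def env_int(name: str, default: int) -> int:
    """Read an integer from the environment, falling back to `default` if unset or empty."""
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError as exc:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from exc


# Hypothetical default; overridable with KRKN_TEST_READINESS_TIMEOUT=<int>.
READINESS_TIMEOUT = env_int("KRKN_TEST_READINESS_TIMEOUT", 90)
```

Failing loudly on a non-integer value, rather than silently using the default, makes a mistyped override visible at import time.
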
| 164 | +### Markers |
| 165 | + |
| 166 | +- Every test class gets `@pytest.mark.functional` (framework-wide) and `@pytest.mark.<scenario>` (scenario-specific). |
| 167 | +- The scenario marker name matches the folder name exactly. |
| 168 | +- Behavioral modifiers use plain descriptive names: `no_workload`, `order`. |
| 169 | +- Register all custom markers in `pytest.ini` to avoid warnings. |
| 170 | + |
| 171 | +## Adding Dependencies |
| 172 | + |
| 173 | +- **Runtime (Kraken needs it)**: Add to the **root** `requirements.txt`. Pin or bound the version (e.g. `package==1.2.3` or `package>=1.2,<2`). |
| 174 | +- **Test-only (only CI/tests_v2 needs it)**: Add to **`CI/tests_v2/requirements.txt`**. Pin a version there as well. |
| 175 | +- After changing either file, run `make setup` (or `make -f CI/tests_v2/Makefile setup`) from the repo root to verify both files install cleanly together. |