|
| 1 | +# Feature Toggles |
| 2 | + |
| 3 | +Feature toggles allow us to deploy code to production in a disabled state, enabling it later without a new deployment. |
| 4 | + |
| 5 | +## How It Works |
| 6 | + |
| 7 | +Our feature toggle system is built on **AWS Systems Manager (SSM) Parameter Store**. |
| 8 | + |
| 9 | +1. **Single Source of Truth**: AWS SSM is the single source of truth for the current state (`true` or `false`) of all feature toggles. |
| 10 | +2. **Infrastructure as Code**: Toggles are defined in Terraform, ensuring configuration is version-controlled and repeatable across environments. |
| 11 | +3. **CI/CD Validation**: The `feature_toggle.json` file in the repository lists all toggles the application requires. The CI/CD pipeline checks that every toggle in this file exists in AWS SSM before a deployment can proceed. |
| 12 | +4. **Runtime Caching**: The application code checks a toggle's state at runtime through a cached `is_feature_enabled()` function, so it does not need to call AWS on every check. |
| 13 | + |
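The runtime caching described in point 4 can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the factory `make_is_feature_enabled`, the `fetch_value` callable, the 300-second TTL, and the fail-closed behaviour are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Tuple

CACHE_TTL_SECONDS = 300  # assumed refresh interval, not taken from the source


def make_is_feature_enabled(
    fetch_value: Callable[[str], str],
    ttl_seconds: float = CACHE_TTL_SECONDS,
    clock: Callable[[], float] = time.monotonic,
) -> Callable[[str], bool]:
    """Build a cached toggle checker around a parameter-fetching callable."""
    cache: Dict[str, Tuple[bool, float]] = {}

    def is_feature_enabled(name: str) -> bool:
        now = clock()
        cached = cache.get(name)
        if cached is not None and now - cached[1] < ttl_seconds:
            return cached[0]  # still fresh: no remote call
        try:
            enabled = fetch_value(name).strip().lower() == "true"
        except Exception:
            # Assumed policy: an unreadable toggle is treated as disabled.
            enabled = False
        cache[name] = (enabled, now)
        return enabled

    return is_feature_enabled


# Stand-in for an SSM lookup. A real fetcher might wrap
# boto3.client("ssm").get_parameter(Name=...)["Parameter"]["Value"].
calls = []


def fake_fetch(name: str) -> str:
    calls.append(name)
    return "true"


is_feature_enabled = make_is_feature_enabled(fake_fetch)
first = is_feature_enabled("enable_dynamic_status_text")
second = is_feature_enabled("enable_dynamic_status_text")  # served from the cache
```

Injecting the fetcher keeps the cache logic testable without AWS credentials. Note that this sketch also caches a failed lookup as `False` until the TTL expires.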
| 14 | +## Developer Workflow |
| 15 | + |
| 16 | +### Step 1: Define the Toggle (The Single Source of Truth) |
| 17 | + |
| 18 | +Defining a new toggle requires a single change: add a new entry to the `feature_toggle.json` file. This file defines the toggle's metadata and its initial state in each environment. |
| 19 | + |
| 20 | +`default_state`: The safe, production-like state (usually `false`). |
| 21 | + |
| 22 | +`env_overrides`: An optional map to set a different state for specific environments (e.g., enabling the feature in `dev` and `test` for QA). If an environment is not listed, it uses the `default_state`. |
| 23 | + |
| 24 | +**File: [feature_toggle.json](../../../scripts/feature_toggle/feature_toggle.json)** |
| 25 | + |
| 26 | +```json |
| 27 | +{ |
| 28 | +  "enable_dynamic_status_text": { |
| 29 | +    "purpose": "Enables dynamic status text based on conditions.", |
| 30 | +    "ticket": "ELI-427", |
| 31 | +    "created": "2025-09-02", |
| 32 | +    "default_state": false, |
| 33 | +    "env_overrides": { |
| 34 | +      "dev": true, |
| 35 | +      "test": true |
| 36 | +    } |
| 37 | +  } |
| 38 | +} |
| 39 | +``` |
| 40 | + |
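The resolution rule for `default_state` and `env_overrides` can be expressed in a few lines of Python. This is only a sketch of the rule as described above: `resolve_toggle_state` is a hypothetical helper, and `prod` is an assumed environment name used to show the fallback.

```python
import json

TOGGLES_JSON = """
{
  "enable_dynamic_status_text": {
    "default_state": false,
    "env_overrides": {"dev": true, "test": true}
  }
}
"""


def resolve_toggle_state(entry: dict, environment: str) -> bool:
    """Return the override for this environment, else the default state."""
    overrides = entry.get("env_overrides", {})  # the map is optional
    return overrides.get(environment, entry["default_state"])


toggles = json.loads(TOGGLES_JSON)
entry = toggles["enable_dynamic_status_text"]

dev_state = resolve_toggle_state(entry, "dev")    # listed in env_overrides
prod_state = resolve_toggle_state(entry, "prod")  # not listed, falls back
```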
| 41 | +Our Terraform setup automatically reads this file and creates the corresponding SSM parameters. You do not need to write new Terraform code for each toggle. |
| 42 | + |
| 43 | +**File: [ssm.tf](../../../infrastructure/stacks/api-layer/ssm.tf) (For Reference—No edits needed)** |
| 44 | + |
| 45 | +```terraform |
| 46 | +resource "aws_ssm_parameter" "feature_toggles" { |
| 47 | +  for_each = jsondecode(file("${path.root}/scripts/feature_toggle/feature_toggle.json")) |
| 48 | + |
| 49 | +  name = "/${var.environment}/feature_toggles/${each.key}" |
| 50 | +  type = "String" |
| 51 | + |
| 52 | +  value = lookup(each.value.env_overrides, var.environment, each.value.default_state) |
| 53 | + |
| 54 | +  tags = { |
| 55 | +    Environment = var.environment |
| 56 | +    ManagedBy   = "terraform" |
| 57 | +    Purpose     = each.value.purpose |
| 58 | +    Ticket      = each.value.ticket |
| 59 | +    Created     = each.value.created |
| 60 | +  } |
| 61 | + |
| 62 | +  lifecycle { |
| 63 | +    ignore_changes = [value] |
| 64 | +  } |
| 65 | +} |
| 66 | +``` |
| 67 | + |
| 68 | +### Step 2: Implement and Test the Logic |
| 69 | + |
| 70 | +Import and use the `is_feature_enabled()` function to create a conditional code path. |
| 71 | + |
| 72 | +**File (Example): `eligibility_signposting_api/services/calculators/eligibility_calculator.py`** |
| 73 | + |
| 74 | +```python |
| 75 | +from eligibility_signposting_api.feature_toggle.feature_toggle import is_feature_enabled |
| 76 | + |
| 77 | +if is_feature_enabled("enable_dynamic_status_text"): |
| 78 | +    # New feature logic |
| 79 | +    status_text = self.get_status_text(active_iteration.status_text, ConditionName(cc.target), status) |
| 80 | +else: |
| 81 | +    # Existing (old) logic |
| 82 | +    status_text = status.get_default_status_text(ConditionName(cc.target)) |
| 83 | +``` |
| 84 | + |
| 85 | +You must write unit tests that cover both the "on" and "off" states of the toggle. Use `pytest.mark.parametrize` to run the same test with both states and `unittest.mock.patch` to control the toggle's return value. |
| 86 | + |
| 87 | +**Important**: The patch path must point to **where the function is used**, not where it is defined. |
| 88 | + |
| 89 | +**File (Example): `tests/unit/services/calculators/test_eligibility_calculator.py`** |
| 90 | + |
| 91 | +```python |
| 92 | +import pytest |
| 93 | +from unittest.mock import patch |
| | +from faker import Faker  # type of the `faker` fixture used below |
| 94 | + |
| 95 | +@pytest.mark.parametrize( |
| 96 | +    "enable_dynamic_status_text, expected_rsv_text", |
| 97 | +    [ |
| 98 | +        (True, "You are not eligible to take RSV vaccine"),  # Case 1: Toggle is ON |
| 99 | +        (False, "We do not believe you can have it"),  # Case 2: Toggle is OFF |
| 100 | +    ], |
| 101 | +) |
| 102 | +@patch("eligibility_signposting_api.services.calculators.eligibility_calculator.is_feature_enabled") |
| 103 | +def test_status_text_is_conditional_on_toggle( |
| 104 | +    mock_is_feature_enabled, |
| 105 | +    enable_dynamic_status_text, |
| 106 | +    expected_rsv_text, |
| 107 | +    faker: Faker |
| 108 | +): |
| 109 | + |
| 110 | +    # This mock controls the toggle for the test run |
| 111 | +    mock_is_feature_enabled.return_value = enable_dynamic_status_text |
| 112 | + |
| 113 | +    # Given, When, Then... |
| 114 | +    assert actual_text_from_audit == expected_rsv_text |
| 115 | +``` |
| 116 | + |
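The patch-path rule from Step 2 can be demonstrated in isolation. The sketch below builds two throwaway in-memory modules, `toggle_def` (where the function is defined) and `consumer` (where it is imported and used). Both names are hypothetical and exist only for this illustration.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical module where the function is DEFINED.
toggle_def = types.ModuleType("toggle_def")
exec("def is_feature_enabled(name):\n    return False\n", toggle_def.__dict__)
sys.modules["toggle_def"] = toggle_def

# Hypothetical module where the function is USED. The `from ... import`
# copies a reference to the function into the consumer's own namespace.
consumer = types.ModuleType("consumer")
exec(
    "from toggle_def import is_feature_enabled\n"
    "def status_text():\n"
    "    return 'new' if is_feature_enabled('x') else 'old'\n",
    consumer.__dict__,
)
sys.modules["consumer"] = consumer

# Patching the definition site leaves the consumer's reference untouched.
with patch("toggle_def.is_feature_enabled", return_value=True):
    patched_at_definition = consumer.status_text()

# Patching the usage site replaces the name the consumer actually looks up.
with patch("consumer.is_feature_enabled", return_value=True):
    patched_at_use = consumer.status_text()
```

Only the second patch changes the outcome, because `status_text()` resolves `is_feature_enabled` in the `consumer` namespace.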
| 117 | +### Step 3: Commit and Deploy (The Automation) |
| 118 | + |
| 119 | +1. **Terraform Apply**: During the infrastructure step of deployment, Terraform applies the `ssm.tf` configuration, which reads the updated `feature_toggle.json` file. |
| 120 | +2. **Creation**: Because of the `for_each` expression, Terraform detects the new feature toggle entry and creates an `aws_ssm_parameter` resource for it. The parameter is created in AWS SSM with the correct name (e.g., `/dev/feature_toggles/enable_dynamic_status_text`) and the appropriate initial value for the environment (`true` for `dev` and `test`, `false` for others). |
| 121 | +3. **Validation**: Immediately afterwards, the `validate_toggles.py` script runs. It reads the same JSON file and queries AWS SSM to confirm that every required toggle exists, including the one Terraform has just created. |
| 122 | + |
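The validation step can be sketched as below. This is not the contents of `validate_toggles.py`, which is not shown here. The helper `find_missing_toggles`, the injected `parameter_exists` callable, and the toggle `some_other_toggle` are assumptions made for illustration. Only the parameter naming scheme is taken from `ssm.tf`.

```python
import json
from typing import Callable, Iterable, List


def find_missing_toggles(
    toggle_names: Iterable[str],
    environment: str,
    parameter_exists: Callable[[str], bool],
) -> List[str]:
    """Return the SSM paths of required toggles that do not exist."""
    missing = []
    for name in toggle_names:
        # Same naming scheme as the Terraform resource in ssm.tf.
        path = f"/{environment}/feature_toggles/{name}"
        if not parameter_exists(path):
            missing.append(path)
    return missing


# Stand-in for AWS SSM: the set of parameters that already exist.
existing = {"/dev/feature_toggles/enable_dynamic_status_text"}

# Stand-in for feature_toggle.json; "some_other_toggle" is hypothetical.
required = json.loads('{"enable_dynamic_status_text": {}, "some_other_toggle": {}}')

missing = find_missing_toggles(required, "dev", lambda path: path in existing)
```

A pipeline step built on this idea would fail the deployment whenever `missing` is non-empty.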
| 123 | +### Step 4: Cleanup Process |
| 124 | + |
| 125 | +Feature toggles are **technical debt**. Once a feature is fully released and stable, the toggle and all associated conditional logic must be removed. |
| 126 | + |
| 127 | +Follow the **"Two-Ticket" Rule**: |
| 128 | + |
| 129 | +1. When you create a ticket to add a feature toggle, immediately create a second ticket to remove it. |
| 130 | +2. Link the two tickets. |
| 131 | +3. Once the feature is permanently enabled, schedule the cleanup ticket in an upcoming sprint to remove the toggle from: |
| 132 | + - The application code |
| 133 | + - All related test code |
| 134 | + - The `feature_toggle.json` file |