Commit 3047b8a
test: enhance smoke test suite reliability and add comprehensive checks

Major improvements to the ClickHouse Helm chart test suite addressing
timing issues, adding new test scenarios, and improving robustness.

## Timing and Reliability Fixes
- Increase timeout for system.clusters verification (60s → 180s)
  Allows proper cluster topology propagation after deployment
- Add retry logic for ON CLUSTER queries during upgrades
  Wait for cluster config propagation before distributed operations
- Add service endpoints verification with retry (60s timeout)
  Ensure all endpoints register after pod creation
- Fix PVC verification to only check mounted volumes
  Prevents false failures from orphaned PVCs (reclaimPolicy: Retain)
  Affects: verify_clickhouse_pvc_size, verify_log_persistence, verify_keeper_storage
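
The timeout and retry changes above all follow one pattern: poll a condition until it holds or a deadline passes. A minimal sketch of that pattern, assuming a hypothetical `wait_until` helper (its name, signature, and default polling interval are illustrative and not taken from the test suite):

```python
import time


def wait_until(check, timeout=180.0, interval=5.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns a truthy value or `timeout` seconds elapse.

    Hypothetical helper for illustration; the suite's own retry code may differ.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    last_error = None
    while True:
        try:
            result = check()
            if result:
                return result
        except Exception as exc:  # treat any error as "not ready yet"
            last_error = exc
        if clock() >= deadline:
            raise TimeoutError(
                f"condition not met within {timeout}s"
            ) from last_error
        sleep(interval)
```

A caller would pass a function that runs one check (for example, one that queries system.clusters and returns the rows, or an empty list while the topology is still propagating) and let the helper decide when to give up.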

## New Test Features
- Add data survival verification for upgrade scenarios
  Creates test table before upgrade, verifies data after
  Smart detection: only runs for in-place upgrades (same nameOverride)
  Skips cluster replacement scenarios with informative note
- Add Keeper high availability chaos test
  Deletes one Keeper pod from 3-node quorum
  Verifies ClickHouse remains writable with 2/3 Keepers
  Runs automatically for replicated deployments
- Add metrics endpoint verification
  Tests operator metrics endpoint accessibility
  Validates Prometheus format output
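
For the metrics check above, "validates Prometheus format output" could mean something like the following. This is a rough sketch under stated assumptions: the function name `looks_like_prometheus` is hypothetical, and the regular expression is a simplification of the Prometheus text exposition format (it does not handle `}` inside quoted label values).

```python
import re

# Simplified sample line: metric_name{labels} value [timestamp]
_SAMPLE = re.compile(
    r'^[a-zA-Z_:][a-zA-Z0-9_:]*(\{[^}]*\})?\s+(\S+)(\s+-?\d+)?$'
)


def looks_like_prometheus(text):
    """Return True if every non-comment line parses as a metric sample.

    Hypothetical check for illustration; not the suite's actual validation.
    """
    samples = 0
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and # HELP / # TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            return False
        try:
            float(match.group(2))  # accepts numbers, NaN, +Inf, -Inf
        except ValueError:
            return False
        samples += 1
    return samples > 0  # require at least one sample
```

Requiring at least one sample means an empty response or an HTML error page is rejected rather than silently accepted.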

## Robustness Improvements
- Implement get_ready_clickhouse_pod() helper
  Selects first Running pod instead of blindly using pods[0]
  Eliminates race conditions during pod initialization
  Applied to: verify_clickhouse_version, verify_user_connection, verify_extra_config_values
- Refactor get_cluster_topology() to use JSON parsing
  Changed from FORMAT TabSeparated to FORMAT JSON
  Adds proper error handling with json.JSONDecodeError
  Type-safe field access with .get() methods
- Refactor verify_metrics_endpoint() to use operator pod
  Access via localhost from operator pod (more reliable)
  Implement curl/wget fallback for tool availability
  Removes dependency on ClickHouse pod having network tools
- Add get_operator_pod() helper function
  Finds operator pod (excludes CHI pods)
  Used for metrics endpoint testing
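
The JSON-based topology parsing described above might look roughly like this. It is a sketch, not the code from get_cluster_topology(): the function name `parse_cluster_topology` and the shape of the returned dicts are assumptions. It relies only on ClickHouse's FORMAT JSON wrapping rows in a top-level `data` array and on the `cluster`, `shard_num`, `replica_num`, and `host_name` columns of system.clusters.

```python
import json


def parse_cluster_topology(raw):
    """Parse `SELECT ... FROM system.clusters FORMAT JSON` output.

    Hypothetical illustration; returns an empty list when the output is not
    a JSON object, e.g. when the query printed an error instead.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict):
        return []
    topology = []
    for row in payload.get("data", []):
        topology.append({
            "cluster": row.get("cluster", ""),
            "shard": int(row.get("shard_num", 0)),
            "replica": int(row.get("replica_num", 0)),
            "host": row.get("host_name", ""),
        })
    return topology
```

Compared with splitting TabSeparated output on tabs, this does not depend on column order and fails closed when the command output is an error message rather than query results.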

## Warning Cleanup
- Fix 'level' setting warning in extraConfig parsing
  Add 'level' to skip list (nested logger element)
- Improve secrets verification messaging
  Change warning to informational for inline credentials
  Acknowledges inline config is valid for smoke tests
- Remove non-existent upgrade scenario from test list
  Cleanup UPGRADE_SCENARIOS list

## New Helper Functions
tests/steps/clickhouse.py:
- get_ready_clickhouse_pod() - Select first Running pod
- get_operator_pod() - Find operator pod
- create_test_data() - Create test data for upgrade verification
- verify_data_survival() - Verify data survived upgrade
- test_keeper_high_availability() - Chaos test for Keeper quorum
- verify_metrics_endpoint() - Test metrics endpoint (refactored)

tests/steps/kubernetes.py:
- delete_pod() - Delete a Kubernetes pod
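
Of the helpers listed above, the "select first Running pod" idea is the simplest to illustrate. A minimal sketch, assuming pods are plain dicts shaped like the items in `kubectl get pods -o json` output; the function name `first_running_pod` is hypothetical and the real get_ready_clickhouse_pod() may take different arguments:

```python
def first_running_pod(pods):
    """Return the name of the first pod whose phase is Running, or None.

    Hypothetical illustration: `pods` is a list of dicts with the
    metadata.name and status.phase fields of a Kubernetes Pod object.
    """
    for pod in pods:
        if pod.get("status", {}).get("phase") == "Running":
            return pod.get("metadata", {}).get("name")
    return None
```

Returning None when nothing is Running lets the caller retry or fail with a clear message instead of executing a query against a pod that is still starting.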

## Files Modified
- tests/scenarios/smoke.py
- tests/steps/clickhouse.py
- tests/steps/deployment.py
- tests/steps/kubernetes.py

Fixes timing issues, adds production-ready test scenarios, and improves
overall test suite reliability and maintainability.

1 parent 92b29dc, commit 3047b8a
3 files changed: +461 -85 lines changed