komodorio/failure-scenarios

Bank of Anthos Failure Scenarios Failure Injection System

A comprehensive Kubernetes failure injection and testing framework for Bank of Anthos, enabling automated chaos engineering and resilience testing across isolated namespaces.

🚀 Quick Start

Get up and running in under 5 minutes:

# Clone the repository
git clone git@github.com:komodorio/failure-scenarios.git
cd failure-scenarios

# Launch the interactive menu
./start.sh

Or run individual commands:

# Deploy all scenarios to isolated namespaces (~3 min)
./batch/setup-all-scenarios.sh

# Check deployment status
./batch/check-all-scenarios.sh

# Inject all failures in parallel (~1-2 min)
./batch/inject-all-scenarios.sh

# Cleanup all namespaces
./batch/cleanup-all-scenarios.sh

✨ Features

  • 🎯 11 Failure Scenarios - Database locks, OOM kills, policy violations, CronJob failures, and more
  • ⚡ Parallel Execution - Deploy 11 namespaces in parallel (~3 minutes)
  • 🔒 Isolated Testing - Each scenario runs in its own namespace
  • 🔄 Dynamic Discovery - Add new scenarios with zero code changes
  • 📊 Multiple Interfaces - Interactive menus, individual commands, or batch operations
  • 🛠️ Self-Contained Scenarios - Each scenario can inject and revert its own failures independently
  • ♻️ Restore & Cleanup - Built-in restoration scripts to return to normal state

📋 Prerequisites

Before running this project, ensure you have:

| Tool | Minimum Version | Purpose |
| --- | --- | --- |
| Kubernetes cluster | 1.19+ | Target environment for failure injection |
| kubectl | 1.30+ | Kubernetes CLI tool |
| helm | 3.0+ | Package manager for Kubernetes |
| jq | 1.6+ | JSON processor for parsing |
| bash | 4.0+ | Shell interpreter |

Cluster Resources

  • Minimum: 5 CPU cores, 5GB RAM, 50GB storage

Installation

# macOS
brew install kubectl helm jq

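# Ubuntu/Debian (kubectl and helm are not in the default apt repositories;
# add their upstream apt repositories first)
sudo apt-get install kubectl helm jq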

# Check versions
kubectl version --client
helm version --short
jq --version
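
Before checking versions, it can help to confirm that every tool is on the PATH at all. The following is a minimal sketch, not part of the repository; the function name `missing_tools` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the name of every required tool that is not on PATH, one per line.
# Returns non-zero if at least one tool is missing.
# NOTE: missing_tools is a hypothetical helper, not a script from this repo.
missing_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   missing_tools kubectl helm jq bash || echo "Install the tools listed above first." >&2
```

This only checks presence; the minimum versions in the table still need to be compared by hand against the output of the version commands above.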

📖 Available Failure Scenarios

For a full summary of all available failure scenarios, see the Scenarios Overview Table.

🎮 Usage

Interactive Menu (Recommended)

./start.sh

Provides a menu-driven interface with options:

  1. Setup All Scenarios
  2. Check All Scenarios
  3. Inject All Scenarios
  4. Inject by Namespace (interactive)
  5. Restore All Scenarios
  6. Restore by Namespace (interactive)
  7. Cleanup All Scenarios

Individual Scenario Testing

Each scenario is self-contained with inject and revert capabilities:

# Inject failure (default action)
./scenarios/bad-deployment-scenario.sh
./scenarios/bad-deployment-scenario.sh inject

# Revert failure
./scenarios/bad-deployment-scenario.sh revert

# With specific namespace
NAMESPACE="bad-deployment-scenario" ./scenarios/bad-deployment-scenario.sh inject
NAMESPACE="bad-deployment-scenario" ./scenarios/bad-deployment-scenario.sh revert

# Check status
kubectl get pods -n bad-deployment-scenario -w

Batch Operations

# Setup all scenarios in parallel
./batch/setup-all-scenarios.sh

# Inject all failures in parallel
./batch/inject-all-scenarios.sh

# Restore all scenarios
./batch/restore-all-scenarios.sh

# Check status of all scenarios
./batch/check-all-scenarios.sh
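
The batch scripts are described as running scenarios in parallel using background processes with PID tracking. How they do this internally is not shown here, so the following is only a sketch of that general pattern; `run_parallel` is a hypothetical name, and each argument is assumed to be a complete command line.

```shell
#!/usr/bin/env bash
# Start every command line in the background, remember each PID, then wait
# on every PID individually so that each failure is counted.
# NOTE: run_parallel is a hypothetical helper, not a script from this repo.
run_parallel() {
  local pids="" failed=0 cmd pid
  for cmd in "$@"; do
    sh -c "$cmd" &          # launch in the background
    pids="$pids $!"         # $! is the PID of the job just started
  done
  for pid in $pids; do
    # wait returns the exit status of that specific background job
    wait "$pid" || failed=$((failed + 1))
  done
  echo "failed=$failed"
  [ "$failed" -eq 0 ]
}

# Usage:
#   run_parallel "./scenarios/bad-deployment-scenario.sh inject" "<another command line>"
```

Waiting on each PID separately, rather than calling a bare `wait`, is what makes it possible to report which and how many jobs failed.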

📚 Documentation

🏗️ Architecture

This system follows a modular, extensible architecture:

├── batch/                    # Batch operations for all scenarios
├── scenarios/                # Individual failure injection scripts
├── lib/                      # Shared utilities and helpers
├── config/                   # Configuration files
│   └── namespace-mapping.conf  # Maps scenarios to custom namespaces
└── bank-of-anthos/           # Google Cloud sample application

Key Design Patterns

  1. Dynamic Scenario Discovery - Automatically discovers *-scenario.sh files
  2. Custom Namespace Mapping - Scenarios deploy to creative movie-themed namespaces
  3. Namespace Override - Same scripts work for user and dedicated namespaces
  4. Parallel Execution - Background processes with PID tracking
  5. Shared Libraries - DRY principle with centralized utilities
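
As an illustration of the first pattern, dynamic discovery can be as simple as globbing for the naming convention. This is a sketch under that assumption, not the repository's own implementation; `discover_scenarios` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Print the scenario name (file name without .sh) for every *-scenario.sh
# file in the given directory, one per line, in glob (sorted) order.
# NOTE: discover_scenarios is a hypothetical helper, not code from this repo.
discover_scenarios() {
  local dir="${1:-scenarios}" file
  for file in "$dir"/*-scenario.sh; do
    [ -e "$file" ] || continue   # the glob matched nothing
    basename "$file" .sh
  done
}

# Usage:
#   discover_scenarios ./scenarios
```

Because the list is computed from the directory contents on every run, dropping a new file that follows the convention is enough for it to be picked up.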

Namespace Naming

Each scenario deploys to a custom namespace inspired by fictional cities from movies:

  • bad-deployment → bank-of-springfield (The Simpsons Movie)
  • database-lock → bank-of-punxsutawney (Groundhog Day)
  • high-load → bank-of-seahaven (The Truman Show)
  • And more! See config/namespace-mapping.conf for the complete list.

All namespaces include a timestamp suffix for isolation: bank-of-springfield-{timestamp}
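
The lookup plus suffix can be sketched as follows. The actual format of config/namespace-mapping.conf and of the timestamp is not shown in this README, so this sketch assumes one `scenario=namespace` pair per line and an epoch-seconds timestamp; `resolve_namespace` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Look up the base namespace for a scenario in a KEY=VALUE mapping file and
# append a timestamp suffix. Falls back to the scenario name when unmapped.
# ASSUMPTIONS: the KEY=VALUE file format and the epoch-seconds timestamp are
# illustrative guesses, not the repository's documented behavior.
resolve_namespace() {
  local scenario="$1" mapping_file="$2" timestamp="${3:-$(date +%s)}" base
  base=$(awk -F= -v key="$scenario" '$1 == key { print $2; exit }' "$mapping_file")
  echo "${base:-$scenario}-${timestamp}"
}

# Usage:
#   resolve_namespace bad-deployment config/namespace-mapping.conf
```

Passing the timestamp as an optional third argument keeps the function deterministic when a caller needs the same suffix for setup, injection, and cleanup.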

Adding a New Scenario

  1. Create scenarios/your-failure-scenario.sh
  2. Follow the *-scenario.sh naming convention
  3. Implement both inject_failure() and revert_failure() functions
  4. Use the template from scenarios/README.md
  5. Test both actions:
    ./scenarios/your-failure-scenario.sh inject
    ./scenarios/your-failure-scenario.sh revert
  6. Submit a pull request

No other code changes are needed: scenarios that follow the naming convention are discovered and integrated automatically.
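
For orientation, a scenario script with the two required functions might be shaped like the skeleton below. It is a hypothetical sketch, not the template from scenarios/README.md, which remains the authoritative starting point; the default namespace and the echo placeholders are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical skeleton for scenarios/your-failure-scenario.sh.
# The namespace can be overridden from the environment, as in the
# NAMESPACE="..." examples under Individual Scenario Testing.
NAMESPACE="${NAMESPACE:-your-failure-scenario}"

inject_failure() {
  echo "Injecting failure into namespace ${NAMESPACE}"
  # The kubectl or helm commands that break the workload would go here.
}

revert_failure() {
  echo "Reverting failure in namespace ${NAMESPACE}"
  # The commands that undo the injected change would go here.
}

# Dispatch on the first argument; inject is the default action.
main() {
  case "${1:-inject}" in
    inject) inject_failure ;;
    revert) revert_failure ;;
    *) echo "Usage: $0 [inject|revert]" >&2; return 1 ;;
  esac
}

# A real script would end with:
#   main "$@"
```

Keeping inject and revert in the same file means a scenario can always undo exactly what it did, without relying on a separate restore script knowing its details.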

๐Ÿ“Š Example Output

$ ./batch/setup-all-scenarios.sh

========================================
  Multi-Namespace Scenario Setup
========================================

Deploying Bank of Anthos to 14 scenario namespaces in parallel...

✓ bad-deployment-scenario deployed (12.3s)
✓ config-misconfigured-scenario deployed (11.8s)
✓ database-lock-scenario deployed (13.1s)
✓ failed-backup-cronjob-scenario deployed (12.0s)
✓ helm-bad-upgrade-scenario deployed (12.7s)
✓ high-load-scenario deployed (11.5s)
✓ kyverno-policy-scenario deployed (14.2s)
✓ limit-range-contacts-scenario deployed (12.9s)
✓ missing-storage-class-scenario deployed (12.4s)
✓ network-policy-scenario deployed (12.1s)
✓ node-selector-scenario deployed (11.9s)
✓ oom-killed-scenario deployed (13.4s)
✓ resource-quota-scenario deployed (12.2s)
✓ wrong-sa-scenario deployed (11.7s)

[SUCCESS] All 14 scenarios deployed successfully!
Total time: 3m 18s

⏱️ Performance Metrics

| Operation | Time | Notes |
| --- | --- | --- |
| Setup all scenarios | ~3 min | 10+ namespaces in parallel (depends on the number of scenarios) |
| Inject all scenarios | ~1-2 min | Parallel failure injection |
| Check all scenarios | ~10 sec | Quick status overview |
| Cleanup all scenarios | ~2 min | Parallel namespace deletion |

🔒 Security

This tool intentionally causes failures in Kubernetes clusters. Please review our Security Policy for:

  • Vulnerability reporting
  • Best practices
  • Known security considerations

⚠️ Important: Only use on dedicated test clusters. Never run on production.
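
One way to make that rule harder to break by accident is to compare the current kubectl context against an allowlist before injecting anything. The sketch below is not part of the repository; `is_allowed_context` and the context names in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Return 0 only if the given context name appears in the allowlist
# (all remaining arguments); return 1 otherwise.
# NOTE: is_allowed_context is a hypothetical helper, not code from this repo.
is_allowed_context() {
  local current="$1" allowed
  shift
  for allowed in "$@"; do
    [ "$current" = "$allowed" ] && return 0
  done
  return 1
}

# Usage (context names are examples only):
#   ctx=$(kubectl config current-context)
#   is_allowed_context "$ctx" kind-chaos-lab minikube || {
#     echo "Refusing to run against context: $ctx" >&2
#     exit 1
#   }
```

An allowlist fails closed: a context that nobody thought to list is rejected, which is the safer default for a tool that breaks workloads on purpose.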

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

🙏 Acknowledgments

📞 Support


Made with ❤️ by Komodor
