This Python library simulates two-player strategic interactions with memory-based decision-making, exogenous shocks, and recovery dynamics. It is designed for research on norm formation, stability, and recovery following temporary disruptions to the payoff environment.
The project enables researchers to conduct large-scale Monte Carlo simulations across parameter spaces, analyze how agents' behavioral patterns respond to shocks, and measure recovery times to equilibrium. It is particularly useful for studying the robustness of social conventions and norms in dynamic game-theoretic settings.
## Features
- Two-Player Game Engine: Configurable N×M games with arbitrary payoff matrices for both players
- Memory-Based Behavior: Agents remember past joint actions and base decisions on that history
- Multiple Response Rules:
  - Exhaustive best responses (condition on distinct actions observed in history)
  - Expected payoff maximization (empirical frequency-based expectations)
  - Epsilon-greedy stochastic actions for bounded rationality
- Shock Analysis:
  - Temporary payoff matrix changes to represent exogenous disruptions
  - Tracking of pre-shock, post-shock, and recovery-phase frequencies
  - Time-to-recovery measurement for each joint action pair
- Initial State Modes:
  - Random initial histories
  - Balanced distributions
  - Canonical (exhaustive enumeration) distributions
- Comprehensive Analysis & Visualization:
  - Pre/post-shock frequency comparisons
  - Recovery time distributions
  - Heatmaps across memory length and noise parameters
  - Recovery rate statistics
  - Individual trajectory plotting
- Monte Carlo Sensitivity Analysis: Systematic parameter sweeps with parallel Monte Carlo runs
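The epsilon-greedy mechanism that both response rules share can be sketched as follows. This is a minimal illustration only; the library's own `epsilon_greedy_choice()` in utils.py may differ in name and signature.

```python
import random

def epsilon_greedy(best_actions, all_actions, epsilon, rng=random):
    """With probability epsilon play a uniformly random action;
    otherwise play uniformly among the (possibly tied) best responses."""
    if rng.random() < epsilon:
        return rng.choice(all_actions)   # explore
    return rng.choice(best_actions)      # exploit

# With epsilon=0 the agent always plays a best response.
action = epsilon_greedy(best_actions=[0], all_actions=[0, 1], epsilon=0.0)
```

Higher epsilon injects more noise into play, which is what makes norms harder to sustain in the sensitivity sweeps below.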
## Requirements

- Python 3.11+
- Conda (recommended) or pip
## Installation

Using Conda (recommended):

```bash
conda env create -f environment.yml
conda activate game-sim-env
```

If you prefer pip, install the required packages:

```bash
pip install numpy pandas matplotlib seaborn openpyxl
```
## Project Structure

```
├── README.md                                 # This file
├── environment.yml                           # Conda environment specification
│
├── simulation.py                             # Core simulation engine
│   ├── simulate_two_player_game()            # Basic game simulation
│   ├── simulate_with_shock()                 # Game simulation with payoff shocks
│   ├── run_sensitivity_with_shock()          # Monte Carlo sensitivity analysis
│   └── compute_*_frequencies()               # Recovery time calculations
│
├── response_functions.py                     # Decision-making strategies
│   ├── exhaustive_epsilon_policy_*()         # Exhaustive best response policies
│   ├── expected_payoff_policy_*()            # Expected payoff maximization policies
│   └── unified_response()                    # Policy selector wrapper
│
├── analysis.py                               # Result aggregation & reporting
│   ├── build_table_norm_epsilon_memory()     # Master results table
│   ├── build_table_norm_shock_pair()         # Shock outcome analysis
│   └── build_table_norm_initial_state_bin()  # Initial state sensitivity
│
├── plotting.py                               # Visualization functions
│   ├── plot_pre_post_frequency_overall()     # Summary bar chart
│   ├── plot_recovery_*_allpairs()            # Recovery vs parameters
│   ├── plot_pre_shock_heatmap_grid()         # Heatmaps (2×2 subplots)
│   ├── plot_individual_run()                 # Trajectory visualization
│   ├── heatmap_standard_deviation_grid()     # Variability analysis
│   └── generate_all_outputs()                # Full report generation
│
├── utils.py                                  # Utility functions
│   ├── get_distinct_actions_from_history()   # Extract observation set
│   ├── best_responses_*()                    # Best response calculation
│   ├── epsilon_greedy_choice()               # Stochastic action selection
│   ├── generate_initial_history()            # History initialization
│   ├── generate_all_distributions()          # Canonical distribution enumeration
│   └── compute_initial_distribution()        # Distribution parsing
│
└── examples.py                               # Example usage & parameter templates
```
## Quick Start

### Basic Simulation

```python
from simulation import simulate_two_player_game

# Define a 2×2 game (Prisoner's Dilemma)
row_player_payoffs = [2, 0, 0, 1]  # (C,C), (C,D), (D,C), (D,D)
col_player_payoffs = [1, 0, 0, 2]

# Run one trajectory
traj_df, freq_df = simulate_two_player_game(
    num_rows=2,
    num_cols=2,
    memory_length=3,
    timeperiod=100,
    row_player_payoffs=row_player_payoffs,
    column_player_payoffs=col_player_payoffs,
    epsilon_row=0.05,
    epsilon_col=0.05,
    response_rule='exhaustive',
    random_seed=42
)

print(traj_df.head())  # Time series of actions
print(freq_df)         # Frequency of each joint action
```

### Simulation with a Payoff Shock

```python
from simulation import simulate_with_shock

# Same setup, but with a temporary payoff shock
shock_payoffs_row = [2, 1, 1, 1]
shock_payoffs_col = [1, 1, 1, 2]

traj_df, freq_df = simulate_with_shock(
    num_rows=2,
    num_cols=2,
    memory_length=3,
    timeperiod=100,
    row_player_payoffs=row_player_payoffs,
    column_player_payoffs=col_player_payoffs,
    shock_payoff_row=shock_payoffs_row,
    shock_payoff_col=shock_payoffs_col,
    epsilon_row=0.05,
    epsilon_col=0.05,
    shock_time=48,     # Shock starts at t=48
    shock_duration=4,  # Lasts 4 periods
    response_rule='exhaustive',
    random_seed=42
)
```

### Monte Carlo Sensitivity Analysis

```python
from simulation import run_sensitivity
from plotting import generate_all_outputs
from utils import generate_all_distributions
import pandas as pd

# Parameters
num_rows, num_cols = 2, 2
row_payoffs = [2, 0, 0, 1]
col_payoffs = [1, 0, 0, 2]
shock_payoffs_row = [2, 1, 1, 1]
shock_payoffs_col = [1, 1, 1, 2]

# Sensitivity ranges
memory_lengths = [1, 2, 3, 4, 5]
epsilons = [0, 0.01, 0.05, 0.1, 0.15]
timeperiods = [100]
shock_times = [48]
shock_durations = [4]

# Precompute canonical distributions
canonical_sets = {}
for mem in memory_lengths:
    canonical_sets[mem] = generate_all_distributions(num_rows, num_cols, mem)

# Run main analysis
results_df = run_sensitivity(
    initial_state_mode='canonical',  # or 'random', 'balanced'
    num_rows=num_rows,
    num_cols=num_cols,
    row_player_payoffs=row_payoffs,
    column_player_payoffs=col_payoffs,
    shock_payoff_row=shock_payoffs_row,
    shock_payoff_col=shock_payoffs_col,
    memory_lengths=memory_lengths,
    epsilons=epsilons,
    timeperiods=timeperiods,
    shock_times=shock_times,
    shock_durations=shock_durations,
    canonical_sets=canonical_sets,
    n_runs_per_setting=10,  # Monte Carlo runs per setting
    export_dir='outputs/analysis',
    iter_name='test_run_1',
    random_seed=123,
    response_rule='exhaustive'
)

# Generate all outputs (tables + visualizations)
outputs = generate_all_outputs(
    results_df,
    output_dir='outputs/figures'
)

# Access generated tables
table1 = outputs['table1']  # Master results table
table2 = outputs['table2']  # Initial state binned analysis
table3 = outputs['table3']  # Shock pair comparison
```

## Response Rules

### Exhaustive Best Response Rule (Default)
```python
# Agents choose an action from the set of best responses to any distinct
# opponent action observed in history, plus epsilon-greedy exploration
response_rule = 'exhaustive'
```

### Expected Payoff Rule

```python
# Agents estimate the empirical frequency distribution of opponent actions,
# compute expected payoffs for each own action, then pick best responses,
# plus epsilon-greedy exploration
response_rule = 'expected'
```

## Output Files

The library generates two types of outputs:
1. Data Files (`export_dir`)
   - `monte_carlo_results_[iter_name].xlsx` — Main results table (one row per setting, run, and pair)
   - `traj_mem*.xlsx` — Individual trajectory files for each run
2. Analysis Files (`output_dir`)
   - `table_norm_epsilon_memory.xlsx` — Summary statistics by memory length and epsilon
   - `table_norm_initial_state_bin.xlsx` — Recovery by initial state distribution
   - `table_norm_shock_pair.xlsx` — Results by shock joint action
   - PNG visualizations — Heatmaps and line plots
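The main results table can also be aggregated directly with pandas. The sketch below is illustrative only: the column names (`memory_length`, `epsilon`, `pair`, `recovery_time`) are assumptions about the schema, not documented columns of this library.

```python
import pandas as pd

# Hypothetical rows shaped like the main results table: one row per
# (setting, run, joint-action pair). Column names are assumed, not documented.
results = pd.DataFrame({
    "memory_length": [1, 1, 2, 2],
    "epsilon":       [0.05, 0.05, 0.05, 0.05],
    "pair":          ["(0,0)", "(1,1)", "(0,0)", "(1,1)"],
    "recovery_time": [3, 7, 2, 5],
})

# Mean recovery time per (memory_length, epsilon) cell, in the spirit of
# the summary tables above.
summary = results.groupby(["memory_length", "epsilon"])["recovery_time"].mean()
print(summary)
```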
## Key Parameters

| Parameter | Type | Description |
|---|---|---|
| `num_rows`, `num_cols` | int | Game dimensions (rows = row player actions, cols = column player actions) |
| `row_player_payoffs` | list | Payoff vector for row player (length = rows × cols) |
| `column_player_payoffs` | list | Payoff vector for column player (length = rows × cols) |
| `shock_payoff_row` | list | Payoff vector during shock period for row player |
| `shock_payoff_col` | list | Payoff vector during shock period for column player |
| Parameter | Type | Default | Description |
|---|---|---|---|
| `memory_length` | int | — | Number of past joint actions agents remember |
| `epsilon_row`, `epsilon_col` | float | 0.1 | Probability of random action (per player) |
| `response_rule` | str | `'exhaustive'` | Decision rule: `'exhaustive'` or `'expected'` |
| `initial_state_mode` | str | — | How to initialize: `'random'`, `'balanced'`, or `'canonical'` |
| Parameter | Type | Description |
|---|---|---|
| `timeperiod` | int | Number of periods to simulate |
| `shock_time` | int | Period when shock begins (or `None` for no shock) |
| `shock_duration` | int | Number of periods shock lasts |
| `n_runs_per_setting` | int | Monte Carlo replications per parameter combination |
| `random_seed` | int | Seed for reproducibility |
## Key Functions

Simulation (simulation.py):
- `simulate_two_player_game()` — Run a single game trajectory
- `simulate_with_shock()` — Single trajectory with payoff shock
- `run_sensitivity()` — Full Monte Carlo analysis across parameter grid

Analysis (analysis.py):
- `build_table_norm_epsilon_memory()` — Aggregate by memory and epsilon
- `build_table_norm_shock_pair()` — Compare shock outcomes
- `build_table_norm_initial_state_bin()` — Bin by initial state and compare

Plotting (plotting.py):
- `generate_all_outputs()` — Produces all tables and plots automatically
- `plot_recovery_heatmap_grid()` — 2×2 grid of recovery time heatmaps
- `plot_pre_shock_heatmap_grid()` — 2×2 grid of pre-shock frequency heatmaps
- `plot_individual_run()` — Trajectory visualization with shock window
## How It Works

This library implements a memory-based best-response model of strategic interaction:

- History: Each agent maintains a list of recent joint actions (action pairs played)
- Play: At each time step, each agent:
  - Extracts distinct opponent actions from history
  - Computes best responses (or expected payoffs)
  - Applies epsilon-greedy action selection
- Update: The new joint action is appended to history (the oldest entry is dropped once memory is exceeded)
- Shock: Payoff matrices can optionally be changed for a temporary window
- Recovery: Time until behavior returns to the pre-shock norm is tracked

Key metric: Recovery time = periods until a norm's frequency returns to its pre-shock level.
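The decision step above can be sketched for the expected payoff rule. This is a simplified stand-in, not the library's implementation in response_functions.py; the function name and signature here are invented for illustration.

```python
from collections import Counter, deque
import random

def expected_payoff_step(history, payoffs, num_own, num_opp, epsilon, rng):
    """One row-player decision: estimate opponent action frequencies from
    memory, maximize expected payoff, then apply epsilon-greedy exploration.
    payoffs[i][j] is the row player's payoff for own action i vs opponent j."""
    counts = Counter(opp for _, opp in history)           # opponent actions seen
    total = sum(counts.values())
    freqs = [counts.get(j, 0) / total for j in range(num_opp)]
    expected = [sum(payoffs[i][j] * freqs[j] for j in range(num_opp))
                for i in range(num_own)]
    best = max(expected)
    best_actions = [i for i, v in enumerate(expected) if v == best]
    if rng.random() < epsilon:
        return rng.randrange(num_own)   # explore
    return rng.choice(best_actions)     # exploit among ties

rng = random.Random(0)
history = deque([(0, 0), (0, 0), (1, 0)], maxlen=3)  # memory_length = 3
pd_payoffs = [[2, 0], [0, 1]]  # row player's payoffs from the Quick Start game
action = expected_payoff_step(history, pd_payoffs, 2, 2, epsilon=0.0, rng=rng)
# The opponent always played action 0 here, so the best response is action 0.
```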
## Dependencies

- `numpy` — Numerical computing
- `pandas` — Data manipulation and analysis
- `matplotlib` — Plotting library
- `seaborn` — Statistical visualization
- `openpyxl` — Excel file I/O

All are included in `environment.yml`.
## Examples

See `examples.py` for a complete working example showing:
- Parameter setup
- Canonical distribution precomputation
- Sensitivity analysis execution
- Output generation
- Result visualization
To run the example:

```bash
python examples.py
```

## Interpreting the Outputs

### Pre/Post-Shock Frequencies

Shows how frequently each joint action is played before and after the shock. Used to identify which norms are disrupted and which persist.
### Recovery Time

The number of periods until a norm's frequency returns to at least its pre-shock level. Longer recovery indicates a less resilient norm.
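Under this definition, recovery time can be computed from a per-period frequency series roughly as follows. This is a hypothetical sketch; the library's actual computation (the `compute_*_frequencies()` helpers) may differ in details such as windowing.

```python
def recovery_time(freq_series, pre_shock_level, shock_end):
    """Periods after the shock window ends until the norm's frequency
    first returns to at least its pre-shock level; None if it never does."""
    for t, f in enumerate(freq_series[shock_end:]):
        if f >= pre_shock_level:
            return t
    return None

# Frequency of one joint action per period; the shock ends after period 2.
freqs = [0.9, 0.9, 0.2, 0.3, 0.6, 0.9, 0.95]
rt = recovery_time(freqs, pre_shock_level=0.9, shock_end=2)  # recovers after 3 periods
```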
### Heatmaps

Visualize how recovery time (or pre-shock frequency) varies across memory length and epsilon. Use them to identify which parameter combinations produce robust versus fragile norms.
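Such a memory × epsilon grid can be assembled from the results table with a pandas pivot. Column names here are illustrative assumptions, not the library's documented schema.

```python
import pandas as pd

# Illustrative results with one mean recovery time per parameter cell.
df = pd.DataFrame({
    "memory_length": [1, 1, 2, 2],
    "epsilon":       [0.0, 0.1, 0.0, 0.1],
    "recovery_time": [2.0, 4.0, 3.0, 6.0],
})

# Rows = memory length, columns = epsilon, cells = mean recovery time.
grid = df.pivot_table(index="memory_length", columns="epsilon",
                      values="recovery_time", aggfunc="mean")
# seaborn can then render the grid, e.g. sns.heatmap(grid, annot=True)
print(grid)
```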
## Tips

- Random Seed: Set `random_seed` for reproducibility
- Memory Length: Larger memory captures more history but increases the state space
- Epsilon: Higher epsilon means more randomness and less stable norms
- Initial State: `'canonical'` requires precomputation but covers the full distribution space
- Export Path: Create parent directories in advance if needed; the library handles subdirectory creation
## License

MIT License

## Citation

If you use this library in your research, please cite it appropriately. A citation format will be added upon publication.