This repository contains the code and data for the execution and analysis of the malaria simulation model developed by the Temple University Boni Lab. The model is a stochastic, individual-based model that simulates the transmission of malaria in a population of humans and mosquitoes.
This repository serves two related purposes:
- Code for running the simulation model (both locally for testing and on a cluster for large-scale simulations)
- Code for analyzing the output of the simulation model
This repository is for operation and analysis and requires the binaries compiled by the malaria simulation model repository. To install, simply clone this repository locally. Once cloned, the simulation binaries MaSim and DxGGenerator that are built from the simulation model should be copied to the bin directory of this repository.
These binaries are typically built in the /build/bin directory of the model repository. Given that this model runs both locally on a workstation (typically running Ubuntu or macOS) and on a cluster (typically running CentOS), the binaries should be compiled for the appropriate platforms locally and copied to the bin directory. A CI/CD pipeline to build these binaries should be implemented in the future and this repository should be updated to instead depend on those binaries.
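The copy step described above can be scripted. The following is a minimal sketch, assuming the model repository is checked out locally and has already been built. The function name `install_binaries` and its arguments are hypothetical and are not part of this repository.

```python
import shutil
from pathlib import Path

# Binary names taken from the text above.
BINARIES = ("MaSim", "DxGGenerator")


def install_binaries(model_repo: Path, analysis_repo: Path) -> list[Path]:
    """Copy the binaries from <model_repo>/build/bin to <analysis_repo>/bin.

    Hypothetical helper: raises FileNotFoundError if a binary has not been built.
    """
    source_dir = model_repo / "build" / "bin"
    target_dir = analysis_repo / "bin"
    target_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in BINARIES:
        source = source_dir / name
        if not source.is_file():
            raise FileNotFoundError(f"{source} not found; build the model first")
        # copy2 preserves permission bits, including the executable bit
        copied.append(Path(shutil.copy2(source, target_dir / name)))
    return copied
```

Binaries compiled on a workstation will generally not run on the cluster, so a helper like this would be run separately on each platform against a local build.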
This repository includes a Python package (masim-analysis) with command-line tools for calibration, validation, and workflow automation. Install the package and its dependencies:
# Using pip
pip install -e .
# Or using uv (if available)
uv pip install -e .
For cluster deployment, use the provided build script:
scripts/server_build.bash
source .venv/bin/activate
The package provides several command-line tools to streamline calibration, validation, and simulation workflows. These commands are available after installing the package:
The calibrate command runs the complete calibration pipeline for a country, including configuration generation, simulation execution, and beta map creation.
calibrate <country_code> [-r REPETITIONS] [-o OUTPUT_DIR]
Example:
calibrate moz -r 20 -o output
The validate command executes validation simulations using calibrated parameters and generates validation statistics.
validate <country_code> [-r REPETITIONS] [-o OUTPUT_DIR]
Example:
validate moz -r 50 -o output
The commands tool generates MaSim simulation commands and PBS job files for cluster execution.
Subcommands:
- Generate commands for a single configuration:
  commands generate -c CONFIG_FILE [-o OUTPUT_DIR] [-r REPETITIONS] [-n NAME]
- Batch generate commands from multiple configurations:
  commands batch -i INPUT_DIR -o OUTPUT_DIR [-r REPETITIONS] [-n COMMANDS_FILE]
- Generate PBS job file:
  commands job -f COMMANDS_FILE [-d NODE] [-n JOB_NAME] [-c CORES] [-t HOURS]
Examples:
# Generate commands for a single strategy
commands generate -c conf/moz/moz_validation.yml -o output/moz/validation -r 50
# Batch generate commands from all configs in a directory
commands batch -i conf/rwa -o output/rwa -r 100
# Create a job file for cluster execution
commands job -f commands.txt -d nd01 -n MozValidation -t 24
The masim command launches an interactive terminal menu for executing common workflows without remembering command syntax.
masim
This provides a menu-driven interface for:
- Generating simulation commands
- Batch command generation
- Creating PBS job files
- Running calibration pipelines
- Setting up new country directories
The recommended approach for running simulations is to use the provided command-line tools, which handle configuration generation, command creation, and cluster job submission automatically.
Typical Workflow:
Note: This workflow requires a local preprocessing step to prepare data files before these commands are run on the cluster. At a minimum, this means placing the population, seasonality, treatment access rate, and prevalence raster data files in the data directory and a baseline configuration file in the conf directory for the country being simulated.
- Calibration: Generate a calibrated beta value map for a country
  calibrate <country_code> -r 20
- Validation: Validate the calibrated model
  validate <country_code> -r 50
- Strategy Testing: Generate commands for different treatment strategies
  commands batch -i conf/<country> -o output/<country> -r 100
- Cluster Execution: Create and submit job files
  commands job -f commands.txt -d nd01 -n JobName -t 48
  qsub job.pbs
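The workflow steps above can be collected into a dry-run plan before anything is submitted. This is a sketch only: `workflow_commands` and `format_plan` are hypothetical helpers, and the repetition counts, node name, and job file name are simply the example values used in this README. Nothing is executed; the functions only assemble and print the command lines.

```python
import shlex


def workflow_commands(country_code: str, commands_file: str = "commands.txt",
                      node: str = "nd01", job_name: str = "JobName",
                      hours: int = 48) -> list[list[str]]:
    """Return the example workflow as argument lists, in execution order."""
    return [
        ["calibrate", country_code, "-r", "20"],
        ["validate", country_code, "-r", "50"],
        ["commands", "batch", "-i", f"conf/{country_code}",
         "-o", f"output/{country_code}", "-r", "100"],
        ["commands", "job", "-f", commands_file, "-d", node,
         "-n", job_name, "-t", str(hours)],
        # job.pbs is the job file name used in the example above
        ["qsub", "job.pbs"],
    ]


def format_plan(steps: list[list[str]]) -> str:
    """Render the steps as shell-quoted command lines, one per line."""
    return "\n".join(shlex.join(step) for step in steps)


if __name__ == "__main__":
    print(format_plan(workflow_commands("moz")))
```

Keeping the steps as argument lists means each one could later be passed to `subprocess.run(step, check=True)` if automated execution is wanted.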
For manual execution or testing, the simulation binaries can be called directly from the root folder:
./bin/MaSim -i ./conf/<input_file>.yml -o ./output/<output_folder> -r SQLiteDistrictReporter
Important Notes:
- File paths in configuration files should be relative to the root folder
- Output directories must be created before running simulations
- The MaSim binary runs a single simulation instance; use the commands tool and PBS job submission for parallel execution
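Because each MaSim call runs a single instance and expects its output directory to exist, repeated local test runs need a little bookkeeping. The sketch below builds one argument list per repetition using only the flags shown above. The function name, the per-repetition subfolder layout (`rep_<n>`), and the example paths are assumptions for illustration; this is not a description of what the commands tool generates.

```python
from pathlib import Path


def build_masim_commands(config: Path, output_root: Path, repetitions: int,
                         reporter: str = "SQLiteDistrictReporter") -> list[list[str]]:
    """Return one MaSim argument list per repetition, creating output folders first."""
    commands = []
    for rep in range(repetitions):
        # Assumed layout: one subfolder per repetition (hypothetical convention)
        out_dir = output_root / f"rep_{rep}"
        # MaSim requires the output directory to exist before it runs
        out_dir.mkdir(parents=True, exist_ok=True)
        commands.append([
            "./bin/MaSim",
            "-i", str(config),
            "-o", str(out_dir),
            "-r", reporter,
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, one after another; for anything beyond a handful of runs, the commands tool and a PBS job remain the recommended route.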
Use the following conventions for organizing data and configurations:
- The data folder contains all country-specific data files (organized by country code in subfolders)
- The conf folder contains all configuration files for simulations, organized by country
- Configuration file names should describe the strategy being tested
- Output files should be organized by country and strategy, named as <country code>/<strategy>/<strategy>_<repetition>.db
- Templates for standard configurations are available in the templates folder
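The output naming convention above is easy to encode so that analysis scripts build and parse paths consistently. This is a minimal sketch: both function names are hypothetical, and it assumes the repetition is written as a plain integer without zero padding, which the convention does not specify.

```python
from pathlib import Path


def output_db_path(output_root: Path, country_code: str,
                   strategy: str, repetition: int) -> Path:
    """Build <output_root>/<country code>/<strategy>/<strategy>_<repetition>.db."""
    return output_root / country_code / strategy / f"{strategy}_{repetition}.db"


def parse_output_db_path(path: Path) -> tuple[str, str, int]:
    """Recover (country_code, strategy, repetition) from a conforming path."""
    strategy = path.parent.name
    country_code = path.parent.parent.name
    prefix = f"{strategy}_"
    if path.suffix != ".db" or not path.stem.startswith(prefix):
        raise ValueError(f"{path} does not follow the naming convention")
    # Everything after "<strategy>_" is the repetition number
    return country_code, strategy, int(path.stem[len(prefix):])
```

Taking the strategy name from the parent folder, instead of splitting the file name on underscores, keeps parsing correct for strategy names that themselves contain underscores.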
The simulation generates a large volume of data, and at the moment this repository handles both software and data. To transfer source files (code, configurations, and raster data, that is, anything under the src, conf, or data folders), check them into version control with Git and transfer them using git push/pull. Individual country calibrations should take place on their own branches. Output data can be transferred using the scp command.