Distributed simulation environment using bare metal nodes
Created by Kaitlyn Brishae Truby (Kaitlyn Truby) | Contributed by many American universities
DDLSim-Lab is an open-source research project that provides a distributed simulation environment running on bare-metal nodes. It serves two purposes: performance evaluation of simulation workloads on cloud vs bare metal, and scalable lab infrastructure for cybersecurity training.
The platform enables experimentation with large-scale, heterogeneous, and failure-prone environments, including edge-cloud hybrid infrastructures.
It provides researchers with a reproducible, cost-free environment to test novel algorithms, scheduling strategies, and fault-tolerance mechanisms without requiring access to expensive physical testbeds.
To achieve scientifically valid results, DDLSim-Lab must run on bare-metal infrastructure. Virtualization layers introduce non-deterministic noise that undermines the fidelity of network and performance measurements. Bare-metal deployment ensures:
- Elimination of virtualization overhead: CPU, memory, and I/O performance reflect true hardware capabilities, essential for accurate scaling studies.
- Realistic network behavior: Direct access to NICs enables precise emulation of latency, jitter, packet loss, and link failures as they occur in production data centers and edge sites.
- High-fidelity distributed experiments: Kernel-level networking, SR-IOV, and RDMA support allow experimentation with cutting-edge interconnect technologies without abstraction artifacts.
DDLSim-Lab addresses three critical challenges in systems research:
- Lack of bare-metal testbeds: Access to bare-metal infrastructure for experiments is limited and expensive.
- Need for reproducible performance evaluation: Comparing cloud vs bare metal requires controlled environments.
- Demand for cybersecurity training environments: Hands-on security training needs scalable, isolated environments.
The platform supports a wide range of investigative studies:
- Bare-metal performance benchmarking: Measure and compare hardware performance for various workloads.
- Cloud vs bare metal comparison studies: Evaluate trade-offs between cloud and bare metal for different applications.
- Distributed systems protocol testing and validation: Test consensus protocols, distributed databases, and middleware under controlled conditions.
- Networking experiments (SDN, NFV, 5G, etc.): Simulate network functions, software-defined networking, and next-generation wireless networks.
- Cybersecurity attack and defense training: Create realistic training environments for red team/blue team exercises.
- Kubernetes benchmarking and optimization: Optimize container orchestration performance and resource utilization.
- Container security and isolation studies: Analyze container escape vulnerabilities and isolation mechanisms.
- Edge computing simulation: Model heterogeneous edge-cloud hierarchies with constrained devices and intermittent connectivity.
- High-performance computing (HPC) workload evaluation: Evaluate MPI applications, scientific simulations, and AI training workloads.
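A cloud vs bare metal comparison usually comes down to summarizing two sets of measurements taken under the same workload. The following is a minimal sketch of that step, not DDLSim-Lab's own tooling: the function names `percentile`, `summarize`, and `relative_overhead` are hypothetical, and the sample values are illustrative placeholders rather than measured results.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty list."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    """Median, 99th percentile, and sample standard deviation of the samples."""
    return {
        "median": statistics.median(samples),
        "p99": percentile(samples, 99),
        "stdev": statistics.stdev(samples),
    }


def relative_overhead(baseline, candidate):
    """Fractional change in median of candidate vs baseline (0.10 = 10% slower)."""
    base = statistics.median(baseline)
    return (statistics.median(candidate) - base) / base


if __name__ == "__main__":
    # Placeholder latencies in milliseconds, NOT real measurements.
    bare_metal_ms = [1.0, 1.1, 0.9, 1.0, 1.2]
    cloud_ms = [1.3, 1.5, 1.2, 1.4, 2.0]
    print("bare metal:", summarize(bare_metal_ms))
    print("cloud:     ", summarize(cloud_ms))
    print(f"median overhead: {relative_overhead(bare_metal_ms, cloud_ms):.1%}")
```

Comparing medians and tail percentiles rather than means keeps a few outliers from dominating the result, which matters when one environment is noisier than the other.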
DDLSim-Lab introduces several novel contributions to the systems research landscape:
- Bare-metal simulation framework: A lightweight, high-fidelity simulator for deploying and managing experiments on bare-metal infrastructure.
- Cloud-bare metal comparison tools: Automated tools for provisioning, configuring, and measuring performance across cloud and bare metal environments.
- Network emulation engine: Advanced network emulation capabilities for simulating WAN characteristics, packet loss, latency, and jitter.
- Cybersecurity range generator: Automated generation of isolated, scalable environments for cybersecurity training and experimentation.
- Kubernetes benchmark suite: Standardized benchmarks for evaluating Kubernetes performance under various workloads and configurations.
- Reproducible experiment pipelines: Integrated logging, version-controlled configurations, and automated report generation ensure experimental repeatability.
- Real-time observability: Built-in Prometheus exporters and Grafana dashboards provide live metrics on throughput, latency, resource utilization, and error rates.
- Extensible architecture: Modular design allows researchers to add new workloads, network models, and evaluation metrics.
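On Linux, latency, jitter, and packet loss of the kind the network emulation engine targets are commonly shaped with the `netem` queueing discipline from `iproute2`. How DDLSim-Lab drives this internally is not documented here, so the following is only a sketch under that assumption. The helper name `netem_command` and its parameters are hypothetical. It builds the `tc` argument list without executing it, since applying a qdisc requires root privileges.

```python
import shlex


def netem_command(interface, delay_ms=0.0, jitter_ms=0.0, loss_pct=0.0):
    """Build (but do not run) a `tc qdisc ... netem` command as an argv list."""
    if delay_ms < 0 or jitter_ms < 0:
        raise ValueError("delay and jitter must be non-negative")
    if not 0 <= loss_pct <= 100:
        raise ValueError("loss_pct must be between 0 and 100")
    if jitter_ms and not delay_ms:
        raise ValueError("netem applies jitter only on top of a base delay")

    # `replace` installs the qdisc, or updates it if one is already present.
    argv = ["tc", "qdisc", "replace", "dev", interface, "root", "netem"]
    if delay_ms:
        argv += ["delay", f"{delay_ms:g}ms"]
        if jitter_ms:
            argv.append(f"{jitter_ms:g}ms")
    if loss_pct:
        argv += ["loss", f"{loss_pct:g}%"]
    return argv


if __name__ == "__main__":
    # "eth0" is a placeholder interface name.
    cmd = netem_command("eth0", delay_ms=50, jitter_ms=5, loss_pct=0.5)
    print(shlex.join(cmd))
```

To apply the settings, the returned list can be passed to `subprocess.run(cmd, check=True)` from a privileged process. Keeping command construction separate from execution makes the shaping parameters easy to log alongside experiment results.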
# Clone the repository
git clone https://github.com/ktruby-oss/DDLSim-Lab.git
cd DDLSim-Lab
# Install dependencies
pip install -r requirements.txt
# (Optional) Install system-level tools for full functionality
# sudo apt-get install -y iproute2 iptables docker.io docker-compose

# Basic 2-node networking simulation with random latency and packet loss
python main.py
# Advanced configuration example
python main.py --config configs/edge_cloud_hybrid.yaml
# Launch the automated daily training pipeline (see docs/automation.md)
./daily_pipeline.sh

- Installation Guide
- Configuration Reference
- Experiment Tutorials
- API Documentation
- Contributor Guidelines
- Code of Conduct
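The basic run in the quick-start commands is described as a 2-node networking simulation with random latency and packet loss. What `main.py` actually does is not shown in this README, so the following is only an illustrative sketch of that idea. The function `simulate_link` and all of its parameters are hypothetical. It uses a seeded random generator so repeated runs give identical results, in line with the project's reproducibility goal.

```python
import random


def simulate_link(num_packets, base_latency_ms, jitter_ms, loss_prob, seed):
    """Simulate one-way delivery between two nodes.

    Returns (latencies_ms, dropped): the latency of each delivered packet
    and the number of packets lost.
    """
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must be between 0 and 1")
    rng = random.Random(seed)  # private generator: same seed, same run
    latencies_ms = []
    dropped = 0
    for _ in range(num_packets):
        if rng.random() < loss_prob:
            dropped += 1
            continue
        # Gaussian jitter around the base latency, clamped at zero.
        latencies_ms.append(max(0.0, rng.gauss(base_latency_ms, jitter_ms)))
    return latencies_ms, dropped


if __name__ == "__main__":
    # Placeholder parameters chosen for illustration only.
    delivered, lost = simulate_link(
        num_packets=1000, base_latency_ms=20.0, jitter_ms=2.0,
        loss_prob=0.01, seed=42,
    )
    print(f"delivered={len(delivered)} lost={lost}")
```

Recording the seed together with the other parameters in the experiment log is enough to replay a run exactly.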
We welcome contributions from researchers, engineers, and students worldwide! For example, you can:
- Add new workload models or network topologies
- Implement novel benchmarking suites
- Improve the observability pipeline
- Write tutorials or publish case studies
Please read our Contributing Guide for details on submitting pull requests, reporting issues, and joining our discussions.
DDLSim-Lab is a fully open-source research project available to any researcher, professor, or student worldwide. To advance this project and enable larger-scale experiments, we are actively seeking contributions of bare-metal servers, cloud computing credits, and advanced networking hardware. Your support directly accelerates systems research and helps the global scientific community. We welcome partnerships with testbeds, universities, and technology providers.
This work is made possible by contributions from:
- American universities and research institutions
- Professors and researchers who provide valuable feedback and use cases
- Government agencies that support open-source systems research initiatives
- Open-source community: Contributors of PyTorch, MPI4Py, Prometheus, Grafana, and related tools.
If you use DDLSim-Lab in your research, please cite our project:
@software{truby2025ddlsimlab,
author = {Truby, Kaitlyn Brishae},
title = {DDLSim-Lab: Distributed Simulation Environment using Bare Metal Nodes},
year = {2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/ktruby-oss/DDLSim-Lab}}
}

DDLSim-Lab is licensed under the MIT License.
For questions, collaborations, or media inquiries:
- Project Lead: Kaitlyn Brishae Truby (Kaitlyn Truby)
- Email: kaitlyn.truby@student.uagc.edu
"The best way to predict the future is to simulate it."
-- Adapted from Alan Kay