ktruby-oss/DDLSim-Lab

DDLSim-Lab

Distributed simulation environment using bare metal nodes

Python PyTorch MPI Docker K8s Prometheus Grafana ZeroMQ


Distributed Simulation Environment for Systems Research

Created by Kaitlyn Brishae Truby (Kaitlyn Truby) | Contributed by many American universities


Abstract

DDLSim-Lab is an open-source research project that provides a distributed simulation environment running on bare-metal nodes. It serves two purposes: evaluating the performance of simulation workloads on cloud versus bare-metal infrastructure, and providing scalable lab infrastructure for cybersecurity training.
The platform enables experimentation with large-scale, heterogeneous, and failure-prone environments, including edge-cloud hybrid infrastructures.
It gives researchers a reproducible, free-to-use environment for testing novel algorithms, scheduling strategies, and fault-tolerance mechanisms without requiring access to expensive physical testbeds.

Why Bare-Metal is Required

To achieve scientifically valid results, DDLSim-Lab must run on bare-metal infrastructure. Virtualization layers introduce non-deterministic noise that undermines the fidelity of network and performance measurements. Bare-metal deployment ensures:

  • Elimination of virtualization overhead – CPU, memory, and I/O performance reflect true hardware capabilities, essential for accurate scaling studies.
  • Realistic network behavior – Direct access to NICs enables precise emulation of latency, jitter, packet loss, and link failures as they occur in production data centers and edge sites.
  • High-fidelity distributed experiments – Kernel-level networking, SR-IOV, and RDMA support allow experimentation with cutting-edge interconnect technologies without abstraction artifacts.
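
As an illustration of the latency, jitter, and packet-loss emulation mentioned above, the following minimal sketch builds a Linux `tc` netem command line (netem ships with iproute2). It is not taken from the DDLSim-Lab code base: the `LinkProfile` class, the `netem_command` helper, and the interface name `eth0` are assumptions made for this example, and the sketch only constructs the command rather than running it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LinkProfile:
    """Hypothetical description of one impaired link (not a DDLSim-Lab type)."""
    interface: str
    delay_ms: float
    jitter_ms: float = 0.0
    loss_pct: float = 0.0


def netem_command(profile: LinkProfile) -> List[str]:
    """Build the argv for `tc qdisc replace ... netem`; nothing is executed."""
    if profile.delay_ms < 0 or profile.jitter_ms < 0:
        raise ValueError("delay and jitter must be non-negative")
    if not 0.0 <= profile.loss_pct <= 100.0:
        raise ValueError("loss_pct must be between 0 and 100")

    argv = ["tc", "qdisc", "replace", "dev", profile.interface, "root",
            "netem", "delay", f"{profile.delay_ms:g}ms"]
    if profile.jitter_ms > 0:
        # netem takes the jitter as a second value right after the delay
        argv.append(f"{profile.jitter_ms:g}ms")
    if profile.loss_pct > 0:
        argv += ["loss", f"{profile.loss_pct:g}%"]
    return argv


if __name__ == "__main__":
    wan = LinkProfile(interface="eth0", delay_ms=40, jitter_ms=5, loss_pct=0.5)
    print(" ".join(netem_command(wan)))
```

Applying the resulting command requires root privileges on the node, for example by passing the list to `subprocess.run`. Returning the argument list instead of executing it keeps the sketch safe to run anywhere and easy to log alongside an experiment.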

🔗 Project Connections

DDLSim-Lab addresses three critical challenges in systems research:

  1. Lack of bare-metal testbeds – Access to bare-metal infrastructure for experiments is limited and expensive.
  2. Need for reproducible performance evaluation – Comparing cloud vs bare metal requires controlled environments.
  3. Demand for cybersecurity training environments – Scalable, isolated environments for hands-on security training.
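
One ingredient of isolated training environments is giving each team its own non-overlapping address range. The sketch below shows one way to do that with Python's standard `ipaddress` module. The function name `allocate_team_subnets`, the `10.50.0.0/16` pool, and the team names are hypothetical and are not part of DDLSim-Lab.

```python
import ipaddress
from typing import Dict, List


def allocate_team_subnets(pool: str, teams: List[str], prefix_len: int = 24) -> Dict[str, ipaddress.IPv4Network]:
    """Assign each team its own disjoint subnet carved out of `pool`."""
    network = ipaddress.ip_network(pool)
    # subnets() lazily yields consecutive, non-overlapping child networks
    candidates = network.subnets(new_prefix=prefix_len)
    allocation = {}
    for team in teams:
        try:
            allocation[team] = next(candidates)
        except StopIteration:
            raise ValueError(f"pool {pool} is too small for {len(teams)} teams") from None
    return allocation


if __name__ == "__main__":
    for team, subnet in allocate_team_subnets("10.50.0.0/16", ["red", "blue"]).items():
        print(team, subnet)
```

Address separation alone does not isolate anything. A real range would also need firewall rules, VLANs, or network namespaces to enforce the boundaries between these subnets.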

Research Use Cases

The platform supports a wide range of investigative studies:

  • Bare-metal performance benchmarking – Measure and compare hardware performance for various workloads.
  • Cloud vs. bare-metal comparison studies – Evaluate trade-offs between cloud and bare-metal deployments for different applications.
  • Distributed systems protocol testing and validation – Test consensus protocols, distributed databases, and middleware under controlled conditions.
  • Networking experiments (SDN, NFV, 5G, etc.) – Simulate network functions, software-defined networking, and next-generation wireless networks.
  • Cybersecurity attack and defense training – Create realistic training environments for red team/blue team exercises.
  • Kubernetes benchmarking and optimization – Optimize container orchestration performance and resource utilization.
  • Container security and isolation studies – Analyze container escape vulnerabilities and isolation mechanisms.
  • Edge computing simulation – Model heterogeneous edge-cloud hierarchies with constrained devices and intermittent connectivity.
  • High-performance computing (HPC) workload evaluation – Evaluate MPI applications, scientific simulations, and AI training workloads.
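
For the benchmarking and comparison studies listed above, results are often summarized as a median and a tail percentile per environment, plus the relative overhead of one environment over the other. The sketch below shows that arithmetic using only the standard library. The function names and the sample values are made up for illustration. They are neither DDLSim-Lab APIs nor measured results.

```python
import math
import statistics
from typing import Dict, Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def compare_latency(baseline: Sequence[float], candidate: Sequence[float]) -> Dict[str, float]:
    """Summarize two latency samples and the candidate's median overhead in percent."""
    base_median = statistics.median(baseline)
    cand_median = statistics.median(candidate)
    return {
        "baseline_median": base_median,
        "candidate_median": cand_median,
        "median_overhead_pct": (cand_median - base_median) / base_median * 100,
        "baseline_p99": percentile(baseline, 99),
        "candidate_p99": percentile(candidate, 99),
    }


if __name__ == "__main__":
    # Synthetic placeholder values in milliseconds, not real measurements
    bare_metal_ms = [10.0, 12.0, 14.0]
    cloud_ms = [15.0, 18.0, 21.0]
    print(compare_latency(bare_metal_ms, cloud_ms))
```

With only a handful of samples the 99th percentile collapses to the maximum, so a meaningful tail comparison needs many more runs per environment.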

🔑 Key Contributions

DDLSim-Lab introduces several novel contributions to the systems research landscape:

  • Bare-metal simulation framework – A lightweight, high-fidelity simulator for deploying and managing experiments on bare-metal infrastructure.
  • Cloud vs. bare-metal comparison tools – Automated tools for provisioning, configuring, and measuring performance across cloud and bare-metal environments.
  • Network emulation engine – Emulates WAN characteristics such as packet loss, latency, and jitter.
  • Cybersecurity range generator – Automated generation of isolated, scalable environments for cybersecurity training and experimentation.
  • Kubernetes benchmark suite – Standardized benchmarks for evaluating Kubernetes performance under various workloads and configurations.
  • Reproducible experiment pipelines – Integrated logging, version-controlled configurations, and automated report generation ensure experimental repeatability.
  • Real-time observability – Built-in Prometheus exporters and Grafana dashboards provide live metrics on throughput, latency, resource utilization, and error rates.
  • Extensible architecture – Modular design allows researchers to add new workloads, network models, and evaluation metrics.

Installation

# Clone the repository
git clone https://github.com/ktruby-oss/DDLSim-Lab.git
cd DDLSim-Lab

# Install dependencies
pip install -r requirements.txt

# (Optional) Install system-level tools for full functionality
# sudo apt-get install -y iproute2 iptables docker.io docker-compose

Running the Simulator

# Basic 2-node networking simulation with random latency and packet loss
python main.py

# Advanced configuration example
python main.py --config configs/edge_cloud_hybrid.yaml

# Launch the automated daily training pipeline (see docs/automation.md)
./daily_pipeline.sh
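
To give a feel for what a two-node link with random latency and packet loss involves, here is a standalone sketch. It is not the repository's `main.py`: the `simulate_link` function, its parameters, and the Gaussian latency model are assumptions chosen for illustration. Seeding the random generator makes each run repeatable.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkStats:
    sent: int
    delivered: int
    latencies_ms: List[float]

    @property
    def loss_rate(self) -> float:
        return 1 - self.delivered / self.sent if self.sent else 0.0


def simulate_link(packets: int, mean_latency_ms: float = 20.0, jitter_ms: float = 5.0,
                  loss_prob: float = 0.05, seed: Optional[int] = None) -> LinkStats:
    """Send `packets` from node A to node B over a lossy, jittery link."""
    if packets < 0:
        raise ValueError("packets must be non-negative")
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must be between 0 and 1")

    rng = random.Random(seed)  # private generator so runs are reproducible
    latencies = []
    for _ in range(packets):
        if rng.random() < loss_prob:
            continue  # packet dropped in transit
        # Gaussian latency, clamped so it never goes negative
        latencies.append(max(0.0, rng.gauss(mean_latency_ms, jitter_ms)))
    return LinkStats(sent=packets, delivered=len(latencies), latencies_ms=latencies)


if __name__ == "__main__":
    stats = simulate_link(1000, seed=42)
    print(f"delivered {stats.delivered}/{stats.sent}, loss rate {stats.loss_rate:.3f}")
```

Running the function twice with the same `seed` yields identical statistics, which is the property a reproducible experiment pipeline relies on.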

📚 Documentation

🀝 Contributing

We welcome contributions from researchers, engineers, and students worldwide. You can:

  • Add new workload models or network topologies
  • Implement novel benchmarking suites
  • Improve the observability pipeline
  • Write tutorials or publish case studies

Please read our Contributing Guide for details on submitting pull requests, reporting issues, and joining our discussions.

🏒 Call for Infrastructure Partners (Testbeds)

DDLSim-Lab is a fully open-source research project available to any researcher, professor, or student worldwide. To advance this project and enable larger-scale experiments, we are actively seeking contributions of bare-metal servers, cloud computing credits, and advanced networking hardware. Your support directly accelerates systems research and helps the global scientific community. We welcome partnerships with testbeds, universities, and technology providers.

🙏 Acknowledgments

This work is made possible by contributions from:

  • American universities and research institutions
  • Professors and researchers who provide valuable feedback and use cases
  • Government agencies that support open-source systems research initiatives
  • Open-source community – Contributors of PyTorch, MPI4Py, Prometheus, Grafana, and related tools.

📝 Citation

If you use DDLSim-Lab in your research, please cite our project:

@software{truby2025ddlsimlab,
  author = {Truby, Kaitlyn Brishae},
  title = {DDLSim-Lab: Distributed Simulation Environment using Bare Metal Nodes},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/ktruby-oss/DDLSim-Lab}}
}

📜 License

DDLSim-Lab is licensed under the MIT License.

📫 Contact

For questions, collaborations, or media inquiries:

"The best way to predict the future is to simulate it."
– Adapted from Alan Kay