A production-grade simulator for dynamic load balancing algorithms with AI-powered optimization
Quick Start • Features • Algorithms • Documentation • Contributing
```bash
# Clone the repository
git clone https://github.com/Auankj/dynamic_load_balancer.git
cd dynamic_load_balancer

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Launch the GUI
python main.py
```

🖼️ Screenshot Preview
The GUI features:
- Real-time processor load visualization with color-coded bars
- Interactive Gantt chart showing process execution timeline
- Live metrics dashboard with performance statistics
- Algorithm comparison with side-by-side analysis
Load balancing is a critical operating system technique that distributes workloads across multiple processors to maximize efficiency. This simulator provides:
| Goal | Description |
|---|---|
| 🚀 Maximize Throughput | Complete more work in less time |
| ⚡ Minimize Response Time | Users get faster responses |
| ⚖️ Optimize Utilization | All processors stay busy |
| 🛡️ Prevent Bottlenecks | No single processor gets overwhelmed |
| Feature | Description |
|---|---|
| Multi-Processor | Configure 2-16 virtual processors with customizable speed |
| Process Types | CPU-bound, I/O-bound, Real-time, Batch, Interactive |
| Workload Patterns | Uniform, Bursty, Poisson, Diurnal, Spike, Wave |
| 13 Algorithms | 5 load balancing + 8 classic CPU scheduling algorithms |
| AI-Powered | Deep reinforcement learning with PyTorch (GPU accelerated) |
| Process Migration | Dynamic load rebalancing across processors |
| Feature | Description |
|---|---|
| Q-Learning | Discrete state-space reinforcement learning |
| Deep Q-Network (DQN) | Neural network with experience replay |
| Double DQN | Reduced overestimation bias |
| Prioritized Replay | Focus on important experiences |
| Model Persistence | Save/load trained models automatically |
| Feature | Description |
|---|---|
| Process Types | CPU_BOUND, IO_BOUND, MIXED, REAL_TIME, BATCH, INTERACTIVE |
| Workload Patterns | UNIFORM, BURSTY, POISSON, DIURNAL, SPIKE, GRADUAL_RAMP, WAVE |
| Advanced Processors | Multi-level feedback queue, cache simulation, power states |
| Scenario System | Predefined and custom simulation scenarios |
| SLA Tracking | Service Level Agreement metrics and violations |
Real-time visualization of processor loads with intelligent color-coding:
```
0%                      50%                     100%
├───────────────────────┼───────────────────────┤
🟢 Green (0-40%)   🟡 Yellow (40-70%)   🔴 Red (70-100%)
```
- Load Bars: Animated horizontal bars showing current CPU utilization
- Color Gradient: Smooth transitions between load states
- Queue Indicators: Number badges showing pending processes per CPU
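The 40% and 70% band thresholds above can be expressed as a small lookup; the function name `load_color` is illustrative, not part of the project's API.

```python
# Illustrative helper mapping a utilization fraction (0.0-1.0) to the
# three color bands described above; thresholds at 40% and 70%.
def load_color(utilization: float) -> str:
    if utilization < 0.40:
        return "green"   # healthy
    if utilization < 0.70:
        return "yellow"  # warming up
    return "red"         # overloaded
```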
Interactive process execution timeline with rich details:
- Process Blocks: Color-coded by process priority/type
- Time Scale: Auto-adjusting based on simulation duration
- Hover Details: Process ID, burst time, start/end times
- Migration Lines: Dashed connectors showing process migrations
- Zoom & Pan: Mouse wheel zoom, drag to navigate
Live updating charts with multiple metrics:
| Metric Type | Visualization | Update Frequency |
|---|---|---|
| CPU Utilization | Line chart with fill | Every simulation tick |
| Throughput | Stacked area chart | Per completed process |
| Wait Times | Box plot with outliers | Per process completion |
| Queue Depths | Multi-line overlay | Every scheduling decision |
| Load Variance | Gradient heat strip | Continuous |
Side-by-side performance analysis for all 13 algorithms:
- Grouped Bar Charts: Compare avg wait, turnaround, response times
- Radar Charts: Multi-dimensional algorithm profiling
- Ranking Tables: Sortable by any metric
- Export Options: PNG, SVG, CSV data export
| Chart Type | Description |
|---|---|
| 🔄 Queue Visualization | Ready queue depth per processor with process IDs |
| 🎯 Fairness Index | Jain's fairness index dial (0.0 to 1.0) |
| ⏱️ Response Time Distribution | Histogram with percentile markers (p50, p95, p99) |
| 🔥 Utilization Heatmap | Time × Processor matrix showing load intensity |
| 📊 Process Timeline | Individual process lifecycle from arrival to completion |
| 🌡️ System Temperature | Overall system load gauge with thresholds |
- Process Metrics — Turnaround time, waiting time, response time
- Processor Metrics — CPU utilization, queue length, throughput
- System Metrics — Load variance, Jain's fairness index, migrations
- Data Export — JSON and CSV export for external analysis
```
┌────────────────────────────────────────────────────────────────────┐
│                             GUI Layer                              │
│                              (gui.py)                              │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────┐  │
│  │ Load Bars  │ │  Metrics   │ │   Charts   │ │    Controls    │  │
│  └────────────┘ └────────────┘ └────────────┘ └────────────────┘  │
└────────────────────────────────┬───────────────────────────────────┘
                                 │
┌────────────────────────────────▼───────────────────────────────────┐
│                          Simulation Layer                          │
│              (simulation.py / enhanced_simulation.py)              │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │                      SimulationEngine                      │   │
│   │     Time Management • Event Processing • State Control     │   │
│   └────────────────────────────────────────────────────────────┘   │
└──────┬──────────────────┬───────────────────┬──────────────────────┘
       │                  │                   │
┌──────▼──────┐   ┌───────▼───────┐   ┌───────▼───────┐
│Load Balancer│   │   Processor   │   │    Metrics    │
│             │   │               │   │               │
│• RoundRobin │   │• Execution    │   │• Process      │
│• LeastLoaded│   │• Queue Mgmt   │   │• Processor    │
│• Threshold  │   │• Migration    │   │• System       │
│• Q-Learning │   │• Power States │   │• SLA Tracking │
│• DQN        │   │• Cache Sim    │   │               │
└──────┬──────┘   └───────┬───────┘   └───────────────┘
       │                  │
┌──────▼──────────────────▼──────┐
│           Core Layer           │
│     config.py • process.py     │
│    utils.py • validators.py    │
│     advanced_simulation.py     │
│         integration.py         │
└────────────────────────────────┘
```
| Pattern | Implementation | Purpose |
|---|---|---|
| Strategy | LoadBalancer ABC | Swappable algorithms |
| Factory | LoadBalancerFactory | Algorithm instantiation |
| Observer | GUI callbacks | Real-time updates |
| Builder | ScenarioBuilder | Custom scenario creation |
| Singleton | SimulationLogger | Centralized logging |
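As a rough sketch of how the Strategy and Factory patterns from the table work together, consider the following. Class names, the registry, and `create_balancer` are illustrative, not the project's actual API.

```python
# Strategy: balancers share one abstract interface and are swappable.
# Factory: a registry maps algorithm names to concrete classes.
from abc import ABC, abstractmethod

class LoadBalancer(ABC):
    @abstractmethod
    def assign(self, process, processors):
        """Pick a target processor for the given process."""

class RoundRobinBalancer(LoadBalancer):
    def __init__(self):
        self.current_index = 0
    def assign(self, process, processors):
        target = processors[self.current_index]
        self.current_index = (self.current_index + 1) % len(processors)
        return target

class LeastLoadedBalancer(LoadBalancer):
    def assign(self, process, processors):
        return min(processors, key=lambda p: p.current_load)

BALANCERS = {
    "round_robin": RoundRobinBalancer,
    "least_loaded": LeastLoadedBalancer,
}

def create_balancer(name: str) -> LoadBalancer:
    # Factory entry point: instantiate a strategy by name.
    return BALANCERS[name]()
```

Because every strategy implements the same `assign` signature, the simulation engine can swap algorithms at runtime without changing any call sites.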
| Algorithm | Speed | Balance | Adaptability | Best For |
|---|---|---|---|---|
| Round Robin | ⭐⭐⭐ | ⭐ | ⭐ | Uniform workloads |
| Least Loaded | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ | Variable workloads |
| Threshold | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | Dynamic environments |
| Q-Learning | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | Pattern-rich workloads |
| DQN | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐ | Complex continuous states |
Distributes processes cyclically: P0→P1→P2→P3→P0...
```python
def assign(self, process, processors):
    target = processors[self.current_index]
    self.current_index = (self.current_index + 1) % len(processors)
    return target
```

✅ Simple, predictable, zero overhead
❌ Ignores actual load, can create imbalance
Assigns each process to the processor with the lowest current load.

```python
def assign(self, process, processors):
    return min(processors, key=lambda p: p.current_load)
```

✅ Optimal distribution, adapts to current state
❌ O(n) per assignment, monitoring overhead
Migrates processes when the load difference between processors exceeds a threshold.

```python
def check_balance(self, processors):
    loads = [p.current_load for p in processors]
    if max(loads) - min(loads) > self.threshold:
        overloaded = max(processors, key=lambda p: p.current_load)
        underloaded = min(processors, key=lambda p: p.current_load)
        self.migrate_process(overloaded, underloaded)
```

✅ Dynamic rebalancing, prevents severe imbalance
❌ Migration has a cost, threshold needs tuning
Learns optimal assignments through reinforcement learning
```python
def assign(self, process, processors):
    state = self.encode_state(processors, process)
    if self.training and random() < self.epsilon:
        action = random_choice(len(processors))  # Explore
    else:
        action = argmax(self.Q[state])           # Exploit
    return processors[action]
```

✅ Learns an optimal strategy, improves over time
❌ Needs training, behaves randomly at first
Neural network approximates Q-function for continuous states
```python
class DQNetwork(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()  # required for nn.Module subclasses
        self.fc1 = nn.Linear(state_dim, 128)
        self.fc2 = nn.Linear(128, 256)
        self.fc3 = nn.Linear(256, 128)
        self.out = nn.Linear(128, action_dim)
```

✅ Handles continuous states, excellent generalization
❌ Requires PyTorch, more computationally intensive
| Mode | Exploration (ε) | Purpose | When to Use |
|---|---|---|---|
| Train | 100% → 5% | Learn optimal strategies | First runs, new patterns |
| Exploit | Fixed 1% | Use learned knowledge | After training complete |
Recommended Training:
- Q-Learning: 500-2000+ process assignments
- DQN: 1000-5000+ process assignments
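One common way to realize the Train mode's 100% → 5% exploration is a linear epsilon-decay schedule. The function below is a sketch under assumed defaults (2000-step budget), not the simulator's actual schedule.

```python
# Hypothetical linear epsilon decay: exploration starts at 100% and
# falls to 5% over the training budget, then stays there.
def epsilon_at(step: int, total_steps: int = 2000,
               eps_start: float = 1.0, eps_end: float = 0.05) -> float:
    frac = min(step / total_steps, 1.0)  # clamp once the budget is spent
    return eps_start + frac * (eps_end - eps_start)
```

In Exploit mode, the agent would instead use a fixed epsilon of 0.01 regardless of step count.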
The simulator also includes 8 classic CPU scheduling algorithms that every OS student should know!
| Algorithm | Type | Preemptive | Optimal For | Starvation |
|---|---|---|---|---|
| FCFS | 📋 FIFO | ❌ | Simplicity | ❌ |
| SJF | ⏱️ Burst | ❌ | Avg Wait Time | ✅ |
| SRTF | ⏱️ Burst | ✅ | Response Time | ✅ |
| Priority | 🎯 Priority | ❌ | Critical Tasks | ✅ |
| Priority (P) | 🎯 Priority | ✅ | Urgent Tasks | ✅ |
| Multilevel Queue | 📊 Class | ✅ | Mixed Workloads | ✅ |
| MLFQ | 🧠 Adaptive | ✅ | General Purpose | ❌ |
| EDF | ⏰ Deadline | ✅ | Real-Time | ❌ |
The OG of schedulers. Whoever comes first gets the CPU first.
```python
def select_next(self):
    return sorted(self.queue, key=lambda p: p.arrival_time)[0]
```

✅ Simple, no starvation, minimal overhead
❌ Convoy effect — one big process blocks everyone
Picks the process with the shortest burst time. Productivity king! 👑
```python
def select_next(self):
    return min(self.queue, key=lambda p: p.burst_time)
```

✅ Optimal average waiting time (provably!)
❌ Long jobs may starve, needs burst-time prediction
The chaotic younger sibling of SJF. Preemptive version!
```python
def should_preempt(self, new_process):
    return new_process.remaining_time < self.current.remaining_time
```

✅ Even better response time than SJF
❌ High overhead, long jobs get ghosted constantly 👻
CPU goes to the highest priority process. VIP treatment! 🎖️
```python
def select_next(self):
    # Aging: waiting lowers the effective value, raising priority
    for p in self.queue:
        p.effective_priority = p.priority - p.wait_time // 10
    return min(self.queue, key=lambda p: p.effective_priority)
```

✅ Critical processes get attention, flexible
❌ Can starve low-priority processes (solved with aging!)
Every process gets a time slice (quantum). Fair & democratic! 🗳️
```python
def tick(self):
    self.quantum_remaining -= 1
    if self.quantum_remaining <= 0:
        self.queue.append(self.current)        # Back of the queue
        self.current = self.queue.popleft()
        self.quantum_remaining = self.time_quantum  # Fresh slice for the next process
```

✅ Fair, good response time, no starvation
❌ Quantum too small = too many context switches; too big = degenerates into FCFS
Think of it like a school with different sections! 🏫
```
┌─────────────────────────────────────┐
│ Queue 0: System Processes (RR, q=8) │ ← Highest Priority
├─────────────────────────────────────┤
│ Queue 1: Interactive (RR, q=4)      │
├─────────────────────────────────────┤
│ Queue 2: Batch Jobs (FCFS)          │
├─────────────────────────────────────┤
│ Queue 3: Idle Processes             │ ← Lowest Priority
└─────────────────────────────────────┘
```
✅ Good for categorized workloads, optimizes each queue
❌ Inflexible — processes stuck in their queue
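Dispatch in a multilevel queue boils down to serving the highest-priority non-empty queue. A minimal sketch mirroring the diagram above (queue layout and names are illustrative, not the simulator's implementation):

```python
# Four fixed-priority queues: 0 = system (highest) ... 3 = idle (lowest).
# select_next always drains higher-priority queues first.
from collections import deque

queues = [deque() for _ in range(4)]

def select_next():
    for q in queues:          # scan from highest priority down
        if q:
            return q.popleft()
    return None               # nothing ready anywhere
```

Note that each queue may still run its own internal policy (RR for interactive, FCFS for batch); this sketch only shows the between-queue priority rule.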
The genius, adaptive version. Processes can MOVE between queues! 🧠
The Rules:
- 🆕 New processes start at top queue (highest priority)
- ⬇️ Use full quantum? Get demoted to lower queue
- ⬆️ Give up CPU early (I/O)? Stay or get promoted
- 🔄 Periodic priority boost prevents starvation
```python
def after_quantum(self, process):
    if process.used_full_quantum:
        self.demote(process)   # CPU hog detected!
    else:
        self.promote(process)  # Nice I/O-bound process
```

✅ Adapts automatically, no prior knowledge needed
❌ Complex to tune, can be gamed
For real-time systems. Nearest deadline = gets the CPU! ⏰
```python
def select_next(self):
    return min(self.queue, key=lambda p: p.deadline)
```

✅ Optimal for single-processor real-time scheduling (can achieve 100% utilization!)
❌ Deadline miss cascade — one miss can cause many
```
dynamic_load_balancer/
├── 🎯 Core Modules
│   ├── main.py                  # Application entry point
│   ├── config.py                # Configuration and constants
│   ├── process.py               # Process model and generator
│   ├── processor.py             # Processor execution logic
│   └── simulation.py            # Standard simulation engine
│
├── 🤖 AI & Scheduling Modules
│   ├── load_balancer.py         # Algorithm implementations
│   ├── ai_balancer.py           # Q-Learning balancer
│   ├── dqn_balancer.py          # Deep Q-Network balancer
│   └── scheduling_algorithms.py # Classic CPU schedulers (FCFS, SJF, etc.)
│
├── 🚀 Advanced Simulation
│   ├── advanced_simulation.py   # Enhanced process/processor models
│   ├── enhanced_simulation.py   # Production-grade engine
│   └── integration.py           # Scenario management
│
├── 📊 Support Modules
│   ├── metrics.py               # Performance metrics
│   ├── gui.py                   # Tkinter GUI
│   ├── utils.py                 # Logging and export
│   └── validators.py            # Input validation
│
├── 🧪 Testing
│   └── test_suite.py            # 125 comprehensive tests
│
└── 📄 Documentation
    ├── README.md                # This file
    └── requirements.txt         # Dependencies
```
| Module | Purpose | Key Classes |
|---|---|---|
| `config.py` | Configuration | SimulationConfig, LoadBalancingAlgorithm |
| `process.py` | Process model | Process, ProcessGenerator |
| `processor.py` | Execution | Processor, ProcessorManager |
| `load_balancer.py` | Algorithms | RoundRobin, LeastLoaded, Threshold |
| `ai_balancer.py` | Q-Learning | QLearningAgent, StateEncoder |
| `dqn_balancer.py` | Deep RL | DQNAgent, DQNetwork, PrioritizedReplay |
| `scheduling_algorithms.py` | CPU scheduling | FCFS, SJF, SRTF, Priority, MLFQ, EDF |
| `advanced_simulation.py` | Advanced models | AdvancedProcess, AdvancedProcessor |
| `enhanced_simulation.py` | Production engine | EnhancedSimulationEngine |
| `integration.py` | Scenarios | ScenarioBuilder, PerformanceAnalyzer |
| Scenario | Processors | Processes | Pattern | Description |
|---|---|---|---|---|
| Basic | 4 | 20 | Uniform | Standard simulation |
| CPU Intensive | 8 | 30 | Uniform | Long-running computation |
| I/O Intensive | 4 | 40 | Bursty | Frequent blocking |
| Mixed Workload | 6 | 50 | Diurnal | Real-world simulation |
| Bursty Traffic | 4 | 60 | Spike | Sudden load spikes |
| Real-Time | 8 | 25 | Uniform | Strict deadlines |
| Stress Test | 4 | 100 | Spike | Maximum load testing |
| Metric | Formula | Description |
|---|---|---|
| Turnaround Time | `completion - arrival` | Total time in system |
| Waiting Time | `start - arrival` | Time in queue |
| Response Time | `first_run - arrival` | Time to first execution |
| CPU Utilization | `busy_time / total_time` | Processor efficiency |
| Throughput | `completed / time` | Processes per time unit |
| Jain's Fairness | `(Σx)² / (n × Σx²)` | Load distribution fairness |
- J = 1.0: Perfect fairness (equal load)
- J = 1/n: Worst case (all load on one processor)
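The fairness formula is straightforward to compute from the per-processor loads; a minimal sketch (the all-zero case is treated as perfectly fair by convention here):

```python
# Jain's fairness index: (Σx)² / (n · Σx²).
# Returns 1.0 for perfectly even load, 1/n when one processor has it all.
def jains_fairness(loads):
    n = len(loads)
    total = sum(loads)
    squares = sum(x * x for x in loads)
    if squares == 0:
        return 1.0  # no load anywhere: treat as perfectly fair
    return (total * total) / (n * squares)
```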
```bash
# Run all 125 tests
python -m pytest test_suite.py -v

# Run specific test class
python -m pytest test_suite.py::TestDQNBalancer -v

# Run with coverage
python -m pytest test_suite.py --cov=. --cov-report=html
```

| Category | Tests | Coverage |
|---|---|---|
| Configuration | 6 | Config creation, defaults |
| Process Model | 12 | Lifecycle, state transitions |
| Processor | 14 | Execution, queue management |
| Load Balancers | 20 | All algorithm correctness |
| Q-Learning | 15 | Agent training, inference |
| DQN | 20 | Neural network, replay buffer |
| Simulation | 12 | Engine initialization, execution |
| Metrics | 13 | Calculations, edge cases |
| Integration | 6 | End-to-end workflows |
| Edge Cases | 7 | Boundary conditions |
```python
# Create and run simulation
from simulation import SimulationEngine
from config import SimulationConfig, LoadBalancingAlgorithm

config = SimulationConfig(num_processors=4, num_processes=20)
engine = SimulationEngine(config)
engine.initialize(algorithm=LoadBalancingAlgorithm.DQN)

while not engine.is_complete():
    engine.step()

result = engine.get_result()
print(f"Avg Turnaround: {result.system_metrics.avg_turnaround_time:.2f}")
```

```python
# Use scenario builder
from integration import ScenarioBuilder, IntegratedSimulationManager
from advanced_simulation import WorkloadPattern, ProcessType
from config import LoadBalancingAlgorithm

scenario = (ScenarioBuilder("Custom Test")
    .with_processors(8)
    .with_processes(50)
    .with_workload(WorkloadPattern.BURSTY)
    .with_algorithm(LoadBalancingAlgorithm.DQN)
    .build())

manager = IntegratedSimulationManager(use_enhanced=True)
manager.load_scenario(scenario)
manager.initialize()
manager.start()
```

| Parameter | Default | Range | Description |
|---|---|---|---|
| `num_processors` | 4 | 2-16 | Number of processors |
| `num_processes` | 20 | 1-100 | Processes to generate |
| `time_quantum` | 4 | 1-20 | Round robin time slice |
| `min_burst_time` | 1 | 1-100 | Minimum burst time |
| `max_burst_time` | 20 | 1-1000 | Maximum burst time |
| `migration_threshold` | 0.3 | 0.0-1.0 | Load difference triggering migration |
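Putting the parameters together, a configuration might look like the sketch below. The stand-in dataclass only mirrors the table so the snippet is self-contained; the project's actual `SimulationConfig` in `config.py` may define additional fields.

```python
# Stand-in for config.py's SimulationConfig, with defaults from the table above.
from dataclasses import dataclass

@dataclass
class SimulationConfig:
    num_processors: int = 4
    num_processes: int = 20
    time_quantum: int = 4
    min_burst_time: int = 1
    max_burst_time: int = 20
    migration_threshold: float = 0.3

# Override only what differs from the defaults.
config = SimulationConfig(num_processors=8, migration_threshold=0.5)
```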
macOS / Linux:

```bash
source venv/bin/activate
python main.py
```

Windows (PowerShell):

```powershell
.\venv\Scripts\Activate.ps1
python main.py
```

Windows (cmd):

```bat
venv\Scripts\activate.bat
python main.py
```

- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Write tests for your changes
- Ensure all 125 tests pass: `python -m pytest test_suite.py`
- Commit with the conventional format: `feat(scope): description`
- Push to your branch: `git push origin feature/amazing-feature`
- Open a Pull Request
`type(scope): Brief description`

Types: `feat`, `fix`, `docs`, `refactor`, `test`, `perf`

Examples:
- `feat(balancer): Add weighted round robin`
- `fix(gui): Resolve chart rendering issue`
- `test(dqn): Add edge case tests`
This project is licensed under the MIT License — see the LICENSE file for details.