🚀 Dynamic Load Balancing in Multiprocessor Systems

A production-grade simulator for dynamic load balancing algorithms with AI-powered optimization

Quick Start • Features • Algorithms • Documentation • Contributing

🎬 Quick Start

# Clone the repository
git clone https://github.com/Auankj/dynamic_load_balancer.git
cd dynamic_load_balancer

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Launch the GUI
python main.py

🖼️ Screenshot Preview

The GUI features:

Real-time processor load visualization with color-coded bars
Interactive Gantt chart showing process execution timeline
Live metrics dashboard with performance statistics
Algorithm comparison with side-by-side analysis

🎯 Overview

Load balancing is a critical operating system technique that distributes workloads across multiple processors to maximize efficiency. This simulator provides:

Goal	Description
🚀 Maximize Throughput	Complete more work in less time
⚡ Minimize Response Time	Users get faster responses
⚖️ Optimize Utilization	All processors stay busy
🛡️ Prevent Bottlenecks	No single processor gets overwhelmed

✨ Features

🎮 Core Simulation Engine

Feature	Description
Multi-Processor	Configure 2-16 virtual processors with customizable speed
Process Types	CPU-bound, I/O-bound, Real-time, Batch, Interactive
Workload Patterns	Uniform, Bursty, Poisson, Diurnal, Spike, Wave
13 Algorithms	Load Balancing + Classic CPU Scheduling
AI-Powered	Deep reinforcement learning with PyTorch (GPU accelerated)
Process Migration	Dynamic load rebalancing across processors

🤖 AI Load Balancing

Feature	Description
Q-Learning	Discrete state-space reinforcement learning
Deep Q-Network (DQN)	Neural network with experience replay
Double DQN	Reduced overestimation bias
Prioritized Replay	Focus on important experiences
Model Persistence	Save/load trained models automatically

📊 Advanced Simulation

Feature	Description
Process Types	CPU_BOUND, IO_BOUND, MIXED, REAL_TIME, BATCH, INTERACTIVE
Workload Patterns	UNIFORM, BURSTY, POISSON, DIURNAL, SPIKE, GRADUAL_RAMP, WAVE
Advanced Processors	Multi-level feedback queue, cache simulation, power states
Scenario System	Predefined and custom simulation scenarios
SLA Tracking	Service Level Agreement metrics and violations

🎨 Rich Visualization & Charts

📊 Processor Load Monitor

Real-time visualization of processor loads with intelligent color-coding:

   0%                    50%                   100%
   ├───────────────────────┼───────────────────────┤
   🟢 Green (0-40%)        🟡 Yellow (40-70%)       🔴 Red (70-100%)

Load Bars: Animated horizontal bars showing current CPU utilization
Color Gradient: Smooth transitions between load states
Queue Indicators: Number badges showing pending processes per CPU

📈 Gantt Chart Timeline

Interactive process execution timeline with rich details:

Process Blocks: Color-coded by process priority/type
Time Scale: Auto-adjusting based on simulation duration
Hover Details: Process ID, burst time, start/end times
Migration Lines: Dashed connectors showing process migrations
Zoom & Pan: Mouse wheel zoom, drag to navigate

📉 Real-Time Performance Graphs

Live updating charts with multiple metrics:

Metric Type	Visualization	Update Frequency
CPU Utilization	Line chart with fill	Every simulation tick
Throughput	Stacked area chart	Per completed process
Wait Times	Box plot with outliers	Per process completion
Queue Depths	Multi-line overlay	Every scheduling decision
Load Variance	Gradient heat strip	Continuous

⚖️ Algorithm Comparison Dashboard

Side-by-side performance analysis for all 13 algorithms:

Grouped Bar Charts: Compare avg wait, turnaround, response times
Radar Charts: Multi-dimensional algorithm profiling
Ranking Tables: Sortable by any metric
Export Options: PNG, SVG, CSV data export

🎯 Additional Visualizations

Chart Type	Description
🔄 Queue Visualization	Ready queue depth per processor with process IDs
🎯 Fairness Index	Jain's fairness index dial (0.0 to 1.0)
⏱️ Response Time Distribution	Histogram with percentile markers (p50, p95, p99)
🔥 Utilization Heatmap	Time × Processor matrix showing load intensity
📊 Process Timeline	Individual process lifecycle from arrival to completion
🌡️ System Temperature	Overall system load gauge with thresholds

📈 Comprehensive Analytics

Process Metrics — Turnaround time, waiting time, response time
Processor Metrics — CPU utilization, queue length, throughput
System Metrics — Load variance, Jain's fairness index, migrations
Data Export — JSON and CSV export for external analysis

🏗️ Architecture

┌────────────────────────────────────────────────────────────────────┐
│                           GUI Layer                                │
│                          (gui.py)                                  │
│   ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────────┐  │
│   │ Load Bars  │ │  Metrics   │ │   Charts   │ │   Controls     │  │
│   └────────────┘ └────────────┘ └────────────┘ └────────────────┘  │
└────────────────────────────────┬───────────────────────────────────┘
                                 │
┌────────────────────────────────▼───────────────────────────────────┐
│                       Simulation Layer                             │
│           (simulation.py / enhanced_simulation.py)                 │
│   ┌────────────────────────────────────────────────────────────┐   │
│   │                    SimulationEngine                        │   │
│   │    Time Management • Event Processing • State Control      │   │
│   └────────────────────────────────────────────────────────────┘   │
└──────┬──────────────────┬───────────────────┬──────────────────────┘
       │                  │                   │
┌──────▼──────┐   ┌───────▼───────┐   ┌───────▼───────┐
│Load Balancer│   │   Processor   │   │    Metrics    │
│             │   │               │   │               │
│• RoundRobin │   │• Execution    │   │• Process      │
│• LeastLoaded│   │• Queue Mgmt   │   │• Processor    │
│• Threshold  │   │• Migration    │   │• System       │
│• Q-Learning │   │• Power States │   │• SLA Tracking │
│• DQN        │   │• Cache Sim    │   │               │
└──────┬──────┘   └───────┬───────┘   └───────────────┘
       │                  │
┌──────▼──────────────────▼──────┐
│          Core Layer            │
│   config.py • process.py       │
│   utils.py • validators.py     │
│   advanced_simulation.py       │
│   integration.py               │
└────────────────────────────────┘

Design Patterns

Pattern	Implementation	Purpose
Strategy	LoadBalancer ABC	Swappable algorithms
Factory	LoadBalancerFactory	Algorithm instantiation
Observer	GUI callbacks	Real-time updates
Builder	ScenarioBuilder	Custom scenario creation
Singleton	SimulationLogger	Centralized logging

⚖️ Load Balancing Algorithms

Quick Comparison

Algorithm	Speed	Balance	Adaptability	Best For
Round Robin	⭐⭐⭐	⭐	⭐	Uniform workloads
Least Loaded	⭐⭐	⭐⭐⭐	⭐⭐	Variable workloads
Threshold	⭐⭐	⭐⭐⭐	⭐⭐⭐	Dynamic environments
Q-Learning	⭐⭐⭐	⭐⭐⭐	⭐⭐⭐	Pattern-rich workloads
DQN	⭐⭐	⭐⭐⭐	⭐⭐⭐	Complex continuous states

1. Round Robin

Distributes processes cyclically: P0→P1→P2→P3→P0...

def assign(self, process, processors):
    target = processors[self.current_index]
    self.current_index = (self.current_index + 1) % len(processors)
    return target

✅ Simple, predictable, zero overhead
❌ Ignores actual load, can create imbalance

2. Least Loaded First

Assigns to the processor with the lowest current load

def assign(self, process, processors):
    return min(processors, key=lambda p: p.current_load)

✅ Optimal distribution, adapts to state
❌ O(n) per assignment, monitoring overhead

3. Threshold-Based

Migrates processes when load difference exceeds threshold

def check_balance(self, processors):
    loads = [p.current_load for p in processors]
    if max(loads) - min(loads) > self.threshold:
        self.migrate_process(overloaded, underloaded)

✅ Dynamic rebalancing, prevents severe imbalance
❌ Migration has cost, needs threshold tuning

4. Q-Learning (AI)

Learns optimal assignments through reinforcement learning

def assign(self, process, processors):
    state = self.encode_state(processors, process)
    if self.training and random() < self.epsilon:
        action = random_choice(len(processors))  # Explore
    else:
        action = argmax(self.Q[state])           # Exploit
    return processors[action]

✅ Learns optimal strategy, improves over time
❌ Needs training, initial random behavior

5. Deep Q-Network (DQN)

Neural network approximates Q-function for continuous states

class DQNetwork(nn.Module):
    def __init__(self, state_dim, action_dim):
        self.fc1 = nn.Linear(state_dim, 128)
        self.fc2 = nn.Linear(128, 256)
        self.fc3 = nn.Linear(256, 128)
        self.out = nn.Linear(128, action_dim)

✅ Handles continuous states, excellent generalization
❌ Requires PyTorch, more computationally intensive

🎓 AI Training Guide

Mode	Exploration (ε)	Purpose	When to Use
Train	100% → 5%	Learn optimal strategies	First runs, new patterns
Exploit	Fixed 1%	Use learned knowledge	After training complete

Recommended Training:

Q-Learning: 500-2000+ process assignments
DQN: 1000-5000+ process assignments

📚 Classic CPU Scheduling Algorithms

The simulator also includes 8 classic CPU scheduling algorithms that every OS student should know!

Algorithm Comparison Table

Algorithm	Type	Preemptive	Optimal For	Starvation
FCFS	📋 FIFO	❌	Simplicity	❌
SJF	⏱️ Burst	❌	Avg Wait Time	⚠️
SRTF	⏱️ Burst	✅	Response Time	⚠️
Priority	🎯 Priority	❌	Critical Tasks	⚠️
Priority (P)	🎯 Priority	✅	Urgent Tasks	⚠️
Multilevel Queue	📊 Class	✅	Mixed Workloads	⚠️
MLFQ	🧠 Adaptive	✅	General Purpose	❌
EDF	⏰ Deadline	✅	Real-Time	❌

1️⃣ FCFS – First Come First Served

The OG of schedulers. Whoever comes first gets the CPU first.

def select_next(self):
    return sorted(self.queue, key=lambda p: p.arrival_time)[0]

✅ Simple, no starvation, minimal overhead
❌ Convoy effect — one big process blocks everyone

2️⃣ SJF – Shortest Job First

Picks the process with the shortest burst time. Productivity king! 👑

def select_next(self):
    return min(self.queue, key=lambda p: p.burst_time)

✅ Optimal average waiting time (provably!)
❌ Long jobs may starve, needs burst time prediction

3️⃣ SRTF – Shortest Remaining Time First

The chaotic younger sibling of SJF. Preemptive version!

def should_preempt(self, new_process):
    return new_process.remaining_time < self.current.remaining_time

✅ Even better response time than SJF
❌ High overhead, long jobs get ghosted constantly 👻

4️⃣ Priority Scheduling

CPU goes to the highest priority process. VIP treatment! 🎖️

def select_next(self):
    # Uses aging to prevent starvation
    for p in self.queue:
        p.effective_priority = p.priority - p.wait_time // 10
    return min(self.queue, key=lambda p: p.effective_priority)

✅ Critical processes get attention, flexible
❌ Can starve low priority (solved with aging!)

5️⃣ Round Robin (RR)

Every process gets a time slice (quantum). Fair & democratic! 🗳️

def tick(self):
    self.quantum_remaining -= 1
    if self.quantum_remaining <= 0:
        self.queue.append(self.current)  # Back of queue
        self.current = self.queue.popleft()

✅ Fair, good response time, no starvation
❌ Quantum too small = too many switches, too big = becomes FCFS

6️⃣ Multilevel Queue Scheduling

Think of it like a school with different sections! 🏫

┌─────────────────────────────────────┐
│ Queue 0: System Processes (RR, q=8) │ ← Highest Priority
├─────────────────────────────────────┤
│ Queue 1: Interactive (RR, q=4)      │
├─────────────────────────────────────┤
│ Queue 2: Batch Jobs (FCFS)          │
├─────────────────────────────────────┤
│ Queue 3: Idle Processes             │ ← Lowest Priority
└─────────────────────────────────────┘

✅ Good for categorized workloads, optimizes each queue
❌ Inflexible — processes stuck in their queue

7️⃣ MLFQ – Multilevel Feedback Queue

The genius, adaptive version. Processes can MOVE between queues! 🧠

The Rules:

🆕 New processes start at top queue (highest priority)
⬇️ Use full quantum? Get demoted to lower queue
⬆️ Give up CPU early (I/O)? Stay or get promoted
🔄 Periodic priority boost prevents starvation

def after_quantum(self, process):
    if process.used_full_quantum:
        self.demote(process)  # CPU hog detected!
    else:
        self.promote(process)  # Nice I/O-bound process

✅ Adapts automatically, no prior knowledge needed
❌ Complex to tune, can be gamed

8️⃣ EDF – Earliest Deadline First

For real-time systems. Nearest deadline = gets the CPU! ⏰

def select_next(self):
    return min(self.queue, key=lambda p: p.deadline)

✅ Optimal for single processor real-time (can achieve 100% utilization!)
❌ Deadline miss cascade — one miss can cause many

📁 Project Structure

dynamic_load_balancer/
├── 🎯 Core Modules
│   ├── main.py                 # Application entry point
│   ├── config.py               # Configuration and constants
│   ├── process.py              # Process model and generator
│   ├── processor.py            # Processor execution logic
│   └── simulation.py           # Standard simulation engine
│
├── 🤖 AI & Scheduling Modules
│   ├── load_balancer.py        # Algorithm implementations
│   ├── ai_balancer.py          # Q-Learning balancer
│   ├── dqn_balancer.py         # Deep Q-Network balancer
│   └── scheduling_algorithms.py # Classic CPU schedulers (FCFS, SJF, etc.)
│
├── 🚀 Advanced Simulation
│   ├── advanced_simulation.py  # Enhanced process/processor models
│   ├── enhanced_simulation.py  # Production-grade engine
│   └── integration.py          # Scenario management
│
├── 📊 Support Modules
│   ├── metrics.py              # Performance metrics
│   ├── gui.py                  # Tkinter GUI
│   ├── utils.py                # Logging and export
│   └── validators.py           # Input validation
│
├── 🧪 Testing
│   └── test_suite.py           # 133 comprehensive tests
│
└── 📄 Documentation
    ├── README.md               # This file
    └── requirements.txt        # Dependencies

Module Overview

Module	Purpose	Key Classes
`config.py`	Configuration	SimulationConfig, LoadBalancingAlgorithm
`process.py`	Process model	Process, ProcessGenerator
`processor.py`	Execution	Processor, ProcessorManager
`load_balancer.py`	Algorithms	RoundRobin, LeastLoaded, Threshold
`ai_balancer.py`	Q-Learning	QLearningAgent, StateEncoder
`dqn_balancer.py`	Deep RL	DQNAgent, DQNetwork, PrioritizedReplay
`scheduling_algorithms.py`	CPU Scheduling	FCFS, SJF, SRTF, Priority, MLFQ, EDF
`advanced_simulation.py`	Advanced models	AdvancedProcess, AdvancedProcessor
`enhanced_simulation.py`	Production engine	EnhancedSimulationEngine
`integration.py`	Scenarios	ScenarioBuilder, PerformanceAnalyzer

🎮 Predefined Scenarios

Scenario	Processors	Processes	Pattern	Description
Basic	4	20	Uniform	Standard simulation
CPU Intensive	8	30	Uniform	Long-running computation
I/O Intensive	4	40	Bursty	Frequent blocking
Mixed Workload	6	50	Diurnal	Real-world simulation
Bursty Traffic	4	60	Spike	Sudden load spikes
Real-Time	8	25	Uniform	Strict deadlines
Stress Test	4	100	Spike	Maximum load testing

📊 Performance Metrics

Key Metrics

Metric	Formula	Description
Turnaround Time	`completion - arrival`	Total time in system
Waiting Time	`start - arrival`	Time in queue
Response Time	`first_run - arrival`	Time to first execution
CPU Utilization	`busy_time / total_time`	Processor efficiency
Throughput	`completed / time`	Processes per time unit
Jain's Fairness	`(Σx)² / (n × Σx²)`	Load distribution fairness

Jain's Fairness Index

$$J(x_1, x_2, ..., x_n) = \frac{(\sum_{i=1}^{n} x_i)^2}{n \cdot \sum_{i=1}^{n} x_i^2}$$

J = 1.0: Perfect fairness (equal load)
J = 1/n: Worst case (all load on one processor)

🧪 Testing

# Run all 125 tests
python -m pytest test_suite.py -v

# Run specific test class
python -m pytest test_suite.py::TestDQNBalancer -v

# Run with coverage
python -m pytest test_suite.py --cov=. --cov-report=html

Test Categories

Category	Tests	Coverage
Configuration	6	Config creation, defaults
Process Model	12	Lifecycle, state transitions
Processor	14	Execution, queue management
Load Balancers	20	All algorithm correctness
Q-Learning	15	Agent training, inference
DQN	20	Neural network, replay buffer
Simulation	12	Engine initialization, execution
Metrics	13	Calculations, edge cases
Integration	6	End-to-end workflows
Edge Cases	7	Boundary conditions

📚 API Reference

Quick Examples

# Create and run simulation
from simulation import SimulationEngine
from config import SimulationConfig, LoadBalancingAlgorithm

config = SimulationConfig(num_processors=4, num_processes=20)
engine = SimulationEngine(config)
engine.initialize(algorithm=LoadBalancingAlgorithm.DQN)

while not engine.is_complete():
    engine.step()

result = engine.get_result()
print(f"Avg Turnaround: {result.system_metrics.avg_turnaround_time:.2f}")

# Use scenario builder
from integration import ScenarioBuilder, IntegratedSimulationManager
from advanced_simulation import WorkloadPattern, ProcessType

scenario = (ScenarioBuilder("Custom Test")
    .with_processors(8)
    .with_processes(50)
    .with_workload(WorkloadPattern.BURSTY)
    .with_algorithm(LoadBalancingAlgorithm.DQN)
    .build())

manager = IntegratedSimulationManager(use_enhanced=True)
manager.load_scenario(scenario)
manager.initialize()
manager.start()

⚙️ Configuration

SimulationConfig Options

Parameter	Default	Range	Description
`num_processors`	4	2-16	Number of processors
`num_processes`	20	1-100	Processes to generate
`time_quantum`	4	1-20	Round robin time slice
`min_burst_time`	1	1-100	Minimum burst time
`max_burst_time`	20	1-1000	Maximum burst time
`migration_threshold`	0.3	0.0-1.0	Load diff for migration

💻 Platform Notes

macOS

source venv/bin/activate
python main.py

Windows (PowerShell)

.\venv\Scripts\Activate.ps1
python main.py

Windows (Command Prompt)

venv\Scripts\activate.bat
python main.py

🤝 Contributing

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Write tests for your changes
Ensure all 125 tests pass: python -m pytest test_suite.py
Commit with conventional format: feat(scope): description
Push to your branch: git push origin feature/amazing-feature
Open a Pull Request

Commit Format

type(scope): Brief description

Types: feat, fix, docs, refactor, test, perf

Examples:
- feat(balancer): Add weighted round robin
- fix(gui): Resolve chart rendering issue
- test(dqn): Add edge case tests

📄 License

This project is licensed under the MIT License — see the LICENSE file for details.

Made with ❤️ for learning Operating Systems concepts

v2.0.0 • 125 Tests Passing • Python 3.8+ • PyTorch 2.0+

FilesExpand file tree

README.md

Latest commit

History