A research-driven machine learning project derived from my master's thesis, focusing on adaptive client selection strategies in multi-server federated learning under mobility-induced unavailability.
⚠️ Note
This repository contains a simplified and partially adapted implementation.
The full thesis is currently under embargo for potential publication and is not publicly available.
In real-world federated learning systems, client participation is often unstable due to mobility, network conditions, or availability constraints.
Such dynamics introduce:
- Non-stationary client populations
- Shifting data distributions
- Unstable training behavior
- Degraded model performance
This project investigates how adaptive client mixing strategies can improve training robustness under these conditions.
Traditional federated learning assumes relatively stable client participation.
However, in multi-server environments:
- Clients may move across servers
- Participation may be intermittent
- Effective data distribution changes over time
As a result:
❗ Fixed client selection strategies become suboptimal or unstable
We model client composition as a controllable variable and optimize it dynamically.
- **Client Partitioning**
  - Local clients → stable participants
  - Visitor clients → dynamic / unstable participants
- **Mobility Modeling**
  - Represented as mobility-induced unavailability
  - Modeled as an environmental prior
- **Decision Variable**
  - `p` = proportion of visitor clients
- **Optimization Strategy**
  - Bayesian Optimization (BO)
  - Treated as a noisy black-box optimization problem
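The BO step above can be sketched as a minimal 1-D Gaussian-process loop over the mixing ratio. Everything here is illustrative: `short_run_signal` is a toy objective (not the thesis objective), and the RBF kernel, UCB acquisition, and evaluation budget are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def short_run_signal(p, unavailability):
    """Toy stand-in for the short-run evaluation signal: a noisy score
    whose optimum shifts toward lower visitor ratios as unavailability
    grows (illustrative assumption, not the thesis objective)."""
    best_p = 0.8 * (1.0 - unavailability)
    return -(p - best_p) ** 2 + 0.02 * rng.normal()

def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D point sets."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length_scale) ** 2)

def gp_posterior(X, y, X_star, noise=1e-3):
    """GP posterior mean and variance on the candidate grid."""
    K = rbf(X, X) + noise * np.eye(len(X))
    K_s = rbf(X, X_star)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)   # rbf(x, x) == 1 on the diagonal
    return mean, np.maximum(var, 1e-12)

def optimize_ratio(unavailability, n_iters=15):
    """UCB-based Bayesian Optimization of the visitor ratio p in [0, 1]."""
    grid = np.linspace(0.0, 1.0, 101)
    X = list(rng.uniform(0.0, 1.0, size=3))      # random initial design
    y = [short_run_signal(p, unavailability) for p in X]
    for _ in range(n_iters):
        mean, var = gp_posterior(np.array(X), np.array(y), grid)
        ucb = mean + 2.0 * np.sqrt(var)          # acquisition function
        p_next = float(grid[np.argmax(ucb)])
        X.append(p_next)
        y.append(short_run_signal(p_next, unavailability))
    return X[int(np.argmax(y))]                  # best observed ratio p*

print(f"p* at 10% unavailability: {optimize_ratio(0.1):.2f}")
print(f"p* at 90% unavailability: {optimize_ratio(0.9):.2f}")
```

On this toy objective, the loop recovers the qualitative pattern reported below: the suggested ratio drops as unavailability rises.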
```
Mobility Indicators
        ↓
Unavailability Proxy
        ↓
Client Pool (Local / Visitor)
        ↓
Mixing Ratio p
        ↓
Short-run Evaluation Signal
        ↓
Bayesian Optimization
        ↓
Optimal Ratio p*
        ↓
Federated Training Strategy
```
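The first stage of this pipeline can be illustrated with a simple mobility model. Assuming exponentially distributed client dwell times (an assumption for this sketch, not the thesis's exact formulation), the proxy is the probability that a client departs before the round finishes:

```python
import math

def unavailability_proxy(mean_dwell_time, round_duration):
    """Probability that a mobile client departs before the round ends,
    assuming exponentially distributed dwell times (illustrative model)."""
    return 1.0 - math.exp(-round_duration / mean_dwell_time)

# Shorter dwell times (higher mobility) yield a higher unavailability proxy.
print(f"{unavailability_proxy(30.0, 60.0):.2f}")   # highly mobile client
print(f"{unavailability_proxy(600.0, 60.0):.2f}")  # mostly stationary client
```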
Although this repository provides only a simplified implementation, the underlying research reports the following observations:
- Optimal mixing ratio depends on environment conditions
- Higher visitor ratios improve generalization in stable settings
- Lower visitor ratios improve stability under high unavailability
- Short-run signals can approximate long-run training performance
These findings are derived from controlled experiments in the original thesis.
The optimal mixing ratio varies under different client unavailability conditions.
- Low unavailability (10%): Higher visitor ratio performs better
- Medium unavailability (50%): Moderate behavior
- High unavailability (90%): Lower visitor ratio is more stable
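Why high visitor ratios destabilize training under heavy unavailability can be seen with a round-sampling sketch. The sampling scheme, pool sizes, round size `k`, and dropout rule below are all hypothetical, chosen only to illustrate the effect:

```python
import random

random.seed(0)

def sample_round_clients(local_pool, visitor_pool, p, unavailability, k=10):
    """Draw k participants with a fraction p of visitors; each selected
    visitor then drops out independently with probability `unavailability`
    (hypothetical sampling scheme for illustration)."""
    n_visitor = round(p * k)
    survivors = [v for v in random.sample(visitor_pool, n_visitor)
                 if random.random() >= unavailability]   # visitor dropout
    return random.sample(local_pool, k - n_visitor) + survivors

local_clients = [f"local_{i}" for i in range(50)]
visitor_clients = [f"visitor_{i}" for i in range(50)]

# With p = 0.6 under 90% unavailability, most visitor slots are likely
# lost, shrinking the effective round below its planned size k.
round_members = sample_round_clients(local_clients, visitor_clients,
                                     p=0.6, unavailability=0.9)
print(len(round_members))
```

A lower `p` keeps the round size predictable at the cost of drawing fewer visitor data distributions, which is the trade-off the optimized ratio balances.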
The demo also illustrates how the optimal mixing ratio is identified through an iterative search process.
This repository focuses on the decision-making layer rather than full FL infrastructure.
- `src/simulator.py` → simulates environment-dependent behavior
- `src/objective.py` → defines the reward using short-run signals
- `src/optimizer.py` → implements a simplified Bayesian Optimization loop
Run the demo:

```bash
python demo/run_demo.py
```
This demonstrates:
- Ratio adaptation under different environments
- BO-based decision process
- `demo/` → runnable demo
- `src/` → core modules
- `docs/` → method explanation
- `figures/` → diagrams / results
- Python 3.10+
- Designed for conceptual reproducibility
- Not intended to fully replicate large-scale FL
This project demonstrates:
- Federated learning system modeling
- Optimization under uncertainty
- Simulation-based evaluation
- Research-to-engineering translation
- No full FL training pipeline (e.g., Flower)
- Uses simplified simulation
- Focuses on decision layer
See `docs/method.md`.
This work is based on my master's thesis on:
Adaptive client mixing in multi-server federated learning using Bayesian Optimization
The full thesis is currently under embargo and will be made publicly available once the embargo is lifted.
Machine Learning Engineer (Entry-level)
Focus: Federated Learning / Optimization / AI Systems
Keywords: Federated Learning, Bayesian Optimization, Non-IID, Distributed Systems



