This project explores Hyperparameter Optimization (HPO) through the lens of sequential decision-making, addressing the inefficiencies of traditional methods like Grid Search and Random Search. By applying Bayesian Optimization and Q-learning, the project demonstrates how sequential methods can achieve superior performance with fewer evaluations, providing insights into balancing exploration and exploitation.
- Sequential Decision-Making Framework:
  - Framed HPO as a Markov Decision Process (MDP) with states, actions, transitions, and rewards.
  - Compared sequential methods (Bayesian Optimization, Q-learning) against traditional non-sequential methods (Grid Search, Random Search).
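One step of this MDP view can be sketched as follows; the state encoding, action set, and `evaluate` function here are illustrative assumptions, not the project's exact design:

```python
# Sketch of HPO as an MDP (illustrative, not the project's exact API):
#   state  : the current hyperparameter configuration
#   action : a local move perturbing one hyperparameter
#   reward : the validation score of the resulting configuration

def step(state, action, evaluate):
    """Apply an action to a state; return (next_state, reward)."""
    next_state = dict(state)
    key, delta = action
    next_state[key] = next_state[key] + delta
    reward = evaluate(next_state)  # e.g. cross-validated accuracy
    return next_state, reward

# Toy objective that peaks at max_depth = 6 (placeholder for a real model)
evaluate = lambda s: -abs(s["max_depth"] - 6)
state = {"max_depth": 3}
state, reward = step(state, ("max_depth", +1), evaluate)
```

Non-sequential methods evaluate a fixed set of configurations; the sequential framing lets each new evaluation depend on the rewards observed so far.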
- Bayesian Optimization:
  - Used Gaussian Processes to model hyperparameter performance and guide sampling.
  - Implemented acquisition strategies including Expected Improvement (EI), Upper Confidence Bound (UCB), and a greedy approach.
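A minimal sketch of GP-based Bayesian Optimization with Expected Improvement, using scikit-learn on a toy 1-D objective (the objective, kernel choice, and candidate grid are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Toy 1-D objective standing in for a validation score; maximum at x = 0.6.
def objective(x):
    return -(x - 0.6) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))   # initial random evaluations
y = objective(X).ravel()

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6, normalize_y=True)
for _ in range(10):
    gp.fit(X, y)
    cand = np.linspace(0, 1, 200).reshape(-1, 1)
    mu, sigma = gp.predict(cand, return_std=True)
    best = y.max()
    # Expected Improvement: E[max(f - best, 0)] under the GP posterior.
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (mu - best) / sigma
        ei = (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)
        ei[sigma == 0.0] = 0.0
    x_next = cand[np.argmax(ei)]     # sample where EI is highest
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))
```

UCB would replace the EI line with `mu + kappa * sigma`; the greedy approach drops the `sigma` term entirely, which is what makes it prone to stalling in local maxima.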
- Q-Learning:
  - Applied model-free reinforcement learning with an ε-greedy strategy and reward-based feedback.
  - Balanced exploration and exploitation to optimize hyperparameters iteratively.
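The ε-greedy Q-learning loop can be sketched on a discretized hyperparameter axis; the state/action encoding and toy reward below are illustrative assumptions, not the project's exact design:

```python
import random

# Tabular Q-learning over a discretized hyperparameter axis.
values = [1, 2, 3, 4, 5, 6, 7, 8]   # e.g. candidate max_depth values
actions = [-1, +1]                  # move left / right along the grid

def reward(i):
    return -abs(values[i] - 6)      # toy validation score, peaks at value 6

Q = {(s, a): 0.0 for s in range(len(values)) for a in actions}
alpha, gamma, eps = 0.5, 0.9, 0.2
random.seed(0)

state, best_value, best_r = 0, None, float("-inf")
for _ in range(500):
    # ε-greedy: explore with probability eps, otherwise exploit best known action
    if random.random() < eps:
        action = random.choice(actions)
    else:
        action = max(actions, key=lambda a: Q[(state, a)])
    nxt = min(max(state + action, 0), len(values) - 1)
    r = reward(nxt)
    # Q-learning update: bootstrap from the greedy value of the next state
    target = r + gamma * max(Q[(nxt, a)] for a in actions)
    Q[(state, action)] += alpha * (target - Q[(state, action)])
    if r > best_r:
        best_r, best_value = r, values[nxt]
    state = nxt
```

Because each state-action pair must be visited repeatedly for its Q-value to converge, sparse exploration of the table is exactly what produces the instability noted in the results.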
- Experimental Setups:
  - Conducted experiments across three setups:
    - Decision Tree on the Digits dataset.
    - A self-defined black-box objective function over a continuous search space.
    - SVM on Kaggle's Application dataset.
  - Benchmarked all algorithms on accuracy and computational efficiency.
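Each setup reduces to the same interface: a black-box objective mapping a hyperparameter configuration to a validation score. A sketch for the Decision Tree / Digits setup (the two hyperparameters and CV fold count shown are assumptions):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)

def evaluate(max_depth, min_samples_split):
    """Black-box objective: configuration -> cross-validated accuracy."""
    clf = DecisionTreeClassifier(
        max_depth=max_depth,
        min_samples_split=min_samples_split,
        random_state=0,
    )
    return cross_val_score(clf, X, y, cv=3).mean()

score = evaluate(max_depth=8, min_samples_split=2)
```

Grid Search, Random Search, Bayesian Optimization, and Q-learning all consume this same `evaluate` function, differing only in how they choose the next configuration to try.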
- Results:
  - Bayesian Optimization outperformed Grid Search and Random Search in both accuracy and efficiency in most setups.
  - Q-Learning showed promise but exhibited variability and instability due to sparse state-action exploration.
  - Analysis of acquisition strategies highlighted UCB's faster convergence relative to EI, and the tendency of greedy strategies to become trapped in local maxima.
- Programming: Python
- Algorithms: Bayesian Optimization, Q-Learning
- Libraries: Scikit-learn, NumPy, Pandas
- Visualization: Matplotlib, Seaborn
- Clone the repository:
  `git clone https://github.com/yourusername/hpo-sequential.git`