The Micromouse Maze Solver is a Reinforcement Learning (RL) project developed for a midterm assignment in a Reinforcement Learning course. It simulates a virtual Micromouse navigating a randomly generated 20x20 maze to find the optimal path from a start position (1, 1) to a goal (18, 18). The project uses Q-learning, a model-free RL algorithm, to train an agent that learns through trial and error, balancing exploration and exploitation to achieve near-optimal performance. A recursive backtracking algorithm generates solvable mazes, ensuring a valid path exists. The project includes visualization of the maze and agent’s path, training progress plots, and a Breadth-First Search (BFS) baseline for optimality comparison. This implementation demonstrates RL’s application to robotics-inspired problems and lays the foundation for future generalization to arbitrary mazes using Deep Q-Networks (DQN).
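The recursive backtracking generator mentioned above can be sketched as follows. This is an illustrative implementation, not the repo's actual code: it uses a common odd-coordinate cell convention (walls on even rows/columns), so the actual `ql-mouse.py` may index cells differently to place the goal at (18, 18).

```python
import random

def generate_maze(size=21):
    """Carve a maze with recursive backtracking (iterative stack variant).

    Illustrative convention: 1 = wall, 0 = open cell; cells sit at odd
    coordinates with walls between them. Because backtracking visits every
    cell exactly once, a path between any two open cells is guaranteed.
    """
    maze = [[1] * size for _ in range(size)]
    maze[1][1] = 0
    stack = [(1, 1)]
    while stack:
        r, c = stack[-1]
        # Unvisited neighbours two cells away; the wall between gets removed.
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((-2, 0), (2, 0), (0, -2), (0, 2))
                      if 0 < r + dr < size - 1 and 0 < c + dc < size - 1
                      and maze[r + dr][c + dc] == 1]
        if neighbours:
            nr, nc = random.choice(neighbours)
            maze[(r + nr) // 2][(c + nc) // 2] = 0  # knock out separating wall
            maze[nr][nc] = 0
            stack.append((nr, nc))
        else:
            stack.pop()  # dead end: backtrack
    return maze
```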
- Objective: Train an RL agent to navigate a 20x20 maze efficiently using Q-learning.
- Environment: A 20x20 grid maze generated with recursive backtracking, with states (400 cells), actions (up, down, left, right), and rewards (+100 for goal, -10 for walls, -1 per step).
- Algorithm: Q-learning with epsilon-greedy exploration (decaying from 1.0 to 0.01 over 1000 episodes).
- Evaluation: Over 10 test trials, the agent averages 70.4 steps and a reward of 30.6, closely matching the BFS optimal path of 70 steps.
- Key Features:
- Random maze generation for varied testing.
- Visualization of maze, training progress, and test paths.
- BFS comparison for performance validation.
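The tabular Q-learning update and decaying epsilon-greedy policy described above can be sketched as follows. The state/action sizes and the 1.0 → 0.01 decay over 1000 episodes come from the project; the learning rate and discount factor here are illustrative assumptions, not necessarily the values used in `ql-mouse.py`.

```python
import numpy as np

n_states, n_actions = 400, 4        # 20x20 cells; up, down, left, right
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99            # illustrative learning rate / discount
epsilon, eps_min = 1.0, 0.01
decay = (eps_min / epsilon) ** (1 / 1000)  # multiply per episode: 1.0 -> 0.01

def choose_action(state, rng=np.random.default_rng()):
    # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))

def update(state, action, reward, next_state):
    # One-step Q-learning (off-policy TD) update.
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])
```

Per episode, `epsilon` is multiplied by `decay`, so early episodes are almost entirely random exploration while late episodes mostly follow the learned Q-table.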
- Python 3.7+
- Libraries:
  - numpy (for array operations)
  - matplotlib (for visualization)
  - random (included in Python standard library)
Install dependencies using:
pip install numpy matplotlib
Clone the repository:
git clone https://github.com/[Your-GitHub-Username]/micromouse-rl.git
cd micromouse-rl
Ensure Python and required libraries are installed (see Requirements).
(Optional) Create a virtual environment:
python -m venv gymenv
source gymenv/bin/activate  # On Windows: gymenv\Scripts\activate
pip install numpy matplotlib
Run the main script to generate a maze, train the agent, and test it:
python ql-mouse.py
The script will:
- Generate a random 20x20 maze.
- Display the initial maze.
- Train the agent for 1000 episodes, showing progress every 100 episodes.
- Plot training reward and step progress.
- Test the agent over 10 trials, displaying the first path and metrics.
- Compare results to the BFS optimal path.
Output includes:
- Maze visualizations (initial and test path).
- Training plots (rewards and steps).
- Test metrics: average steps (70.4), average reward (30.6), BFS optimal (70).
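The BFS baseline used for the optimality comparison can be sketched like this. The open-cell convention (`maze[r][c] == 0` means open) and the default start/goal coordinates mirror the project's description but are assumptions about the actual code.

```python
from collections import deque

def bfs_shortest_path(maze, start=(1, 1), goal=(18, 18)):
    """Return the length in steps of the shortest open-cell path from
    start to goal, or None if the goal is unreachable.
    Assumed convention: maze[r][c] == 0 means the cell is open."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < len(maze) and 0 <= nc < len(maze[0])
                    and maze[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return None
```

Because BFS explores the grid level by level with unit step costs, the first time it reaches the goal is provably along a shortest path, which makes it a sound lower bound for the agent's step count.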
- ql-mouse.py: Main script containing maze generation, Q-learning agent, training, testing, and BFS baseline.
- README.md: This file, providing project overview and instructions.
- Training: Over 1000 episodes, the agent improves from -800 reward and 800 steps (random exploration) to 31 reward and 70 steps, showing convergence.
- Testing: Across 10 trials, the agent averages 70.4 steps and 30.6 reward, with steps ranging from 70 to 72, closely matching the BFS optimal of 70 steps.
- Visualization: Matplotlib plots show the maze, training progress, and the agent’s path from (1, 1) to (18, 18).
- Maze-Specificity: Q-learning’s tabular approach ties the agent to a single maze; a new layout requires retraining.
- Slow Initial Learning: Early episodes hit -800 reward due to the 800-step limit, reflecting the challenge of sparse rewards in a 20x20 maze.
The project will be extended for the final course assignment to train a generalizable agent capable of solving any arbitrary maze without retraining. This will involve:
- Implementing a Deep Q-Network (DQN) to learn a policy across diverse maze layouts.
- Training on a variety of random mazes to enable adaptation to unseen configurations.
- Exploring adaptive reward structures (e.g., distance-based rewards) to enhance efficiency.
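One way the distance-based rewards mentioned above could look is a shaping bonus proportional to the reduction in Manhattan distance to the goal. This is a hypothetical sketch of future work, not implemented in the project; the function name, `beta` weight, and goal default are all illustrative.

```python
def shaped_reward(base_reward, pos, next_pos, goal=(18, 18), beta=0.5):
    """Add a bonus proportional to the decrease in Manhattan distance to
    the goal (close to potential-based shaping when the discount is ~1,
    which tends to preserve the optimal policy). Illustrative only."""
    def dist(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    return base_reward + beta * (dist(pos) - dist(next_pos))
```

Denser feedback of this kind would directly address the sparse-reward problem noted in the limitations, since steps toward the goal are rewarded even before the goal is first reached.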
- Inspired by Micromouse competitions and RL concepts from Sutton & Barto’s Reinforcement Learning: An Introduction.
- Maze generation adapted from recursive backtracking algorithms in Cormen et al., Introduction to Algorithms.