ironLegion88/micromouse-rl
Micromouse Maze Solver

About

The Micromouse Maze Solver is a Reinforcement Learning (RL) project developed for a midterm assignment in a Reinforcement Learning course. It simulates a virtual Micromouse navigating a randomly generated 20x20 maze to find the optimal path from a start position (1, 1) to a goal (18, 18). The project uses Q-learning, a model-free RL algorithm, to train an agent that learns through trial and error, balancing exploration and exploitation to achieve near-optimal performance. A recursive backtracking algorithm generates solvable mazes, ensuring a valid path exists. The project includes visualization of the maze and agent’s path, training progress plots, and a Breadth-First Search (BFS) baseline for optimality comparison. This implementation demonstrates RL’s application to robotics-inspired problems and lays the foundation for future generalization to arbitrary mazes using Deep Q-Networks (DQN).

Project Overview

  • Objective: Train an RL agent to navigate a 20x20 maze efficiently using Q-learning.
  • Environment: A 20x20 grid maze generated with recursive backtracking, with a state space of 400 cells, four actions (up, down, left, right), and rewards of +100 for reaching the goal, -10 for hitting a wall, and -1 per step.
  • Algorithm: Q-learning with epsilon-greedy exploration (decaying from 1.0 to 0.01 over 1000 episodes).
  • Evaluation: The agent averages 70.4 steps and 30.6 reward over 10 test trials, closely matching the BFS optimal path length of 70 steps.
  • Key Features:
    • Random maze generation for varied testing.
    • Visualization of maze, training progress, and test paths.
    • BFS comparison for performance validation.
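The Q-learning pieces described above (the tabular update, epsilon-greedy action selection, and the 1.0-to-0.01 epsilon decay) can be sketched as follows. This is a minimal illustration, not the code from ql-mouse.py: the learning rate, discount factor, and the linear form of the decay schedule are assumptions.

```python
import numpy as np

# Minimal sketch of tabular Q-learning for a 400-state, 4-action maze.
# alpha (learning rate) and gamma (discount factor) are assumed values,
# not taken from ql-mouse.py.

def q_update(Q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q

def epsilon_greedy(Q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return int(rng.integers(4))        # four actions: up, down, left, right
    return int(np.argmax(Q[state]))

def epsilon_at(episode, n_episodes=1000, eps_start=1.0, eps_end=0.01):
    """Linear decay from 1.0 to 0.01 over 1000 episodes (the project's schedule may differ)."""
    frac = min(episode / (n_episodes - 1), 1.0)
    return eps_start + frac * (eps_end - eps_start)

# A Q-table matching the 20x20 grid: 400 states, 4 actions.
Q = np.zeros((400, 4))
```

With rewards of -1 per step, -10 for walls, and +100 at the goal, repeated updates propagate value backward from the goal cell, which is why later episodes converge toward the 70-step optimum.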

Requirements

  • Python 3.7+
  • Libraries:
    • numpy (for array operations)
    • matplotlib (for visualization)
    • random (included in Python standard library)

Install dependencies using:

pip install numpy matplotlib

Installation

  1. Clone the repository:

    git clone https://github.com/ironLegion88/micromouse-rl.git
    cd micromouse-rl
  2. Ensure Python and required libraries are installed (see Requirements).

  3. (Optional) Create a virtual environment:

    python -m venv gymenv
    source gymenv/bin/activate  # On Windows: gymenv\Scripts\activate
    pip install numpy matplotlib

Usage

  1. Run the main script to generate a maze, train the agent, and test it:

    python ql-mouse.py
  2. The script will:

    • Generate a random 20x20 maze.
    • Display the initial maze.
    • Train the agent for 1000 episodes, showing progress every 100 episodes.
    • Plot training reward and step progress.
    • Test the agent over 10 trials, displaying the first path and metrics.
    • Compare results to the BFS optimal path.
  3. Output includes:

    • Maze visualizations (initial and test path).
    • Training plots (rewards and steps).
    • Test metrics: average steps (70.4), average reward (30.6), BFS optimal (70).
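The BFS baseline used for the optimality comparison can be sketched as follows. The grid encoding (0 = open cell, 1 = wall) is an assumption for illustration and may differ from ql-mouse.py:

```python
from collections import deque

# Sketch of a BFS baseline that returns the shortest path length
# in steps, assuming a grid of 0 = open cell and 1 = wall.

def bfs_path_length(maze, start, goal):
    """Return the number of steps on a shortest path, or -1 if unreachable."""
    rows, cols = len(maze), len(maze[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return -1
```

Because BFS explores cells in order of distance from the start, the first time it dequeues the goal it has found a shortest path, which makes it a reliable optimality yardstick for the learned policy.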

Project Structure

  • ql-mouse.py: Main script containing maze generation, Q-learning agent, training, testing, and BFS baseline.
  • README.md: This file, providing project overview and instructions.
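The recursive-backtracking maze generation mentioned above might look like the following iterative sketch. The wall encoding (1 = wall, 0 = open) and the odd-cell carving layout are assumptions; the actual ql-mouse.py implementation may differ:

```python
import random

# Sketch of recursive-backtracking maze generation (iterative stack form).
# Cells at odd coordinates are carved open; walls between them are knocked
# down as the backtracker visits each cell exactly once.

def generate_maze(size=20, seed=None):
    """Return a size x size grid of 1 = wall, 0 = open, solvable by construction."""
    rng = random.Random(seed)
    maze = [[1] * size for _ in range(size)]
    stack = [(1, 1)]
    maze[1][1] = 0
    while stack:
        r, c = stack[-1]
        # Unvisited cells two steps away in each direction.
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((-2, 0), (2, 0), (0, -2), (0, 2))
                      if 0 < r + dr < size - 1 and 0 < c + dc < size - 1
                      and maze[r + dr][c + dc] == 1]
        if neighbours:
            nr, nc = rng.choice(neighbours)
            maze[(r + nr) // 2][(c + nc) // 2] = 0  # knock down the wall between
            maze[nr][nc] = 0
            stack.append((nr, nc))
        else:
            stack.pop()  # dead end: backtrack
    return maze
```

Because the backtracker carves a spanning tree over the cells, every carved cell is reachable from (1, 1), which is why the generated mazes are guaranteed solvable.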

Results

  • Training: Over 1000 episodes, the agent improves from -800 reward and 800 steps (random exploration) to 31 reward and 70 steps, showing convergence.
  • Testing: Across 10 trials, the agent averages 70.4 steps and 30.6 reward, with steps ranging from 70 to 72, closely matching the BFS optimal of 70 steps.
  • Visualization: Matplotlib plots show the maze, training progress, and the agent’s path from (1, 1) to (18, 18).

Limitations

  • Maze-Specificity: Q-learning’s tabular approach ties the agent to a single maze; a new layout requires retraining.
  • Slow Initial Learning: Early episodes hit -800 reward due to the 800-step limit, reflecting the challenge of sparse rewards in a 20x20 maze.

Future Work

The project will be extended for the final course assignment to train a generalizable agent capable of solving any arbitrary maze without retraining. This will involve:

  • Implementing a Deep Q-Network (DQN) to learn a policy across diverse maze layouts.
  • Training on a variety of random mazes to enable adaptation to unseen configurations.
  • Exploring adaptive reward structures (e.g., distance-based rewards) to enhance efficiency.
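As a concrete illustration of the last point, a distance-based shaped reward could add a progress term to the existing -1 step cost. The shaping weight and the choice of Manhattan distance here are hypothetical, not part of the current implementation:

```python
# Hypothetical distance-based reward shaping for the future-work idea above.
# The shaping weight (0.5) and the use of Manhattan distance are assumptions.

def manhattan(a, b):
    """Grid distance between two (row, col) positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def shaped_reward(prev_pos, new_pos, goal, step_cost=-1.0, weight=0.5):
    """Base step cost plus a bonus for net progress toward the goal."""
    progress = manhattan(prev_pos, goal) - manhattan(new_pos, goal)
    return step_cost + weight * progress
```

Shaping like this densifies the sparse reward signal, which could speed up the slow early learning noted in Limitations, at the cost of possibly biasing the agent toward greedy moves in mazes with long detours.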

Acknowledgments

  • Inspired by Micromouse competitions and RL concepts from Sutton & Barto’s Reinforcement Learning: An Introduction.
  • Maze generation adapted from recursive backtracking algorithms in Cormen et al., Introduction to Algorithms.
