
InnateCoder

InnateCoder: Learning Programmatic Options with Foundation Models

This repository contains the code and data for the paper "InnateCoder: Learning Programmatic Options with Foundation Models" by Rubens O. Moraes, Quazi Sadmine, Hendrik Baier, and Levi Lelis, accepted at IJCAI 2025.

Overview

InnateCoder is a system that uses foundation models to give reinforcement learning agents "innate skills" before they start interacting with their environment. Unlike traditional reinforcement learning approaches where agents start from scratch, InnateCoder provides programmatic options (reusable skills) that make learning more efficient.

How It Works

InnateCoder has three main components:

  1. Learning Options: Uses a foundation model to generate programmatic policies, which are broken down into smaller "options" (reusable skills).

  2. Semantic Space: Creates a meaningful space of programs where similar behaviors are grouped together.

  3. Local Search: Searches through both syntax and semantic spaces to find optimal policies.

Key Benefits

  • More sample-efficient than systems without pre-learned options
  • Leverages human knowledge encoded in foundation models
  • Generates options in a zero-shot manner (before environment interaction)
  • Cost-effective as it only uses the foundation model as a pre-processing step
  • Accessible to smaller labs and companies

Implementation

InnateCoder represents policies as programs written in domain-specific languages (DSLs). Each program receives a state and returns an action for the agent to take. The system was tested in two domains:

  • MicroRTS: A challenging real-time strategy game
  • Karel the Robot: A benchmark for program synthesis and reinforcement learning
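To make the idea of a policy as a program concrete, here is a minimal sketch in plain Python rather than in the repository's DSLs. The `State` fields and the action names are illustrative assumptions and do not reproduce the actual Karel or MicroRTS interfaces.

```python
from dataclasses import dataclass


# Hypothetical grid-world state; NOT the state type used in this repository.
@dataclass(frozen=True)
class State:
    front_is_clear: bool
    marker_present: bool


def policy(state: State) -> str:
    """A toy programmatic policy: receives a state, returns an action name."""
    if state.marker_present:
        return "pickMarker"
    if state.front_is_clear:
        return "move"
    return "turnLeft"


# The agent queries the program once per step with the current state.
print(policy(State(front_is_clear=True, marker_present=False)))  # move
```

Because the policy is ordinary program text, sub-programs such as the inner `if` blocks can be extracted and reused, which is the intuition behind treating program fragments as options.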

Results

Experiments show that InnateCoder is more sample-efficient than versions without options or with options learned from experience. The policies learned are competitive with and often outperform state-of-the-art algorithms.

Repository Information

The complete implementation is available at: https://github.com/rubensolv/InnateCoder

Technical Approach

InnateCoder improves on traditional search methods by exploring the "semantic space" of programs rather than just the "syntax space." This means it focuses on program behavior rather than just structure, allowing for more effective search and better results.
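The difference between the two spaces can be illustrated with a small sketch. It approximates a program's behavior by the actions it returns on a fixed set of probe states; this signature, the probe states, and the three toy programs are assumptions made for illustration and may differ from how the paper defines the semantic space.

```python
from itertools import product
from typing import Callable, Tuple

# Hypothetical state: (front_is_clear, marker_present).
State = Tuple[bool, bool]
Program = Callable[[State], str]


def prog_a(state: State) -> str:
    front, marker = state
    if marker:
        return "pickMarker"
    return "move" if front else "turnLeft"


def prog_b(state: State) -> str:
    # Written differently from prog_a, but it acts the same in every state.
    front, marker = state
    if not marker:
        if front:
            return "move"
        return "turnLeft"
    return "pickMarker"


def prog_c(state: State) -> str:
    front, _ = state
    return "move" if front else "turnRight"


# All four combinations of the two boolean features.
PROBE_STATES = list(product((False, True), repeat=2))


def behavior_signature(program: Program) -> Tuple[str, ...]:
    """Actions chosen on the probe states; a proxy for program behavior."""
    return tuple(program(state) for state in PROBE_STATES)


print(behavior_signature(prog_a) == behavior_signature(prog_b))  # True
print(behavior_signature(prog_a) == behavior_signature(prog_c))  # False
```

In the syntax space `prog_a` and `prog_b` are distinct neighbors, so a search may waste evaluations moving between them; in a behavior-based view they collapse to the same point, while `prog_c` is genuinely different.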

Karel

Karel is a simple programming environment for learning the basics of programming. It provides a set of commands for controlling a robot in a grid world. We used it as a benchmark to evaluate the programmatic options generated by InnateCoder. Our Karel code is based on the repository at: https://github.com/lelis-research/prog_policies.

Dictionaries of LLM-Generated Programs

In the datasets/dictionaries folder, you can find at least one example of a dictionary generated with Large Language Models (LLMs) for each map used in the experiments. These dictionaries contain programmatic options that serve as the foundation for InnateCoder's approach, demonstrating how foundation models can generate useful program fragments without any prior interaction with the environment. Each dictionary includes a diverse set of programs that capture different behaviors and strategies relevant to solving tasks in the Karel domain.

Stochastic Hill Climbing with LLM Dictionary

The stochastic_hill_climbing_LLM_dict.py file implements a specialized search algorithm that combines stochastic hill climbing with a dictionary of programs generated by a Large Language Model (LLM). This approach represents a core component of InnateCoder's implementation for the Karel domain.

Key features of this implementation:

  1. Dictionary-Based Mutation: Instead of relying solely on random mutations, the algorithm leverages a dictionary of program fragments generated by an LLM. This allows the search to make more semantically meaningful changes to programs.

  2. Behavior-Based Dictionary Cleaning: The implementation includes a mechanism to clean the dictionary by removing programs that produce duplicate behaviors, ensuring diversity in the available options.

  3. Adaptive Search Strategy: The algorithm mixes mutations drawn from the LLM-generated dictionary (80% of mutations) with traditional random mutations (20%), providing both guided and exploratory search capabilities.

  4. Local Maximum Escape: When stuck in a local maximum, the algorithm restarts with a completely new random program, allowing it to explore different regions of the program space.

This implementation demonstrates how InnateCoder effectively combines foundation model knowledge (through the LLM-generated dictionary) with traditional search techniques to find optimal programmatic policies more efficiently.
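The four features above can be sketched on a toy problem. This is not the code in stochastic_hill_climbing_LLM_dict.py: the integer DSL, the probe inputs, and the `budget` and `neighbors` values are all assumptions chosen to keep the example small. Only the 80/20 mutation split, the behavior-based cleaning, and the restart on a local maximum come from the description above.

```python
import random

# Toy DSL (an assumption): a program is a non-empty tuple of integer operations.
OPS = {"inc": lambda x: x + 1, "dec": lambda x: x - 1, "dbl": lambda x: 2 * x}
PROBES = (0, 1, 2, 3)  # inputs used to observe behavior
MAX_LEN = 6


def run(program, x):
    for op in program:
        x = OPS[op](x)
    return x


def behavior(program):
    return tuple(run(program, x) for x in PROBES)


def clean_dictionary(fragments):
    """Keep only the first fragment seen for each distinct behavior."""
    seen, kept = set(), []
    for fragment in fragments:
        signature = behavior(fragment)
        if signature not in seen:
            seen.add(signature)
            kept.append(fragment)
    return kept


def score(program, target):
    """0 is a perfect match on the probes; more negative is worse."""
    return -sum(abs(run(program, x) - target(x)) for x in PROBES)


def random_program(rng):
    return tuple(rng.choice(list(OPS)) for _ in range(rng.randint(1, MAX_LEN)))


def mutate(program, dictionary, rng, p_dict=0.8):
    """Replace one position with a dictionary fragment (80%) or a random op (20%)."""
    i = rng.randrange(len(program))
    if dictionary and rng.random() < p_dict:
        piece = rng.choice(dictionary)
    else:
        piece = (rng.choice(list(OPS)),)
    return (program[:i] + piece + program[i + 1:])[:MAX_LEN]


def hill_climb(target, dictionary, seed=0, budget=2000, neighbors=20):
    rng = random.Random(seed)
    dictionary = clean_dictionary(dictionary)
    current = best = random_program(rng)
    evaluations = 0
    while evaluations < budget and score(best, target) < 0:
        candidates = [mutate(current, dictionary, rng) for _ in range(neighbors)]
        evaluations += neighbors
        top = max(candidates, key=lambda p: score(p, target))
        if score(top, target) > score(current, target):
            current = top
        else:
            current = random_program(rng)  # local maximum: restart from scratch
        if score(current, target) > score(best, target):
            best = current
    return best


# Stand-in for an LLM-generated dictionary; the first two fragments behave identically.
fragments = [("inc", "dec"), ("dec", "inc"), ("inc", "dbl"), ("dbl",)]
print(len(clean_dictionary(fragments)))  # 3

target = lambda x: 2 * x + 2
best = hill_climb(target, fragments)
print(best, score(best, target))
```

The best program found is tracked separately from the current one because a restart may land on a worse program than one already visited.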

Dependencies

  • Python 3.8 or higher

To install the required dependencies, run:

pip install -r requirements.txt

For exact version requirements, please refer to the requirements.txt file.

Usage

Running Experiments

You can use the scripts in the scripts folder to run various experiments:

  1. Evaluate LLM-generated Solutions:

    python scripts/run_LLM_solutions_eval.py

    This script evaluates the performance of solutions directly generated by LLMs.

  2. Generate Solution Visualizations:

    python scripts/run_LLM_for_gif.py

    Creates animated GIFs showing how a particular solution executes in the Karel environment.

  3. Run Search Experiments:

    python scripts/run_search.py <parameters>
    # or
    python scripts/run_search_new.py <parameters>

    These scripts perform search-based experiments using parameters defined in their respective classes. Be sure to use StochasticHillClimbingLLMDict as the search algorithm parameter to leverage the LLM-generated dictionary of programmatic options.

For all scripts, you can modify the configuration parameters directly in the files to customize the experiments according to your needs.

MicroRTS

MicroRTS is a research platform for real-time strategy (RTS) games. It is designed to facilitate the development and evaluation of AI algorithms in a competitive gaming environment.
