|
1 | | -# Evolving Symbolic Regression with OpenEvolve on LLM-SRBench 🧬🔍 |
| 1 | +# OpenEvolve |
2 | 2 |
|
3 | | -This example demonstrates how **OpenEvolve** can be utilized to perform **symbolic regression** tasks using the **LLM-SRBench benchmark** (highlighted at ICML 2025). It showcases OpenEvolve's capability to evolve Python code, transforming simple mathematical expressions into more complex and accurate models that fit given datasets. |
| 3 | +An open-source implementation of the AlphaEvolve system described in the Google DeepMind paper "AlphaEvolve: A coding agent for scientific and algorithmic discovery" (2025). |
4 | 4 |
|
5 | | ------- |
| 5 | + |
6 | 6 |
|
7 | | -## 🎯 Problem Description: Symbolic Regression on LLM-SRBench |
| 7 | +## Overview |
8 | 8 |
|
9 | | -**Symbolic Regression** is the task of discovering a mathematical expression that best fits a given dataset. Unlike traditional regression techniques that optimize parameters for a predefined model structure, symbolic regression aims to find both the **structure of the model** and its **parameters**. |
| 9 | +OpenEvolve is an evolutionary coding agent that uses Large Language Models to optimize code through an iterative process. It orchestrates a pipeline of LLM-based code generation, evaluation, and selection to continuously improve programs for a variety of tasks. |
10 | 10 |
|
11 | | -This example leverages **LLM-SRBench**, a benchmark specifically designed for Large Language Model-based Symbolic Regression. The core objective is to use OpenEvolve to evolve an initial, often simple, model (e.g., a linear model) into a more sophisticated symbolic expression. This evolved expression should accurately capture the underlying relationships within various scientific datasets provided by the benchmark. |
| 11 | +Key features: |
| 12 | +- Evolution of entire code files, not just single functions |
| 13 | +- Support for multiple programming languages |
| 14 | +- Support for any LLM provider through OpenAI-compatible APIs
| 15 | +- Multi-objective optimization |
| 16 | +- Flexible prompt engineering |
| 17 | +- Distributed evaluation |
12 | 18 |
|
13 | | ------- |
| 19 | +## How It Works |
14 | 20 |
|
15 | | -## 🚀 Getting Started |
| 21 | +OpenEvolve follows an evolutionary approach with the following components: |
16 | 22 |
|
17 | | -Follow these steps to set up and run the symbolic regression benchmark example: |
| 23 | + |
18 | 24 |
|
19 | | -### 1. Configure API Secrets |
| 25 | +1. **Prompt Sampler**: Creates context-rich prompts containing past programs, their scores, and problem descriptions |
| 26 | +2. **LLM Ensemble**: Generates code modifications via an ensemble of language models |
| 27 | +3. **Evaluator Pool**: Tests generated programs and assigns scores |
| 28 | +4. **Program Database**: Stores programs and their evaluation metrics, guiding future evolution |
20 | 29 |
|
21 | | -You'll need to provide your API credentials for the language models used by OpenEvolve. |
| 30 | +The controller orchestrates interactions between these components in an asynchronous pipeline, maximizing throughput to evaluate as many candidate solutions as possible. |
22 | 31 |
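| | +In rough pseudocode, one pass through this pipeline can be pictured with the toy loop below. This is an illustrative sketch only, not OpenEvolve's actual internals: candidates here are plain coefficient lists and the "LLM" is a random mutation, but the sample, generate, evaluate, store flow mirrors the component roles listed above.
| | +
| | +```python
| | +import random
| | +
| | +def evaluate(candidate):
| | +    # Evaluator Pool stand-in: score a candidate (higher is better)
| | +    target = [3.0, -1.0, 2.0]
| | +    return -sum((a - b) ** 2 for a, b in zip(candidate, target))
| | +
| | +def mutate(candidate):
| | +    # LLM Ensemble stand-in: propose a modification to a parent
| | +    child = list(candidate)
| | +    child[random.randrange(len(child))] += random.gauss(0, 0.5)
| | +    return child
| | +
| | +# Program Database stand-in: a list of (candidate, score) pairs
| | +database = [([0.0, 0.0, 0.0], evaluate([0.0, 0.0, 0.0]))]
| | +
| | +for _ in range(1000):
| | +    # Prompt Sampler stand-in: pick the most promising parent so far
| | +    parent, _score = max(database, key=lambda entry: entry[1])
| | +    child = mutate(parent)
| | +    database.append((child, evaluate(child)))
| | +
| | +best, score = max(database, key=lambda entry: entry[1])
| | +print(f"best candidate: {[round(c, 2) for c in best]}, score: {score:.4f}")
| | +```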
|
23 | | -- Create a `secrets.yaml` file in the example directory. |
24 | | -- Add your API key and model preferences: |
| 32 | +## Getting Started |
25 | 33 |
|
26 | | -YAML |
| 34 | +### Installation |
27 | 35 |
|
| 36 | +To install natively, use: |
| 37 | +```bash |
| 38 | +git clone https://github.com/codelion/openevolve.git |
| 39 | +cd openevolve |
| 40 | +pip install -e . |
28 | 41 | ``` |
29 | | -# secrets.yaml |
30 | | -api_key: <YOUR_OPENAI_API_KEY> |
31 | | -api_base: "https://api.openai.com/v1" # Or your custom endpoint |
32 | | -primary_model: "gpt-4o" |
33 | | -secondary_model: "o3" # Or another preferred model for specific tasks |
34 | | -``` |
35 | 42 |
|
36 | | -Replace `<YOUR_OPENAI_API_KEY>` with your actual OpenAI API key. |
| 43 | +### Quick Start |
| 44 | + |
| 45 | +```python |
| 46 | +from openevolve import OpenEvolve |
| 47 | + |
| 48 | +# Initialize the system |
| 49 | +evolve = OpenEvolve( |
| 50 | + initial_program_path="path/to/initial_program.py", |
| 51 | + evaluation_file="path/to/evaluator.py", |
| 52 | + config_path="path/to/config.yaml" |
| 53 | +) |
| 54 | + |
| 55 | +# Run the evolution |
| 56 | +best_program = await evolve.run(iterations=1000)  # run() is a coroutine; see the note below
| 57 | +print("Best program metrics:")
| 58 | +for name, value in best_program.metrics.items(): |
| 59 | + print(f" {name}: {value:.4f}") |
| 60 | +``` |
37 | 61 |
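| | +Note that `run` is a coroutine, so outside an already-async context (such as a Jupyter notebook) the call is typically wrapped with `asyncio.run`. A minimal sketch, reusing the `evolve` object from the snippet above:
| | +
| | +```python
| | +import asyncio
| | +
| | +async def main():
| | +    best_program = await evolve.run(iterations=1000)
| | +    print("Best program metrics:")
| | +    for name, value in best_program.metrics.items():
| | +        print(f"  {name}: {value:.4f}")
| | +
| | +asyncio.run(main())
| | +```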
|
38 | | -### 2. Load Benchmark Tasks & Generate Initial Programs |
| 62 | +### Command-Line Usage |
39 | 63 |
|
40 | | -The `data_api.py` script is crucial for setting up the environment. It prepares tasks from the LLM-SRBench dataset (defined by the classes in `./bench`); the generated task files are placed under `./problems`.
| 64 | +OpenEvolve can also be run from the command line: |
41 | 65 |
|
42 | | -For each benchmark task, this script will automatically generate: |
| 66 | +```bash |
| 67 | +python openevolve-run.py path/to/initial_program.py path/to/evaluator.py --config path/to/config.yaml --iterations 1000 |
| 68 | +``` |
43 | 69 |
|
44 | | -- `initial_program.py`: A starting Python program, typically a simple linear model. |
45 | | -- `evaluator.py`: A tailored evaluation script for the task. |
46 | | -- `config.yaml`: An OpenEvolve configuration file specific to the task. |
| 70 | +### Resuming from Checkpoints |
47 | 71 |
|
48 | | -Run the script from your terminal: |
| 72 | +OpenEvolve automatically saves checkpoints at intervals specified by the `checkpoint_interval` config parameter (default is 10 iterations). You can resume an evolution run from a saved checkpoint: |
49 | 73 |
|
50 | 74 | ```bash |
51 | | -python data_api.py |
| 75 | +python openevolve-run.py path/to/initial_program.py path/to/evaluator.py \ |
| 76 | + --config path/to/config.yaml \ |
| 77 | + --checkpoint path/to/checkpoint_directory \ |
| 78 | + --iterations 50 |
52 | 79 | ``` |
53 | 80 |
|
54 | | -This will create subdirectories for each benchmark task, populated with the necessary files. |
55 | | - |
56 | | -### 3. Run OpenEvolve |
| 81 | +When resuming from a checkpoint: |
| 82 | +- The system loads all previously evolved programs and their metrics |
| 83 | +- Checkpoint numbering continues from where it left off (e.g., if loaded from checkpoint_50, the next checkpoint will be checkpoint_60) |
| 84 | +- All evolution state is preserved (best programs, feature maps, archives, etc.) |
| 85 | +- Each checkpoint directory contains a copy of the best program at that point in time |
57 | 86 |
|
58 | | -Use the provided shell script `scripts.sh` to execute OpenEvolve across the generated benchmark tasks. This script iterates through the task-specific configurations and applies the evolutionary process. |
| 87 | +Example workflow with checkpoints: |
59 | 88 |
|
60 | 89 | ```bash |
61 | | -bash scripts.sh |
| 90 | +# Run for 50 iterations (creates checkpoints at iterations 10, 20, 30, 40, 50) |
| 91 | +python openevolve-run.py examples/function_minimization/initial_program.py \ |
| 92 | + examples/function_minimization/evaluator.py \ |
| 93 | + --iterations 50 |
| 94 | + |
| 95 | +# Resume from checkpoint 50 for another 50 iterations (creates checkpoints at 60, 70, 80, 90, 100) |
| 96 | +python openevolve-run.py examples/function_minimization/initial_program.py \ |
| 97 | + examples/function_minimization/evaluator.py \ |
| 98 | + --checkpoint examples/function_minimization/openevolve_output/checkpoints/checkpoint_50 \ |
| 99 | + --iterations 50 |
62 | 100 | ``` |
63 | 101 |
|
64 | | -### 4. Evaluate Results |
| 102 | +### Comparing Results Across Checkpoints |
65 | 103 |
|
66 | | -After OpenEvolve has completed its runs, you can evaluate the performance on different subsets of tasks (e.g., bio, chemical, physics, material). The `eval.py` script collates the results and provides a summary. |
| 104 | +Each checkpoint directory contains the best program found up to that point, making it easy to compare solutions over time: |
67 | 105 |
|
68 | | -```bash |
69 | | -python eval.py <subset_path> |
| 106 | +``` |
| 107 | +checkpoints/ |
| 108 | + checkpoint_10/ |
| 109 | + best_program.py # Best program at iteration 10 |
| 110 | + best_program_info.json # Metrics and details |
| 111 | + programs/ # All programs evaluated so far |
| 112 | + metadata.json # Database state |
| 113 | + checkpoint_20/ |
| 114 | + best_program.py # Best program at iteration 20 |
| 115 | + ... |
70 | 116 | ``` |
71 | 117 |
|
72 | | -For example, to evaluate results for the 'physics' subset located in `./problems/phys_osc/`, you would run: |
| 118 | +You can compare the evolution of solutions by examining the best programs at different checkpoints: |
73 | 119 |
|
74 | 120 | ```bash |
75 | | -python eval.py ./problems/phys_osc |
| 121 | +# Compare best programs at different checkpoints |
| 122 | +diff -u checkpoints/checkpoint_10/best_program.py checkpoints/checkpoint_20/best_program.py |
| 123 | + |
| 124 | +# Compare metrics |
| 125 | +cat checkpoints/checkpoint_*/best_program_info.json | grep -A 10 metrics |
76 | 126 | ``` |
| 127 | +### Docker |
77 | 128 |
|
78 | | -This script will also save a `JSON` file containing detailed results for your analysis. |
| 129 | +You can also install and execute via Docker: |
| 130 | +```bash |
| 131 | +docker build -t openevolve . |
| 132 | +docker run --rm -v .:/app openevolve examples/function_minimization/initial_program.py examples/function_minimization/evaluator.py --config examples/function_minimization/config.yaml --iterations 1000 |
| 133 | +``` |
79 | 134 |
|
80 | | ------- |
| 135 | +## Configuration |
81 | 136 |
|
82 | | -## 🌱 Algorithm Evolution: From Linear Model to Complex Expression |
| 137 | +OpenEvolve is highly configurable. You can specify configuration options in a YAML file: |
83 | 138 |
|
84 | | -OpenEvolve works by iteratively modifying an initial Python program to find a better-fitting mathematical expression. |
| 139 | +```yaml |
| 140 | +# Example configuration |
| 141 | +max_iterations: 1000 |
| 142 | +llm: |
| 143 | + primary_model: "gemini-2.0-flash-lite" |
| 144 | + secondary_model: "gemini-2.0-flash" |
| 145 | + temperature: 0.7 |
| 146 | +database: |
| 147 | + population_size: 500 |
| 148 | + num_islands: 5 |
| 149 | +``` |
85 | 150 |
|
86 | | -### Initial Algorithm (Example: Linear Model) |
| 151 | +Sample configuration files are available in the `configs/` directory: |
| 152 | +- `default_config.yaml`: Comprehensive configuration with all available options |
87 | 153 |
|
88 | | -The `data_api.py` script typically generates a basic linear model as the starting point. For a given task, this `initial_program.py` might look like this: |
| 154 | +See the [Configuration Guide](configs/default_config.yaml) for a full list of options. |
89 | 155 |
|
90 | | -```python |
91 | | -""" |
92 | | -Initial program: A naive linear model for symbolic regression. |
93 | | -This model predicts the output as a linear combination of input variables |
94 | | -or a constant if no input variables are present. |
95 | | -The function is designed for vectorized input (X matrix). |
96 | | -
|
97 | | -Target output variable: dv_dt (Acceleration in Non-linear Harmonic Oscillator)
98 | | -Input variables (columns of x): x (Position at time t), t (Time), v (Velocity at time t) |
99 | | -""" |
100 | | -import numpy as np |
101 | | - |
102 | | -# Input variable mapping for x (columns of the input matrix): |
103 | | -# x[:, 0]: x (Position at time t) |
104 | | -# x[:, 1]: t (Time) |
105 | | -# x[:, 2]: v (Velocity at time t) |
106 | | - |
107 | | -# Parameters will be optimized by BFGS outside this function. |
108 | | -# Number of parameters expected by this model: 10. |
109 | | -# Example initialization: params = np.random.rand(10) |
110 | | - |
111 | | -# EVOLVE-BLOCK-START |
112 | | - |
113 | | -def func(x, params): |
114 | | - """ |
115 | | - Calculates the model output using a linear combination of input variables |
116 | | - or a constant value if no input variables. Operates on a matrix of samples. |
117 | | -
|
118 | | - Args: |
119 | | - x (np.ndarray): A 2D numpy array of input variable values, shape (n_samples, n_features). |
120 | | - n_features is 3. |
121 | | - If n_features is 0, x should be shape (n_samples, 0). |
122 | | - The order of columns in x must correspond to: |
123 | | - (x, t, v). |
124 | | - params (np.ndarray): A 1D numpy array of parameters. |
125 | | - Expected length: 10. |
126 | | -
|
127 | | - Returns: |
128 | | - np.ndarray: A 1D numpy array of predicted output values, shape (n_samples,). |
129 | | - """ |
130 | | - |
131 | | - result = x[:, 0] * params[0] + x[:, 1] * params[1] + x[:, 2] * params[2] |
132 | | - return result |
133 | | - |
134 | | -# EVOLVE-BLOCK-END |
135 | | - |
136 | | -# This part remains fixed (not evolved) |
137 | | -# It ensures that OpenEvolve can consistently call the evolving function. |
138 | | -def run_search(): |
139 | | - return func |
140 | | - |
141 | | -# Note: The actual structure of initial_program.py is determined by data_api.py. |
142 | | -``` |
| 156 | +## Examples |
143 | 157 |
|
144 | | -### Evolved Algorithm (Discovered Symbolic Expression) |
| 158 | +See the `examples/` directory for complete examples of using OpenEvolve on various problems: |
145 | 159 |
|
146 | | -OpenEvolve will iteratively modify the Python code within the `# EVOLVE-BLOCK-START` and `# EVOLVE-BLOCK-END` markers in `initial_program.py`. The goal is to transform the simple initial model into a more complex and accurate symbolic expression that minimizes the Mean Squared Error (MSE) on the training data. |
| 160 | +### Circle Packing |
147 | 161 |
|
148 | | -An evolved `func` might, for instance, discover a non-linear expression like: |
| 162 | +Our implementation of the circle packing problem from the AlphaEvolve paper. For the n = 26 case, where 26 circles must be packed into a unit square, we also obtain state-of-the-art results.
149 | 163 |
|
150 | | -```python |
151 | | -# Hypothetical example of what OpenEvolve might find: |
152 | | -def func(x, params): |
153 | | - # Assuming X_train_scaled maps to x and const maps to a parameter in params |
154 | | - predictions = np.sin(x[:, 0]) * x[:, 1]**2 + params[0] |
155 | | - return predictions |
156 | | -``` |
| 164 | +[Explore the Circle Packing Example](examples/circle_packing/) |
157 | 165 |
|
158 | | -*(This is a simplified, hypothetical example to illustrate the transformation.)* |
| 166 | +We have successfully replicated the results from the AlphaEvolve paper; below is the packing found by OpenEvolve after 800 iterations:
159 | 167 |
|
160 | | ------- |
| 168 | + |
161 | 169 |
|
162 | | -## ⚙️ Key Configuration & Approach |
| 170 | +This is exactly the packing reported by AlphaEvolve in their paper (Figure 14):
163 | 171 |
|
164 | | -- LLM Models: |
165 | | - - **Primary Model:** `gpt-4o` (or your configured `primary_model`) is typically used for sophisticated code generation and modification. |
166 | | - - **Secondary Model:** `o3` (or your configured `secondary_model`) can be used for refinements, simpler modifications, or other auxiliary tasks within the evolutionary process. |
167 | | -- Evaluation Strategy: |
168 | | - - Currently, this example employs a direct evaluation strategy (not **cascade evaluation**). |
169 | | -- Objective Function: |
170 | | - - The primary objective is to **minimize the Mean Squared Error (MSE)** between the model's predictions and the true values on the training data. |
| 172 | + |
171 | 173 |
|
172 | | ------- |
| 174 | +### Function Minimization |
173 | 175 |
|
174 | | -## 📊 Results |
| 176 | +An example showing how OpenEvolve can transform a simple random search algorithm into a sophisticated simulated annealing approach. |
175 | 177 |
|
176 | | -The `eval.py` script will help you collect and analyze performance metrics. The LLM-SRBench paper provides a comprehensive comparison of various baselines. For results generated by this specific OpenEvolve example, you should run the evaluation script as described in the "Getting Started" section. |
| 178 | +[Explore the Function Minimization Example](examples/function_minimization/) |
177 | 179 |
|
178 | | -For benchmark-wide comparisons and results from other methods, please refer to the official LLM-SRBench paper. |
| 180 | +## Preparing Your Own Problems |
179 | 181 |
|
180 | | -| **Task Category** | Med. NMSE (Test) | Med. R2 (Test) | **Med. NMSE (OOD Test)** | **Med. R2 (OOD Test)** | |
181 | | -| ----------------------- | ---------------- | -------------- | ------------------------ | ---------------------- | |
182 | | -| Chemistry (36 tasks) | 2.3419e-06 | 1.000 | 3.1384e-02 | 0.9686 | |
183 | | -| Biology (24 tasks) | | | | | |
184 | | -| Physics (44 tasks) | 1.8548e-05 | 1.000 | 7.9255e-04 | 0.9992 | |
185 | | -| Material Sc. (25 tasks) | | | | | |
| 182 | +To use OpenEvolve for your own problems: |
186 | 183 |
|
187 | | ------- |
| 184 | +1. **Mark code sections** to evolve with `# EVOLVE-BLOCK-START` and `# EVOLVE-BLOCK-END` comments |
| 185 | +2. **Create an evaluation function** that returns a dictionary of metrics (steps 1 and 2 are sketched after this list)
| 186 | +3. **Configure OpenEvolve** with appropriate parameters |
| 187 | +4. **Run the evolution** process |
188 | 188 |
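| | +A minimal sketch of steps 1 and 2. The `evaluate(program_path)` entry point and the metric names below are assumptions modeled on the bundled examples; check `examples/` for the exact contract your setup expects.
| | +
| | +```python
| | +# initial_program.py: only the marked region is rewritten during evolution
| | +# EVOLVE-BLOCK-START
| | +def solve(x):
| | +    # naive starting point; evolution replaces this logic
| | +    return x
| | +# EVOLVE-BLOCK-END
| | +
| | +# Code outside the block stays fixed, giving the evaluator a stable entry point
| | +def run(x):
| | +    return solve(x)
| | +```
| | +
| | +```python
| | +# evaluator.py: scores one candidate program (sketch; metric names illustrative)
| | +def evaluate(program_path):
| | +    # load and exercise the candidate at program_path, then score it;
| | +    # the scoring logic below is a placeholder
| | +    score = 0.0
| | +    return {"score": score, "correctness": 1.0}
| | +```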
|
189 | | -## 🤝 Contribution |
| 189 | +## Citation |
190 | 190 |
|
191 | | -This OpenEvolve example for LLM-SRBench was implemented by [**Haowei Lin**](https://linhaowei1.github.io/) from Peking University. If you encounter any issues or have questions, please feel free to reach out to Haowei via email ([email protected]) for discussion. |
| 191 | +If you use OpenEvolve in your research, please cite: |
192 | 192 |
|
| 193 | +``` |
| 194 | +@software{openevolve, |
| 195 | + title = {OpenEvolve: Open-source implementation of AlphaEvolve}, |
| 196 | + author = {Asankhaya Sharma}, |
| 197 | + year = {2025}, |
| 198 | + publisher = {GitHub}, |
| 199 | + url = {https://github.com/codelion/openevolve} |
| 200 | +} |
| 201 | +``` |