`examples/symbolic_regression/README.md`

This example leverages **LLM-SRBench**, a benchmark specifically designed for Large Language Model-based scientific equation discovery.

Follow these steps to set up and run the symbolic regression benchmark example:
### 1. Configure API Keys

The API key is read from the `OPENAI_API_KEY` environment variable by default. The primary and secondary models we used when testing on LLM-SRBench are `gpt-4o` and `o3`, respectively. You can check `create_config()` in `data_api.py` for the exact model configuration.
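
If you prefer to set the key from a script rather than your shell, a minimal sketch is shown below (the placeholder value is hypothetical; exporting the variable in your shell works just as well):

```python
import os

# OpenEvolve reads the key from the OPENAI_API_KEY environment variable by default.
# Replace the placeholder with your actual key, or export the variable in your shell instead.
os.environ.setdefault("OPENAI_API_KEY", "<YOUR_OPENAI_API_KEY>")
```
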
**OpenEvolve** iteratively modifies Python code segments, delineated by `# EVOLVE-BLOCK-START` and `# EVOLVE-BLOCK-END` markers within the `initial_program.py` file. The primary objective is to evolve a simple initial model into a more complex and accurate symbolic expression that minimizes the Mean Squared Error (MSE) against the training data.
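
As a rough illustration of the layout (a sketch, not the verbatim contents of `initial_program.py`), an evolvable starting point could look like the following, where `func(x, params)` matches the signature shown later in this README:

```python
# EVOLVE-BLOCK-START
def func(x, params):
    # Simple initial model: a linear function of the first input feature.
    # `x` is assumed to be a 2-D NumPy array and `params` a 1-D parameter vector.
    return params[0] * x[:, 0] + params[1]
# EVOLVE-BLOCK-END
```

Code outside the markers is left untouched during evolution; only the block between them is rewritten.
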
Below is a symbolic expression discovered by OpenEvolve for the physics task `PO10`:
```python
import numpy as np

def func(x, params):
    # Evolved expression body (its individual terms are discussed below).
    ...
```
Notably, the core functional forms present in the task's ground-truth equation are captured by the evolved symbolic expression:

- The $\sin(t)$ component can be represented by `params[4] * np.sin(params[5] * t_val)`.
- The linear $x(t)$ term corresponds to `params[0] * pos`.
- The cubic $x(t)^3$ term is `params[1] * pos**3`.
- The interaction term $t \cdot x(t)$ is captured by `params[8] * pos * t_val`.

The evolved code also includes terms like `params[2] * np.cos(params[3] * t_val)` (a cosine forcing term) and `params[9]` (a constant bias). These might evolve to have negligible parameter values if not supported by the data, or they could capture secondary effects or noise. The inclusion of the primary terms demonstrates OpenEvolve's strength in identifying the correct underlying structure of the equation.
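
Putting those pieces together, the sketch below shows how the terms listed above could combine into a single parameterized expression. It is illustrative only: the mapping of `pos` and `t_val` to input columns is an assumption, and the actual evolved program (including how `params[6]` and `params[7]` are used) may differ.

```python
import numpy as np

def func(x, params):
    # Assumed column layout: x[:, 0] is the position x(t), x[:, 1] is the time t.
    pos, t_val = x[:, 0], x[:, 1]
    return (
        params[0] * pos                          # linear x(t) term
        + params[1] * pos**3                     # cubic x(t)^3 term
        + params[2] * np.cos(params[3] * t_val)  # cosine forcing term
        + params[4] * np.sin(params[5] * t_val)  # sin(t) component
        + params[8] * pos * t_val                # t * x(t) interaction term
        + params[9]                              # constant bias
    )
```
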
*Note: Symbolic regression, despite such promising results, remains a very challenging task. This difficulty largely stems from the inherent complexities of inferring precise mathematical models from finite and potentially noisy training data, which provides only a partial observation of the true underlying system.*

------
The `eval.py` script will help you collect and analyze performance metrics.
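
As an illustration of the kind of metric such an evaluation produces (a sketch under assumed array shapes, not the actual contents of `eval.py`), the MSE and a variance-normalized MSE for a candidate expression on held-out data can be computed as follows:

```python
import numpy as np

def score_candidate(func, params, X_test, y_test):
    # Illustrative scoring helper; the real eval.py interface may differ.
    y_pred = func(X_test, params)
    mse = float(np.mean((y_test - y_pred) ** 2))
    nmse = mse / float(np.var(y_test))  # MSE normalized by target variance
    return {"mse": mse, "nmse": nmse}
```
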
For benchmark-wide comparisons and results from other methods, please refer to the official LLM-SRBench paper.