3 changes: 3 additions & 0 deletions .gitignore
@@ -53,3 +53,6 @@ htmlcov/
# For SR
secrets.yaml
problems

papers/data
papers/.env
217 changes: 199 additions & 18 deletions CLAUDE.md
@@ -19,16 +19,30 @@ make install

### Running Tests
```bash
# Run all tests
# Run unit tests only (fast, no LLM required)
python -m unittest discover tests

# Or use Makefile
make test

# Run integration tests (requires optillm)
make test-integration

# Run all tests
make test-all

# Run single test file
python -m unittest tests.test_database

# Run single test case
python -m unittest tests.test_database.TestProgramDatabase.test_add_and_get
```

**Note**: Unit tests require the `OPENAI_API_KEY` environment variable to be set (any placeholder value such as `test-key` works). Integration tests require a running optillm server.

### Code Formatting
```bash
# Format with Black
# Format with Black (line length: 100)
python -m black openevolve examples tests scripts

# Or use Makefile
@@ -45,10 +59,16 @@ python openevolve-run.py path/to/initial_program.py path/to/evaluator.py \
--config path/to/config.yaml \
--checkpoint path/to/checkpoint_directory \
--iterations 50

# Using the CLI entry point (installed via pip)
openevolve-run path/to/initial_program.py path/to/evaluator.py --config path/to/config.yaml --iterations 1000
```

### Visualization
```bash
# Install visualization dependencies first
pip install -r scripts/requirements.txt

# View evolution tree
python scripts/visualizer.py --path examples/function_minimization/openevolve_output/checkpoints/checkpoint_100/
```
@@ -57,23 +77,43 @@ python scripts/visualizer.py --path examples/function_minimization/openevolve_ou

### Core Components

1. **Controller (`openevolve/controller.py`)**: Main orchestrator that manages the evolution process using ProcessPoolExecutor for parallel iteration execution.
1. **Controller (`openevolve/controller.py`)**: Main orchestrator that manages the evolution process:
- Coordinates evolution loop and checkpointing
- Manages ProcessPoolExecutor for parallel iteration execution
- Handles graceful shutdown and state persistence

2. **Process Parallel (`openevolve/process_parallel.py`)**: True parallel execution layer:
- Worker pool with process-based isolation
- Each worker loads database snapshot for independent evolution
- Lazy initialization of LLM/evaluator components per worker
- Preserves parent environment variables in child processes

2. **Database (`openevolve/database.py`)**: Implements MAP-Elites algorithm with island-based evolution:
- Programs mapped to multi-dimensional feature grid
3. **Database (`openevolve/database.py`)**: Implements MAP-Elites algorithm with island-based evolution:
- Programs mapped to multi-dimensional feature grid (`Program` dataclass)
- Multiple isolated populations (islands) evolve independently
- Periodic migration between islands prevents convergence
- Tracks absolute best program separately
- Periodic migration between islands prevents convergence (lazy migration based on generation counts)
- Tracks absolute best program separately (`best_program_id`)
- Per-island best tracking (`island_best_programs`)
   - Feature binning can be uniform (int) or per-dimension (dict); see the binning sketch after this list

3. **Evaluator (`openevolve/evaluator.py`)**: Cascade evaluation pattern:
- Stage 1: Quick validation
- Stage 2: Basic performance testing
4. **Evaluator (`openevolve/evaluator.py`)**: Cascade evaluation pattern:
- Stage 1: Quick validation (syntax/imports)
- Stage 2: Basic performance testing
- Stage 3: Comprehensive evaluation
- Programs must pass thresholds at each stage
- Supports timeout protection and artifact collection

4. **LLM Integration (`openevolve/llm/`)**: Ensemble approach with multiple models, configurable weights, and async generation with retry logic.
5. **LLM Integration (`openevolve/llm/`)**: Ensemble approach with multiple models:
- Weighted model selection from configured models
- Async generation with retry logic and fallback
- Configurable API base for any OpenAI-compatible endpoint
- Separate evaluator models for LLM-based code quality assessment

5. **Iteration (`openevolve/iteration.py`)**: Worker process that samples from islands, generates mutations via LLM, evaluates programs, and stores artifacts.
6. **Iteration (`openevolve/iteration.py`)**: Worker process that:
- Samples programs from islands using various strategies
- Generates mutations via LLM with prompt context
- Evaluates programs through cascade stages
- Stores artifacts (JSON or files based on size threshold)

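To picture the feature binning described in the Database component above, here is a minimal, illustrative sketch; the function name, bin counts, and bounds are hypothetical and do not mirror the actual `database.py` code.

```python
# Illustrative sketch of MAP-Elites feature binning; names and bounds are
# hypothetical, not the actual database.py implementation.
from typing import Dict, Tuple


def to_cell(
    features: Dict[str, float],
    bins: Dict[str, int],
    bounds: Dict[str, Tuple[float, float]],
) -> Tuple[int, ...]:
    """Map each feature value to an integer grid coordinate."""
    coords = []
    for name, value in features.items():
        low, high = bounds[name]
        # Clamp to the configured range, normalize to [0, 1], then bin.
        frac = min(max((value - low) / (high - low), 0.0), 1.0)
        coords.append(min(int(frac * bins[name]), bins[name] - 1))
    return tuple(coords)


# A program with two feature dimensions lands in a single grid cell;
# stronger programs replace the current occupant of their cell.
cell = to_cell(
    features={"complexity": 120.0, "diversity": 0.4},
    bins={"complexity": 10, "diversity": 10},
    bounds={"complexity": (0.0, 1000.0), "diversity": (0.0, 1.0)},
)
print(cell)  # (1, 4)
```
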
### Key Architectural Patterns

@@ -108,11 +148,152 @@ YAML-based configuration with hierarchical structure:
3. **Error Resilience**: Individual failures don't crash the system; extensive retry logic and timeout protection (see the sketch after this list)
4. **Prompt Engineering**: Template-based system with context-aware building and evolution history

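The retry and timeout behavior behind the Error Resilience pattern can be sketched generically as follows; this is an illustrative pattern, not OpenEvolve's actual implementation, and `llm.generate` in the usage comment is a placeholder name.

```python
# Generic retry-with-timeout pattern; illustrative only, not OpenEvolve's code.
import asyncio


async def call_with_retries(coro_factory, retries: int = 3, timeout: float = 30.0):
    last_exc = None
    for attempt in range(retries):
        try:
            return await asyncio.wait_for(coro_factory(), timeout=timeout)
        except Exception as exc:  # a failed attempt should not crash the run
            last_exc = exc
            await asyncio.sleep(2**attempt)  # simple exponential backoff
    raise last_exc


# Usage sketch (llm.generate is a placeholder for an async LLM call):
# result = await call_with_retries(lambda: llm.generate(prompt), timeout=60.0)
```
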
### Library API

OpenEvolve can be used as a Python library (see `openevolve/api.py`):

```python
from openevolve import run_evolution, evolve_function, EvolutionResult

# Using file paths
result = run_evolution(
initial_program='program.py',
evaluator='evaluator.py',
config='config.yaml',
iterations=100
)

# Using inline code
result = run_evolution(
initial_program='''
# EVOLVE-BLOCK-START
def solve(x): return x * 2
# EVOLVE-BLOCK-END
''',
evaluator=lambda path: {"score": benchmark(path)},
iterations=100
)

# Evolve Python functions directly
def bubble_sort(arr): ...
result = evolve_function(
bubble_sort,
test_cases=[([3,1,2], [1,2,3])],
iterations=50
)
```

### Development Notes

- Python >=3.10 required
- Uses OpenAI-compatible APIs for LLM integration
- Tests use unittest framework
- Black for code formatting
- Artifacts threshold: Small (<10KB) stored in DB, large saved to disk
- Process workers load database snapshots for true parallelism
- Python >=3.10 required (uses dataclasses, type hints)
- Uses OpenAI-compatible APIs for LLM integration (configurable via `api_base`)
- Tests use unittest framework (pytest for integration tests)
- Black for code formatting (line length: 100)
- Artifacts threshold: Small (<10KB) stored in DB as JSON, large saved to disk
- Process workers load database snapshots for true parallelism (no shared state)
- Config uses YAML with hierarchical dataclass structure (`Config`, `DatabaseConfig`, `LLMConfig`, etc.); a simplified sketch follows below
- All components are seeded for reproducibility (default seed=42)

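As a rough picture of the hierarchical dataclass configuration noted above, the sketch below shows a simplified, hypothetical subset of fields; see `openevolve/config.py` for the real definitions and defaults.

```python
# Simplified, hypothetical sketch of the hierarchical config dataclasses;
# field names and defaults are illustrative, see openevolve/config.py.
from dataclasses import dataclass, field
from typing import List


@dataclass
class LLMModelConfig:
    name: str = "gpt-4"
    weight: float = 1.0


@dataclass
class LLMConfig:
    api_base: str = "https://api.openai.com/v1"
    models: List[LLMModelConfig] = field(default_factory=list)
    temperature: float = 0.7


@dataclass
class DatabaseConfig:
    num_islands: int = 5
    migration_interval: int = 50
    migration_rate: float = 0.1


@dataclass
class Config:
    max_iterations: int = 100
    random_seed: int = 42
    llm: LLMConfig = field(default_factory=LLMConfig)
    database: DatabaseConfig = field(default_factory=DatabaseConfig)
```
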
## Working with Examples

Each example in `examples/` follows a standard structure:

```
examples/example_name/
├── README.md # Explains the example
├── config.yaml # Evolution configuration
├── initial_program.py # Starting program to evolve
├── evaluator.py # Evaluation logic
└── requirements.txt # (optional) Additional dependencies
```

### Creating a New Example

1. Copy an existing example as template (e.g., `function_minimization/`)
2. Modify `initial_program.py` with the code to evolve, marking the evolvable region with `# EVOLVE-BLOCK-START/END` (see the sketch after this list)
3. Write an evaluator that returns a metrics dict: `{"score": float, ...}`
4. Configure `config.yaml` with appropriate system message and parameters
5. Test with small iteration count first

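A minimal sketch of an `initial_program.py` for step 2; `solve` is a placeholder function, and only the code between the markers is intended to be rewritten by evolution.

```python
# Hypothetical minimal initial_program.py; only the marked block evolves.
# EVOLVE-BLOCK-START
def solve(x):
    # Naive starting point that evolution is expected to improve.
    return x * 2
# EVOLVE-BLOCK-END


if __name__ == "__main__":
    # Harness code outside the block stays fixed across mutations.
    print(solve(21))
```
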
### Evaluator Requirements

Evaluators must accept a program path and return metrics:

```python
from typing import Dict, Union

from openevolve.evaluation_result import EvaluationResult


def evaluate(program_path: str) -> Union[Dict[str, float], EvaluationResult]:
    # Option 1: Return a dict of metrics
    return {"score": 0.85, "accuracy": 0.9}

    # Option 2: Return an EvaluationResult with artifacts (use instead of Option 1)
    return EvaluationResult(
        metrics={"score": 0.85},
        artifacts={"stderr": "...", "llm_feedback": "..."},
    )
```

## Configuration Deep Dive

### Key Config Parameters

- `max_iterations`: Total evolution iterations
- `random_seed`: For reproducibility (set to the same value for identical runs)
- `output_dir`: Where to save checkpoints and results

### LLM Configuration

```yaml
llm:
api_base: "https://api.openai.com/v1" # Or Gemini, Ollama, etc.
models:
- name: "gpt-4"
weight: 0.6
- name: "gpt-3.5-turbo"
weight: 0.4
temperature: 0.7
max_tokens: 4096
```

### Island Evolution Settings

```yaml
database:
num_islands: 5 # More = more diversity, slower convergence
migration_interval: 50 # Generations between migrations (not iterations!)
migration_rate: 0.1 # Fraction of top programs to migrate
feature_dimensions: # Quality-diversity dimensions
- "complexity" # Built-in: code length
- "diversity" # Built-in: edit distance
- "custom_metric" # Custom: returned from evaluator
```

**Important**: `migration_interval` is in generations, not iterations. Each island tracks its own generation count independently.

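To make the distinction concrete, here is a conceptual sketch (not the actual `database.py` logic) of a per-island generation counter driving the migration check:

```python
# Conceptual sketch only: each island tracks its own generation count,
# and migration triggers on that count, not on the global iteration number.
class Island:
    def __init__(self, migration_interval: int = 50):
        self.generation = 0
        self.migration_interval = migration_interval
        self.last_migration_generation = 0

    def advance_generation(self) -> None:
        self.generation += 1

    def should_migrate(self) -> bool:
        return (self.generation - self.last_migration_generation) >= self.migration_interval
```
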
### Prompt Configuration

```yaml
prompt:
system_message: |
You are an expert programmer...
num_top_programs: 3 # Best performers for inspiration
num_diverse_programs: 2 # Diverse programs for exploration
use_template_stochasticity: true # Randomize prompt templates
```

## Troubleshooting

### Common Issues

1. **OPENAI_API_KEY not set**: Even for unit tests, set it to a placeholder value
2. **Evolution gets stuck**: Increase `num_diverse_programs` or add more islands
3. **Worker errors**: Check that the evaluator doesn't use unpicklable objects (lambdas, local classes)
4. **Memory issues**: Reduce `num_parallel_workers` or `archive_size`
5. **Slow evolution**: Enable `cascade_evaluation` to filter bad programs early

### Debugging Tips

- Enable artifacts to see program errors: `evaluator.enable_artifacts: true`
- Check checkpoint files in `output_dir/checkpoints/` for saved state
- Use `--checkpoint` flag to resume from last successful checkpoint
- Lower `num_parallel_workers` to 1 for easier debugging
- Check `openevolve_output/evolution.log` for detailed execution logs