Merge pull request #129 from codelion/update-readme

codelion · web-flow · commit ce4570e48ea1 · 2025-07-08T17:45:49.000+08:00
Revamp README with expanded features and examples
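
Note for reviewers: the weighted `llm.models` list and `random_seed: 42` that this change adds to the README's example configuration can be illustrated with a minimal Python sketch. It is not OpenEvolve's actual implementation. The names `derive_seed`, `make_sampler`, and `sample_sequence` are hypothetical, and hashing the component name into the base seed is only one assumed reading of "hash-based component isolation".

```python
import hashlib
import random
from collections import Counter

# Mirrors the `llm.models` entries from the example configuration in this diff.
MODELS = [
    {"name": "gemini-2.0-flash-lite", "weight": 0.7},
    {"name": "moa&readurls-gemini-2.0-flash", "weight": 0.3},
]


def derive_seed(base_seed: int, component: str) -> int:
    """Derive a stable per-component seed (assumed scheme, not OpenEvolve's)."""
    digest = hashlib.sha256(f"{base_seed}:{component}".encode("utf-8")).hexdigest()
    return int(digest[:8], 16)  # keep the first 32 bits of the hash


def make_sampler(models, base_seed: int = 42):
    """Return a function that picks a model name in proportion to its weight."""
    # A private Random instance keeps the ensemble independent of global RNG state.
    rng = random.Random(derive_seed(base_seed, "llm"))
    names = [m["name"] for m in models]
    weights = [m["weight"] for m in models]

    def sample() -> str:
        return rng.choices(names, weights=weights, k=1)[0]

    return sample


def sample_sequence(models, base_seed: int, n: int):
    """Draw n model names from a freshly seeded sampler."""
    sample = make_sampler(models, base_seed)
    return [sample() for _ in range(n)]


if __name__ == "__main__":
    picks = sample_sequence(MODELS, 42, 1000)
    print(Counter(picks))  # roughly a 70/30 split between the two models
    print(picks == sample_sequence(MODELS, 42, 1000))  # same seed, same sequence
```

Giving each component its own derived seed means that changing how often one component draws random numbers does not shift the sequence seen by another, which is the property the "Granular Control" bullet appears to describe.
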
diff --git a/README.md b/README.md
@@ -1,33 +1,94 @@
 # OpenEvolve
 
-An open-source implementation of the AlphaEvolve system described in the Google DeepMind paper "AlphaEvolve: A coding agent for scientific and algorithmic discovery" (2025).
+An open-source evolutionary coding agent that began as a faithful implementation of AlphaEvolve and has evolved far beyond it, enabling automated scientific and algorithmic discovery.
 
 ![OpenEvolve Logo](openevolve-logo.png)
 
 ## Overview
 
-OpenEvolve is an evolutionary coding agent that uses Large Language Models to optimize code through an iterative process. It orchestrates a pipeline of LLM-based code generation, evaluation, and selection to continuously improve programs for a variety of tasks.
-
-Key features:
-- Evolution of entire code files, not just single functions
-- Support for multiple programming languages
-- Supports OpenAI-compatible APIs for any LLM
-- Multi-objective optimization
-- Flexible prompt engineering
-- Distributed evaluation
+OpenEvolve is an evolutionary coding agent that uses Large Language Models to automatically optimize and discover algorithms through iterative improvement. Starting from the AlphaEvolve research, it incorporates advanced features for reproducibility, multi-language support, sophisticated evaluation pipelines, and integration with cutting-edge LLM optimization techniques. It serves as both a research platform for evolutionary AI and a practical tool for automated code optimization.
+
+### Key Features
+
+OpenEvolve implements a comprehensive evolutionary coding system with:
+
+- **Evolutionary Coding Agent**: LLM-guided evolution of entire code files (not just functions)
+- **Distributed Controller Loop**: Asynchronous pipeline coordinating LLMs, evaluators, and databases
+- **Program Database**: Storage and sampling of evolved programs with evaluation metrics
+- **Prompt Sampling**: Context-rich prompts with past programs, scores, and problem descriptions
+- **LLM Ensemble**: Multiple language models working together for code generation
+- **Multi-objective Optimization**: Simultaneous optimization of multiple evaluation metrics
+- **Checkpoint System**: Automatic saving and resuming of evolution state
+
+#### 🔬 **Scientific Reproducibility**
+- **Comprehensive Seeding**: Full deterministic reproduction with hash-based component isolation
+- **Default Reproducibility**: Seed=42 by default for immediate reproducible results
+- **Granular Control**: Per-component seeding for LLMs, database, and evaluation pipeline
+
+#### 🤖 **Advanced LLM Integration**
+- **Ensemble Sophistication**: Weighted model combinations with intelligent fallback strategies
+- **Test-Time Compute**: Integration with [optillm](https://github.com/codelion/optillm) for Mixture of Agents (MoA) and enhanced reasoning
+- **Universal API Support**: Works with any OpenAI-compatible endpoint (Anthropic, Google, local models)
+- **Plugin Ecosystem**: Support for optillm plugins (readurls, executecode, z3_solver, etc.)
+
+#### 🧬 **Evolution Algorithm Innovations**
+- **MAP-Elites Implementation**: Quality-diversity algorithm for balanced exploration/exploitation
+- **Island-Based Evolution**: Multiple populations with periodic migration for diversity maintenance
+- **Inspiration vs Performance**: Sophisticated prompt engineering separating top performers from diverse inspirations
+- **Multi-Strategy Selection**: Elite, diverse, and exploratory program sampling strategies
+
+#### 📊 **Evaluation & Feedback Systems**
+- **Artifacts Side-Channel**: Capture build errors, profiling data, and execution feedback for LLM improvement
+- **Cascade Evaluation**: Multi-stage testing with progressive complexity for efficient resource usage
+- **LLM-Based Feedback**: Automated code quality assessment and reasoning capture
+- **Comprehensive Error Handling**: Graceful recovery from evaluation failures with detailed diagnostics
+
+#### 🌐 **Multi-Language & Platform Support**
+- **Language Agnostic**: Python, Rust, R, Metal shaders, and more
+- **Platform Optimization**: Apple Silicon GPU kernels, CUDA optimization, CPU-specific tuning
+- **Framework Integration**: MLX, PyTorch, scientific computing libraries
+
+#### 🔧 **Developer Experience & Tooling**
+- **Real-Time Visualization**: Interactive web-based evolution tree viewer with performance analytics
+- **Advanced CLI**: Rich command-line interface with checkpoint management and configuration override
+- **Comprehensive Examples**: 12+ diverse examples spanning optimization, ML, systems programming, and scientific computing
+- **Error Recovery**: Robust checkpoint loading with automatic fixes for common serialization issues
+
+#### 🚀 **Performance & Scalability**
+- **Threaded Parallelism**: High-throughput asynchronous evaluation pipeline
+- **Resource Management**: Memory limits, timeouts, and resource monitoring
+- **Efficient Storage**: Optimized database with artifact management and cleanup policies
 
 ## How It Works
 
-OpenEvolve follows an evolutionary approach with the following components:
+OpenEvolve orchestrates a sophisticated evolutionary pipeline:
 
 ![OpenEvolve Architecture](openevolve-architecture.png)
 
-1. **Prompt Sampler**: Creates context-rich prompts containing past programs, their scores, and problem descriptions
-2. **LLM Ensemble**: Generates code modifications via an ensemble of language models
-3. **Evaluator Pool**: Tests generated programs and assigns scores
-4. **Program Database**: Stores programs and their evaluation metrics, guiding future evolution
+### Core Evolution Loop
+
+1. **Enhanced Prompt Sampler**: Creates rich prompts containing:
+   - Top-performing programs (for optimization guidance)
+   - Diverse inspiration programs (for creative exploration)
+   - Execution artifacts and error feedback
+   - Dynamic documentation fetching (via optillm plugins)
+
+2. **Intelligent LLM Ensemble**:
+   - Weighted model combinations for quality/speed tradeoffs
+   - Test-time compute techniques (MoA, chain-of-thought, reflection)
+   - Deterministic selection with comprehensive seeding
+
+3. **Advanced Evaluator Pool**:
+   - Multi-stage cascade evaluation
+   - Artifact collection for detailed feedback
+   - LLM-based code quality assessment
+   - Parallel execution with resource limits
 
-The controller orchestrates interactions between these components in an asynchronous pipeline, maximizing throughput to evaluate as many candidate solutions as possible.
+4. **Sophisticated Program Database**:
+   - MAP-Elites algorithm for quality-diversity balance
+   - Island-based populations with migration
+   - Feature map clustering and archive management
+   - Comprehensive metadata and lineage tracking
 
 ## Getting Started
 
@@ -192,22 +253,45 @@ docker run --rm -v $(pwd):/app --network="host" openevolve examples/function_min
 
 ## Configuration
 
-OpenEvolve is highly configurable. You can specify configuration options in a YAML file:
+OpenEvolve is highly configurable. Options, including the advanced ones shown here, are set in a YAML file:
 
 ```yaml
-# Example configuration
+# Example configuration showcasing advanced features
 max_iterations: 1000
+random_seed: 42  # Full reproducibility by default
+
 llm:
-  primary_model: "gemini-2.0-flash-lite"
-  secondary_model: "gemini-2.0-flash"
+  # Advanced ensemble configuration
+  models:
+    - name: "gemini-2.0-flash-lite"
+      weight: 0.7
+    - name: "moa&readurls-gemini-2.0-flash"  # optillm test-time compute
+      weight: 0.3
   temperature: 0.7
+
 database:
+  # MAP-Elites configuration
   population_size: 500
-  num_islands: 5
+  num_islands: 5  # Island-based evolution
+  migration_interval: 20
+  feature_dimensions: ["score", "complexity"]  # Quality-diversity features
+
+evaluator:
+  # Advanced evaluation features
+  enable_artifacts: true  # Capture execution feedback
+  cascade_evaluation: true  # Multi-stage testing
+  use_llm_feedback: true  # AI-based code quality assessment
+
+prompt:
+  # Sophisticated prompt engineering
+  num_top_programs: 3      # Performance examples
+  num_diverse_programs: 2  # Creative inspiration
+  include_artifacts: true  # Execution feedback
 ```
 
 Sample configuration files are available in the `configs/` directory:
 - `default_config.yaml`: Comprehensive configuration with all available options
+- `island_config_example.yaml`: Advanced island-based evolution setup
 
 See the [Configuration Guide](configs/default_config.yaml) for a full list of options.
 
@@ -287,37 +371,60 @@ export ENABLE_ARTIFACTS=false
 
 See the `examples/` directory for complete examples of using OpenEvolve on various problems:
 
-### Symbolic Regression
+### Mathematical Optimization
+
+#### [Function Minimization](examples/function_minimization/)
+A comprehensive example demonstrating evolution from random search to sophisticated simulated annealing.
 
-A comprehensive example demonstrating OpenEvolve's application to symbolic regression tasks using the LLM-SRBench benchmark. This example shows how OpenEvolve can evolve simple mathematical expressions (like linear models) into complex symbolic formulas that accurately fit scientific datasets.
+#### [Circle Packing](examples/circle_packing/)
+Our implementation of the circle packing problem from the AlphaEvolve paper. For the n=26 case (packing 26 circles in a unit square), we obtain results matching those reported in the paper.
 
-[Explore the Symbolic Regression Example](examples/symbolic_regression/)
+Below is the best packing found by OpenEvolve after 800 iterations:
 
-Key features:
-- Automatic generation of initial programs from benchmark tasks
-- Evolution from simple linear models to complex mathematical expressions
-- Evaluation on physics, chemistry, biology, and material science datasets
-- Competitive results compared to state-of-the-art symbolic regression methods
+![circle-packing-result](https://github.com/user-attachments/assets/00100f9e-2ac3-445b-9266-0398b7174193)
 
-### Circle Packing
+### Advanced AI & LLM Integration
 
-Our implementation of the circle packing problem from the AlphaEvolve paper. For the n=26 case, where one needs to pack 26 circles in a unit square we also obtain SOTA results.
+#### [Web Scraper with optillm](examples/web_scraper_optillm/)
+Demonstrates integration with [optillm](https://github.com/codelion/optillm) for test-time compute optimization, including:
+- **readurls plugin**: Automatic documentation fetching
+- **Mixture of Agents (MoA)**: Multi-response synthesis for improved accuracy
+- **Local model optimization**: Enhanced reasoning with smaller models
 
-[Explore the Circle Packing Example](examples/circle_packing/)
+#### [LLM Prompt Optimization](examples/llm_prompt_optimazation/)
+Evolving prompts themselves for better LLM performance, demonstrating self-improving AI systems.
 
-We have sucessfully replicated the results from the AlphaEvolve paper, below is the packing found by OpenEvolve after 800 iterations
+### Systems & Performance Optimization
 
-![alpha-evolve-replication](https://github.com/user-attachments/assets/00100f9e-2ac3-445b-9266-0398b7174193)
+#### [MLX Metal Kernel Optimization](examples/mlx_metal_kernel_opt/)
+Automated discovery of custom GPU kernels for Apple Silicon, achieving:
+- **2-3x speedup** over baseline attention implementations
+- **Hardware-aware optimizations** for unified memory architecture
+- **Metal shader evolution** with numerical correctness validation
 
-This is exactly the packing reported by AlphaEvolve in their paper (Figure 14):
+#### [Rust Adaptive Sort](examples/rust_adaptive_sort/)
+Evolution of sorting algorithms that adapt to data patterns, showcasing OpenEvolve's language-agnostic capabilities.
 
-![alpha-evolve-results](https://github.com/user-attachments/assets/0c9affa5-053d-404e-bb2d-11479ab248c9)
+### Scientific Computing & Discovery
 
-### Function Minimization
+#### [Symbolic Regression](examples/symbolic_regression/)
+A comprehensive example demonstrating automated discovery of mathematical expressions from scientific datasets using the LLM-SRBench benchmark.
+
+#### [R Robust Regression](examples/r_robust_regression/)
+Developing robust regression methods resistant to outliers using R language support.
+
+#### [Signal Processing](examples/signal_processing/)
+Automated design of digital filters with superior performance characteristics.
+
+### Web & Integration Examples
+
+#### [Online Judge Programming](examples/online_judge_programming/)
+Automated competitive programming solution generation with external evaluation systems.
+
+#### [LM-Eval Integration](examples/lm_eval/)
+Working with standard ML evaluation harnesses for automated benchmark improvement.
 
-An example showing how OpenEvolve can transform a simple random search algorithm into a sophisticated simulated annealing approach.
 
-[Explore the Function Minimization Example](examples/function_minimization/)
 
 ## Preparing Your Own Problems
 
@@ -334,10 +441,12 @@ If you use OpenEvolve in your research, please cite:
 
 ```
 @software{openevolve,
-  title = {OpenEvolve: Open-source implementation of AlphaEvolve},
+  title = {OpenEvolve: an open-source evolutionary coding agent},
   author = {Asankhaya Sharma},
   year = {2025},
   publisher = {GitHub},
   url = {https://github.com/codelion/openevolve}
 }
 ```
+
+