# Web Scraper Evolution with optillm

This example demonstrates how to use [optillm](https://github.com/codelion/optillm) with OpenEvolve to leverage test-time compute techniques for improved code evolution accuracy. We'll evolve a web scraper that extracts structured data from documentation pages, showcasing two key optillm features:

1. **readurls plugin**: Automatically fetches webpage content when URLs are mentioned in prompts
2. **Inference optimization**: Uses techniques like Mixture of Agents (MoA) to improve response accuracy

## Why optillm?

Traditional LLM usage in code evolution has limitations:
- LLMs may not have knowledge of the latest library documentation
- Single LLM calls can produce inconsistent or incorrect code
- No ability to dynamically fetch relevant documentation during evolution

optillm solves these problems by:
- **Dynamic Documentation Fetching**: The readurls plugin automatically fetches and includes webpage content when URLs are detected in prompts
- **Test-Time Compute**: Techniques like MoA generate multiple responses and synthesize the best solution
- **Flexible Routing**: Can route requests to different models based on requirements
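In practice, optillm acts as an OpenAI-compatible proxy, and techniques are selected by prefixing the model name (as seen in this example's `config.yaml`). The `optillm_model` helper below is purely illustrative, not part of optillm; it just makes the naming convention explicit:

```python
def optillm_model(base_model: str, *techniques: str) -> str:
    """Compose an optillm model name: technique names joined by '&',
    then a '-' separator, then the underlying model id."""
    return "&".join(techniques) + "-" + base_model

primary = optillm_model("Qwen/Qwen3-0.6B-MLX-bf16", "readurls")
secondary = optillm_model("Qwen/Qwen3-0.6B-MLX-bf16", "moa", "readurls")

# primary   == "readurls-Qwen/Qwen3-0.6B-MLX-bf16"
# secondary == "moa&readurls-Qwen/Qwen3-0.6B-MLX-bf16"
# These names are what you pass as `model=` to any OpenAI-compatible
# client pointed at http://localhost:8000/v1.
```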

## Problem Description

We're evolving a web scraper that extracts API documentation from Python library documentation pages. The scraper needs to:
1. Parse HTML documentation pages
2. Extract function signatures, descriptions, and parameters
3. Structure the data in a consistent format
4. Handle various documentation formats
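One way to picture "a consistent format" is a small record type per extracted entry. This schema is a hypothetical sketch, not the one the example's evaluator actually checks:

```python
from dataclasses import dataclass, field

@dataclass
class FunctionDoc:
    """One extracted API entry from a documentation page (illustrative schema)."""
    name: str                     # e.g. "json.dumps"
    signature: str                # e.g. "dumps(obj, *, skipkeys=False, ...)"
    description: str = ""         # summary paragraph from the docs
    parameters: dict = field(default_factory=dict)  # param name -> description

doc = FunctionDoc(name="json.loads", signature="loads(s, *, cls=None)")
```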

This is an ideal problem for optillm because:
- The LLM benefits from seeing actual documentation HTML structure
- Accuracy is crucial for correct parsing
- Different documentation sites have different formats

## Architecture

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   OpenEvolve    │────▶│     optillm     │────▶│    Local LLM    │
│                 │     │  (proxy:8000)   │     │  (Qwen3-0.6B)   │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │
                                 ├── readurls plugin
                                 │   (fetches web content)
                                 │
                                 └── MoA optimization
                                     (improves accuracy)
```

## Setup Instructions

### 1. Install and Configure optillm

```bash
# Clone optillm
git clone https://github.com/codelion/optillm.git
cd optillm

# Install dependencies
pip install -r requirements.txt

# Start the optillm proxy with its local inference server (in a separate terminal)
export OPTILLM_API_KEY=optillm
python optillm.py --port 8000
```

optillm will now be running on `http://localhost:8000` with its built-in local inference server.

**Note for Non-Mac Users**: This example uses `Qwen/Qwen3-0.6B-MLX-bf16`, which is optimized for Apple Silicon (M1/M2/M3 chips). If you're not using a Mac, you should:

1. **For NVIDIA GPUs**: Use a CUDA-compatible model like:
   - `Qwen/Qwen2.5-32B-Instruct` (best quality, high VRAM)
   - `Qwen/Qwen2.5-14B-Instruct` (good balance)
   - `meta-llama/Llama-3.1-8B-Instruct` (efficient option)
   - `Qwen/Qwen2.5-7B-Instruct` (lower VRAM)

2. **For CPU-only**: Use a smaller model like:
   - `Qwen/Qwen2.5-7B-Instruct` (7B parameters)
   - `meta-llama/Llama-3.2-3B-Instruct` (3B parameters)
   - `Qwen/Qwen2.5-3B-Instruct` (3B parameters)

3. **Update the config**: Replace the model names in `config.yaml` with your chosen model:
   ```yaml
   models:
     - name: "readurls-your-chosen-model"
       weight: 0.6
     - name: "moa&readurls-your-chosen-model"
       weight: 0.4
   ```

### 2. Install Web Scraping Dependencies

```bash
# Install required Python packages for the example
pip install -r examples/web_scraper_optillm/requirements.txt
```

### 3. Run the Evolution

```bash
# From the openevolve root directory
export OPENAI_API_KEY=optillm
python openevolve-run.py examples/web_scraper_optillm/initial_program.py \
  examples/web_scraper_optillm/evaluator.py \
  --config examples/web_scraper_optillm/config.yaml \
  --iterations 100
```

The configuration demonstrates both optillm capabilities:
- **Primary model (90%)**: `readurls-Qwen/Qwen3-0.6B-MLX-bf16` - fetches URLs mentioned in prompts
- **Secondary model (10%)**: `moa&readurls-Qwen/Qwen3-0.6B-MLX-bf16` - uses Mixture of Agents for improved accuracy
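Conceptually, those weights amount to a weighted random draw per LLM call. This is only a sketch of the idea — the actual sampling logic lives inside OpenEvolve, and the weights here mirror the percentages above:

```python
import random

# (model name, sampling weight) pairs, mirroring the 90/10 split above
models = [
    ("readurls-Qwen/Qwen3-0.6B-MLX-bf16", 0.9),
    ("moa&readurls-Qwen/Qwen3-0.6B-MLX-bf16", 0.1),
]

def pick_model(rng: random.Random = random) -> str:
    """Weighted random choice: most calls use the cheaper readurls-only model."""
    names, weights = zip(*models)
    return rng.choices(names, weights=weights, k=1)[0]
```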

## How It Works

### 1. readurls Plugin

When the evolution prompt contains URLs (e.g., "Parse the documentation at https://docs.python.org/3/library/json.html"), the readurls plugin:
1. Detects the URL in the prompt
2. Fetches the webpage content
3. Extracts text and table data
4. Appends it to the prompt as context

This ensures the LLM has access to the latest documentation structure when generating code.
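The detection step can be approximated with a simple regex scan; this is a rough sketch for intuition, and the plugin's real pattern and fetching logic may differ:

```python
import re

# Naive URL matcher: stop at whitespace and common trailing punctuation
URL_RE = re.compile(r"https?://[^\s)\"'>]+")

def find_urls(prompt: str) -> list[str]:
    """Return all URLs detected in a prompt, in order of appearance."""
    return URL_RE.findall(prompt)

urls = find_urls("Parse the documentation at https://docs.python.org/3/library/json.html")
# urls == ["https://docs.python.org/3/library/json.html"]
```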

### 2. Mixture of Agents (MoA)

The MoA technique improves accuracy by:
1. Generating 3 different solutions to the problem
2. Having each "agent" critique all solutions
3. Synthesizing a final, improved solution based on the critiques

This is particularly valuable for complex parsing logic where multiple approaches might be valid.
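The three steps above can be sketched as a control-flow skeleton with the LLM calls stubbed out. optillm's actual MoA implementation handles the prompting and aggregation details; this only shows the shape:

```python
def mixture_of_agents(generate, critique, synthesize, n: int = 3) -> str:
    """Skeleton of MoA: n candidate answers, cross-critique, final synthesis."""
    candidates = [generate() for _ in range(n)]          # step 1: n solutions
    critiques = [critique(candidates) for _ in range(n)]  # step 2: each agent reviews all
    return synthesize(candidates, critiques)              # step 3: merged answer

# Stubbed demo: each "LLM call" is a plain function here.
answer = mixture_of_agents(
    generate=lambda: "candidate",
    critique=lambda cands: f"critique of {len(cands)} candidates",
    synthesize=lambda cands, crits: max(cands, key=len),
)
```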

### 3. Evolution Process

1. **Initial Program**: A basic BeautifulSoup scraper that extracts simple text
2. **Evaluator**: Tests the scraper against real documentation pages, checking:
   - Correct extraction of function names
   - Accurate parameter parsing
   - Proper handling of edge cases
3. **Evolution**: The LLM improves the scraper by:
   - Fetching actual documentation HTML (via readurls)
   - Generating multiple parsing strategies (via MoA)
   - Learning from evaluation feedback
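The evaluator's feedback signal can be pictured as comparing extracted function names against a ground-truth set. This is a toy scoring sketch, not the actual logic in `evaluator.py`:

```python
def score_extraction(extracted: set, expected: set) -> float:
    """F1-style score: balances missed entries against spurious ones."""
    if not extracted or not expected:
        return 0.0
    hits = len(extracted & expected)
    precision = hits / len(extracted)
    recall = hits / len(expected)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

score = score_extraction({"json.loads", "json.dump"}, {"json.loads", "json.dumps"})
# precision = 0.5, recall = 0.5, so score == 0.5
```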

## Example Evolution Trajectory

**Generation 1** (Basic scraper):
```python
# Simple text extraction
soup = BeautifulSoup(html, 'html.parser')
text = soup.get_text()
```

**Generation 10** (With readurls context):
```python
# Targets specific documentation structures
functions = soup.find_all('dl', class_='function')
for func in functions:
    name = func.find('dt').get('id')
    desc = func.find('dd').text
```

**Generation 50** (With MoA refinement):
```python
# Robust parsing with error handling
def extract_function_docs(soup):
    # Multiple strategies for different doc formats
    strategies = [
        lambda: soup.select('dl.function dt'),
        lambda: soup.select('.sig-name'),
        lambda: soup.find_all('code', class_='descname')
    ]

    for strategy in strategies:
        try:
            results = strategy()
            if results:
                return parse_results(results)
        except Exception:
            continue
    return []  # no strategy matched
```

## Monitoring Progress

Watch the evolution progress and see how optillm enhances the process:

```bash
# View optillm logs (in the terminal running optillm)
# You'll see:
# - URLs being fetched by readurls
# - Multiple completions generated by MoA
# - Final synthesized responses

# View OpenEvolve logs
tail -f examples/web_scraper_optillm/openevolve_output/evolution.log
```

## Results

After evolution, you should see:
1. **Improved Accuracy**: The scraper correctly handles various documentation formats
2. **Better Error Handling**: Robust parsing that doesn't break on edge cases
3. **Optimized Performance**: Efficient extraction strategies

Compare the checkpoints to see the evolution:
```bash
# Initial vs evolved program
diff examples/web_scraper_optillm/openevolve_output/checkpoints/checkpoint_10/best_program.py \
     examples/web_scraper_optillm/openevolve_output/checkpoints/checkpoint_100/best_program.py
```

## Key Insights

1. **Documentation Access Matters**: The readurls plugin significantly improves the LLM's ability to generate correct parsing code by providing actual HTML structure

2. **Test-Time Compute Works**: MoA's multiple generation and critique approach produces more robust solutions than single-shot generation

3. **Local Models Go Further**: Even a small local model like Qwen3-0.6B becomes substantially more capable when enhanced with optillm techniques, and larger options like Qwen2.5-32B benefit as well

## Customization

You can experiment with different optillm features by modifying `config.yaml`:

1. **Different Plugins**: Try the `executecode` plugin for runtime validation
2. **Other Techniques**: Experiment with `cot_reflection`, `rstar`, or `bon`
3. **Model Combinations**: Adjust weights or try different technique combinations

Example custom configuration:
```yaml
llm:
  models:
    - name: "cot_reflection&readurls-Qwen/Qwen3-0.6B-MLX-bf16"
      weight: 0.7
    - name: "moa&executecode-Qwen/Qwen3-0.6B-MLX-bf16"
      weight: 0.3
```

## Troubleshooting

1. **optillm not responding**: Ensure it's running on port 8000 with `OPTILLM_API_KEY=optillm`
2. **Model not found**: Make sure optillm's local inference server is working (check optillm logs)
3. **Slow evolution**: MoA generates multiple completions, so it's slower but more accurate

## Further Reading

- [optillm Documentation](https://github.com/codelion/optillm)
- [OpenEvolve Configuration Guide](../../configs/default_config.yaml)
- [Mixture of Agents Paper](https://arxiv.org/abs/2406.04692)