Commit 78b2aef

Ollama GEPA issue
1 parent 745f6ae commit 78b2aef

11 files changed: +362 −46 lines

CHANGELOG.md

Lines changed: 21 additions & 8 deletions

@@ -8,14 +8,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]

 ### Added
-- Initial release of CodeOptiX
-- GEPA optimization engine
-- Bloom evaluation framework
-- Built-in behaviors: insecure-code, vacuous-tests, plan-drift
-- Support for multiple coding agents (Claude Code, Codex, Gemini CLI)
-- Multi-provider LLM support (OpenAI, Anthropic, Google, Ollama)
-- CI/CD integration
-- Comprehensive documentation and examples

 ### Changed

@@ -27,6 +19,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Security

+## [0.1.3] - 2025-12-27
+
+### Added
+- Ollama integration demo script (`examples/ollama_demo.py`) showcasing working local evaluations
+- Updated documentation highlighting Ollama integration fixes
+
+### Fixed
+- **Ollama Integration**: Fixed Ollama models to properly generate code and provide meaningful evaluation scores instead of always returning 100%. Now uses Ollama's chat API for better conversation handling and includes a working demo script.
+
+## [0.1.2] - 2025-12-26
+
+### Added
+- Initial release of CodeOptiX
+- GEPA optimization engine
+- Bloom evaluation framework
+- Built-in behaviors: insecure-code, vacuous-tests, plan-drift
+- Support for multiple coding agents (Claude Code, Codex, Gemini CLI)
+- Multi-provider LLM support (OpenAI, Anthropic, Google, Ollama)
+- CI/CD integration
+- Comprehensive documentation and examples
+
 ## [0.1.0] - 2025-12-26

 ### Added
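The 0.1.3 fix above says Ollama calls now go through Ollama's chat API. For readers curious what that endpoint looks like, here is a minimal stdlib-only sketch of `/api/chat` (the helper names and model name are illustrative; this is not CodeOptiX's actual adapter code):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address


def build_chat_request(model: str, prompt: str) -> dict:
    # /api/chat takes a messages list of role/content pairs, similar to
    # the OpenAI chat format; stream=False asks for one JSON object
    # instead of a stream of chunks.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }


def chat(model: str, prompt: str, base_url: str = OLLAMA_URL) -> str:
    payload = json.dumps(build_chat_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/api/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Non-streaming responses carry the reply under message.content.
    return body["message"]["content"]


# Example (requires a running Ollama server with the model pulled):
# chat("llama3.2:3b", "Write a Python function that checks password strength.")
```

Using the chat endpoint rather than plain completion is what lets a model return code in a structured reply instead of a conversational preamble, which matches the behavior described in the fix.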

docs/guides/ollama-integration.md

Lines changed: 48 additions & 19 deletions

@@ -1,6 +1,6 @@
 # Ollama Integration Guide

-CodeOptiX supports local Ollama models, allowing you to run evaluations without API keys!
+CodeOptiX supports local Ollama models, allowing you to run evaluations without API keys! **Now working correctly** - generates code and provides proper security evaluations.

 ---

@@ -14,6 +14,27 @@ CodeOptiX supports local Ollama models, allowing you to run evaluations without

 ---

+## ✅ Recent Updates
+
+**CodeOptiX now works correctly with Ollama!** Recent fixes ensure:
+
+- ✅ Proper code generation (not conversational responses)
+- ✅ Accurate security evaluations (detects real issues)
+- ✅ Meaningful scores (not always 100%)
+- ✅ Full evaluation pipeline support
+
+### 🚀 Try the Demo
+
+Test the Ollama integration with our interactive demo:
+
+```bash
+python examples/ollama_demo.py
+```
+
+This demo shows Ollama generating code, detecting security issues, and providing proper evaluation scores.
+
+---
+
 ## 📦 Installation

 ### Step 1: Install Ollama
@@ -140,11 +161,11 @@ export OLLAMA_BASE_URL=http://remote-server:11434

 | Model | Size | Speed | Quality | Use Case |
 |-------|------|-------|---------|----------|
-| `llama3.1:8b` | 4.9 GB | ⚡⚡⚡ | ⭐⭐⭐ | Fast, efficient |
-| `qwen3:8b` | 5.2 GB | ⚡⚡⚡ | ⭐⭐⭐ | Alternative 8B |
-| `gpt-oss:120b` | 65 GB | ⚡ | ⭐⭐⭐⭐⭐ | Best quality |
-| `gpt-oss:20b` | 13 GB | ⚡⚡ | ⭐⭐⭐⭐ | Good balance |
-| `llama3.2:3b` | 2.0 GB | ⚡⚡⚡ | ⭐⭐ | Lightweight |
+| `llama3.2:3b` | 2.0 GB | ⚡⚡⚡ | ⭐⭐⭐ | **Best for CodeOptiX** - Fast, reliable code generation |
+| `llama3.1:8b` | 4.9 GB | ⚡⚡⚡ | ⭐⭐⭐ | Good balance, works well |
+| `qwen3:8b` | 5.2 GB | ⚡⚡ | ⭐⭐⭐ | Alternative 8B model |
+| `gpt-oss:20b` | 13 GB | ⚡⚡ | ⭐⭐⭐⭐ | High quality, slower |
+| `gpt-oss:120b` | 65 GB | ⚡ | ⭐⭐⭐⭐⭐ | Best quality, requires powerful hardware |

 ### List Available Models

@@ -162,7 +183,17 @@ ollama pull <model-name>

 ## 💡 Usage Examples

-### Example 1: Basic Evaluation
+### Example 1: Try the Interactive Demo ⭐
+
+See Ollama working in action:
+
+```bash
+python examples/ollama_demo.py
+```
+
+This demo shows code generation, security evaluation, and proper scoring.
+
+### Example 2: Basic Evaluation

 ```bash
 codeoptix eval \
@@ -171,7 +202,7 @@ codeoptix eval \
   --llm-provider ollama
 ```

-### Example 2: With Custom Config
+### Example 3: With Custom Config

 ```bash
 codeoptix eval \
@@ -181,7 +212,7 @@ codeoptix eval \
   --llm-provider ollama
 ```

-### Example 3: Multiple Behaviors
+### Example 4: Multiple Behaviors

 ```bash
 codeoptix eval \
@@ -190,7 +221,7 @@ codeoptix eval \
   --llm-provider ollama
 ```

-### Example 4: Verbose Output
+### Example 5: Verbose Output

 ```bash
 codeoptix eval \
@@ -200,7 +231,7 @@ codeoptix eval \
   --verbose
 ```

-### Example 5: CI/CD Integration
+### Example 6: CI/CD Integration

 ```yaml
 # .github/workflows/codeoptix.yml
@@ -300,17 +331,15 @@ export OLLAMA_BASE_URL=http://localhost:11435
 - You need maximum speed
 - You're okay with API costs

-### ⚠️ Limitations
-
-While Ollama works great for evaluations, there are some limitations:
+### ⚠️ Known Limitations

 #### Evolution Support
-- **Limited support for `codeoptix evolve`**: The evolution feature uses GEPA optimization, which requires processing very long prompts. Ollama may fail with 404 errors or timeouts on complex evolution tasks.
-- **Recommendation**: Use cloud providers (OpenAI, Anthropic, Google) for full evolution capabilities. For basic evolution testing, try smaller models like `llama3.1:8b` with minimal iterations.
+- **Limited support for `codeoptix evolve`**: The evolution feature uses GEPA optimization, which requires processing very long prompts. Ollama may fail with timeouts on complex evolution tasks.
+- **Recommendation**: Use cloud providers (OpenAI, Anthropic, Google) for full evolution capabilities.

-#### Performance
-- Large models (e.g., `gpt-oss:120b`) require significant RAM and may be slow on consumer hardware.
-- Evolution tasks are computationally intensive and may not complete reliably with Ollama.
+#### Performance Considerations
+- Large models (e.g., `gpt-oss:120b`) require significant RAM and may be slow on consumer hardware
+- Evolution tasks are computationally intensive and may not complete reliably with Ollama

 For advanced features like evolution, consider cloud providers or contact us for tailored enterprise solutions.

docs/index.md

Lines changed: 1 addition & 1 deletion

@@ -56,7 +56,7 @@ When AI coding agents dazzle with impressive code but leave you wondering about
 !!! tip "Ollama Support - No API Key Required!"
     **CodeOptiX supports Ollama** for evaluations - use local models without API keys:

-    - ✅ **Ollama integration** - Run evaluations with local models
+    - ✅ **Working Ollama integration** - Generates code and provides proper security evaluations
     - ✅ **No API key needed** - Perfect for open-source users
     - ✅ **Privacy-friendly** - All processing happens locally
     - ✅ **Free to use** - No cloud costs

examples/README.md

Lines changed: 37 additions & 9 deletions

@@ -31,7 +31,23 @@ python examples/basic_adapter_usage.py
 - Executing tasks with different agents
 - Handling agent outputs

-### 3. Behavioral Spec Example (`behavioral_spec_example.py`) ⭐
+### 3. Ollama Local Demo (`ollama_demo.py`) ⭐
+
+**Local Ollama integration demo** showing that CodeOptiX now works correctly with Ollama.
+
+```bash
+python examples/ollama_demo.py
+```
+
+**What it shows:**
+- Ollama code generation working properly
+- Security evaluation detecting real issues
+- Proper scoring (not always 100%)
+- Local, privacy-friendly evaluations
+
+This is the **recommended starting point** for users who want to use CodeOptiX locally with Ollama.
+
+### 4. Behavioral Spec Example (`behavioral_spec_example.py`)

 **Complete end-to-end example** demonstrating a real-world behavioral spec scenario.

@@ -45,7 +61,7 @@ python examples/behavioral_spec_example.py
 - Real scenario: Database connection with secret management
 - Complete workflow from agent execution to prompt evolution

-This is the **recommended starting point** for understanding how CodeOptiX works in practice.
+This is the **recommended starting point** for understanding how CodeOptiX works in practice with cloud providers.

 ## Behavioral Spec Scenarios

@@ -89,23 +105,35 @@ pip install -e ".[dev,docs]"
 uv sync --dev --extra docs
 ```

-2. Set API keys (at least one):
-```bash
-export OPENAI_API_KEY="your-key"
-export ANTHROPIC_API_KEY="your-key"
-export GOOGLE_API_KEY="your-key"
-```
+2. Choose your LLM provider:
+
+**For local Ollama usage:**
+```bash
+# Install Ollama: https://ollama.com
+ollama serve  # Start Ollama server
+ollama pull llama3.2:3b  # Pull a model
+```
+
+**For cloud providers (set at least one API key):**
+```bash
+export OPENAI_API_KEY="your-key"
+export ANTHROPIC_API_KEY="your-key"
+export GOOGLE_API_KEY="your-key"
+```

 ### Run Examples

 ```bash
+# Ollama local demo (recommended for local usage)
+python examples/ollama_demo.py
+
 # Quick start with single behavior
 python examples/quickstart-single-behavior.py

 # Basic adapter usage
 python examples/basic_adapter_usage.py

-# Complete behavioral spec example (recommended)
+# Complete behavioral spec example (recommended for cloud providers)
 python examples/behavioral_spec_example.py
 ```
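The "choose your LLM provider" step added to examples/README.md boils down to a simple precedence check: use a cloud provider if its key is set, otherwise fall back to local Ollama. A hypothetical helper sketching that logic (`pick_provider` and the key ordering are illustrative, not part of CodeOptiX):

```python
import os

# Cloud keys checked in the order the setup step lists them; when none
# is set, fall back to local Ollama, which needs no API key at all.
_PROVIDER_KEYS = [
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
]


def pick_provider(env=None) -> str:
    """Return the first cloud provider whose key is set, else 'ollama'."""
    env = os.environ if env is None else env
    for provider, key in _PROVIDER_KEYS:
        if env.get(key):
            return provider
    return "ollama"
```

A helper like this could feed the `--llm-provider` flag shown in the usage examples, so the same script runs unchanged on a laptop with Ollama or in CI with a cloud key.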

0 commit comments