**File: docs/guides/gepa-optimization.md** (24 additions, 0 deletions)
```bash
super agent optimize assistant_microsoft --auto medium --framework microsoft --reflection-lm ollama:llama3.1:8b  # Microsoft
super agent optimize research_agent_deepagents --auto medium --framework deepagents --reflection-lm ollama:llama3.1:8b  # DeepAgents
```
**💡 About Reflection Models**

The `--reflection-lm` parameter specifies which model GEPA uses to analyze evaluation results and suggest prompt improvements. We typically recommend using a **smaller, faster model** for reflection:
**Why use a smaller reflection model (e.g., llama3.1:8b)?**

- ✅ **Speed**: GEPA runs the reflection model many times (10-50+ iterations). Smaller models make optimization 5-10x faster
- ✅ **Resources**: Reduces memory and compute requirements significantly
- ✅ **Good Enough**: The reflection task (analyzing results, suggesting improvements) is simpler than the agent's actual task
**Example:**

```bash
# Your agent uses gpt-oss:20b (20B parameters)
# But reflection uses llama3.1:8b (8B parameters) - much faster!
super agent optimize my_agent --auto medium --reflection-lm ollama:llama3.1:8b
```
**You can use a larger reflection model if needed:**

```bash
# For more sophisticated prompt improvements (slower)
super agent optimize my_agent --auto medium --reflection-lm ollama:gpt-oss:70b
```
**File: docs/guides/multi-framework.md** (12 additions, 3 deletions)
```bash
super agent evaluate my_agent

# 4. Optimize with GEPA (works on ALL frameworks!)
super agent optimize my_agent --auto medium --framework <framework> --reflection-lm ollama:llama3.1:8b

# 💡 Why --reflection-lm ollama:llama3.1:8b?
# The reflection model runs many times during optimization to analyze results
# and suggest improvements. Using a smaller, faster model (8b vs 20b/70b):
# ✅ Speeds up optimization 5-10x
# ✅ Reduces memory/resource usage
# ✅ Provides good enough reflections (simpler task than the actual agent)

# 5. Re-evaluate
super agent evaluate my_agent  # automatically loads optimized weights
```
- [Evaluation & Testing](evaluation-testing.md)
- [SuperSpec DSL](superspec.md)

### Tutorials

- [**OpenAI SDK + GEPA Optimization Tutorial**](../tutorials/openai-sdk-gepa-optimization.md) - Complete step-by-step guide to building custom agents with native OpenAI SDK patterns and optimizing them with GEPA
---

Ready to build your own optimized agent? Start with the [OpenAI SDK + GEPA Tutorial](../tutorials/openai-sdk-gepa-optimization.md)!
**Key Insight:** OpenAI SDK achieves a better baseline with Ollama!

**Note:** Actual performance depends on your specific use case, model choice, and BDD scenarios. Always evaluate with your own data.

---
This is based on the official OpenAI Agents SDK example for Ollama!

---

## 🎯 The SuperOptiX Multi-Framework Advantage

### One Playbook, Multiple Frameworks

SuperOptiX allows you to write your agent specification once and compile to any supported framework:

```bash
# Same playbook, different frameworks
super agent compile my_agent --framework dspy
super agent compile my_agent --framework openai
super agent compile my_agent --framework deepagents

# GEPA optimization works across all frameworks
super agent optimize my_agent --auto medium
```

### When to Use Each Framework

**Choose OpenAI SDK when:**

- You want simple, straightforward agent design
- You're using Ollama for local development
- You need fast prototyping and iteration
- Your use case is simple to moderate complexity

**Choose DSPy when:**

- You need maximum optimization flexibility
- You want to optimize multiple components (signatures)
- You have well-defined, focused tasks
- You want proven optimization improvements

**Choose DeepAgents when:**

- You need complex planning capabilities
- You're using cloud models (Claude/GPT-4)
- You need filesystem context management
- Your task requires sophisticated multi-step reasoning

---

## 💡 Tips & Best Practices
## ❓ FAQ

**Q: Why use OpenAI SDK instead of DSPy?**

A: OpenAI SDK has a simpler, more straightforward API. It works well with Ollama out of the box. Choose DSPy when you need to optimize multiple components (signatures) or want maximum optimization flexibility.

**Q: Does it work with Ollama?**

A: Yes! OpenAI SDK has full Ollama support. Unlike DeepAgents (which has LangChain function-calling limitations), OpenAI SDK works seamlessly with local models.
**Q: Can I use cloud models?**

A: Yes! Configure your playbook with `provider: openai` and set the `OPENAI_API_KEY` environment variable. Supports OpenAI, Anthropic, and other providers.
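For illustration only, a playbook's model configuration might look like the fragment below. The exact field names are assumptions (only `provider: openai`, the `gpt-4.1` model name, and `OPENAI_API_KEY` come from this guide), so check the SuperSpec DSL reference for the real schema:

```yaml
spec:
  language_model:
    provider: openai             # cloud provider instead of ollama
    model: gpt-4.1
    api_key: ${OPENAI_API_KEY}   # read from the environment, never hard-coded
```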
**Q: Does GEPA optimize OpenAI SDK agents?**

A: Yes! Universal GEPA optimizes the `instructions` field. While OpenAI SDK has fewer optimization targets than DSPy (which optimizes all signatures), GEPA can still improve performance by refining the agent instructions.
**Q: Can I use tools with OpenAI SDK agents?**

A: Yes! Define tools in your playbook under `tools.specific_tools` and implement them using the `@function_tool` decorator in your pipeline code.
**Q: What about multi-agent workflows?**

A: OpenAI SDK supports multi-agent patterns through `handoffs`, where one agent can delegate to another. This is similar to CrewAI's crew concept but with a simpler API.
**Q: How does performance compare to other frameworks?**

A: Performance varies by use case, model, and hardware. OpenAI SDK typically has good baseline performance with Ollama. Run your own evaluations with `super agent evaluate` to measure performance for your specific use case.
---