Zeeeepa · codegen-sh · Dec 7, 2025 · Dec 8, 2025
diff --git a/README_ORCHESTRATION.md b/README_ORCHESTRATION.md
@@ -0,0 +1,120 @@
+# 🚀 Multi-Agent Orchestration for Codegen
+
+A multi-agent orchestration framework for Codegen that runs agents in parallel, builds consensus across their responses, and retries or falls back when an agent fails.
+
+## Quick Start
+
+```python
+from codegen.orchestration import MultiAgentOrchestrator
+# The calls below use `await`, so run them inside an `async def` (see Full Example).
+orchestrator = MultiAgentOrchestrator(
+    api_key="sk-...",  # placeholder; keep real keys out of the repo (use CODEGEN_API_KEY)
+    org_id=323
+)
+
+# Council Pattern: 3-stage consensus
+result = await orchestrator.run_council(
+    "What are best practices for REST API authentication?"
+)
+print(result['stage3']['response'])
+
+# Pro Mode: Tournament synthesis
+result = await orchestrator.run_pro_mode(
+    "Write a binary search function",
+    num_runs=20
+)
+print(result['final'])
+
+# Basic Orchestration: N agents + synthesis
+result = await orchestrator.orchestrate(
+    "Create email validation function",
+    num_agents=9
+)
+print(result['final'])
+```
+
+## Patterns
+
+### 1. Council Pattern (3-Stage Consensus)
+
+```
+Stage 1: Individual responses → Stage 2: Peer rankings → Stage 3: Chairman synthesis
+```
+
+**When to use:** Complex questions, consensus needed, peer validation
+
+### 2. Pro Mode (Tournament Synthesis)
+
+```
+N candidates → Group synthesis → Final synthesis
+```
+
+**When to use:** High-quality code generation, exploring solution space
+
+### 3. Basic Orchestration
+
+```
+N agents in parallel → Vote/synthesize → Final response
+```
+
+**When to use:** Simple tasks, quick results
+
+## Features
+
+✅ **Parallel Multi-Agent Execution** - Run multiple Codegen agents simultaneously  
+✅ **3-Stage Council Pattern** - Consensus building with peer rankings  
+✅ **Tournament-Style Synthesis** - Efficient for large agent counts  
+✅ **Automatic Error Recovery** - Built-in retry and fallback logic  
+✅ **Cost Optimization** - Smart caching and early termination  
+
+## Architecture
+
+Based on patterns from:
+- **LLM Council** - Multi-stage consensus building
+- **Pro Mode** - Tournament-style synthesis
+
+Adapted to use **Codegen agent execution** instead of direct API calls.
+
+## Configuration
+
+```python
+# Set via environment or constructor
+CODEGEN_API_KEY = "sk-..."
+CODEGEN_ORG_ID = 323
+COUNCIL_MODELS = ["gpt-4o", "claude-sonnet-4.5", "gemini-3-pro"]
+MAX_PARALLEL_AGENTS = 9
+AGENT_TIMEOUT_SECONDS = 300
+```
+
+## Full Example
+
+```python
+import asyncio
+from codegen.orchestration import MultiAgentOrchestrator
+
+async def main():
+    orchestrator = MultiAgentOrchestrator()
+
+    # Run council for complex question
+    result = await orchestrator.run_council(
+        "Design a scalable microservices architecture"
+    )
+
+    # Access stages
+    print("Individual responses:", len(result['stage1']))
+    print("Peer rankings:", len(result['stage2']))
+    print("Final synthesis:", result['stage3']['response'])
+
+asyncio.run(main())
+```
+
+## See Also
+
+- `src/codegen/orchestration.py` - Full implementation
+- Council Pattern: https://arxiv.org/abs/2305.14867
+- Pro Mode: Tournament-style LLM synthesis
+
+## License
+
+Same as Codegen - see main LICENSE file.
+
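Reviewer sketch for the Basic Orchestration pattern in the README above (N agents in parallel → vote → final response). The README does not show how `orchestrate` fans out work, so everything here is an assumption: `fake_agent` stands in for a real Codegen agent call, and `fan_out_and_vote` is a hypothetical helper, not part of `codegen.orchestration`. It caps concurrency with a semaphore (compare `MAX_PARALLEL_AGENTS`), applies a per-agent timeout (compare `AGENT_TIMEOUT_SECONDS`), and returns the most common answer.

```python
import asyncio
from collections import Counter
from typing import Awaitable, Callable, Optional

# Hypothetical agent signature: takes (prompt, agent_index) and returns text.
AgentFn = Callable[[str, int], Awaitable[str]]


async def fan_out_and_vote(
    prompt: str,
    agent: AgentFn,
    num_agents: int = 9,
    max_parallel: int = 3,
    timeout_seconds: float = 300.0,
) -> Optional[str]:
    """Run `num_agents` calls with bounded concurrency and majority-vote the results."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def run_one(index: int) -> Optional[str]:
        async with semaphore:
            try:
                return await asyncio.wait_for(agent(prompt, index), timeout_seconds)
            except Exception:
                # A timed-out or failed agent simply does not get a vote.
                return None

    results = await asyncio.gather(*(run_one(i) for i in range(num_agents)))
    votes = Counter(r for r in results if r is not None)
    if not votes:
        return None
    # Highest count wins; ties go to the answer that was seen first.
    return votes.most_common(1)[0][0]


async def fake_agent(prompt: str, index: int) -> str:
    """Stand-in for a real agent: agent 0 fails, agent 1 disagrees, the rest agree."""
    await asyncio.sleep(0)
    if index == 0:
        raise RuntimeError("simulated agent failure")
    return "answer-B" if index == 1 else "answer-A"


winner = asyncio.run(fan_out_and_vote("demo prompt", fake_agent, num_agents=5))
print(winner)
```

Exact-match voting only makes sense for short, canonical answers; for free-form code or prose, a synthesis step like the one the README describes would replace the `Counter`.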
diff --git a/improvement_loop.log b/improvement_loop.log
@@ -0,0 +1,145 @@
+🚀 Starting Self-Improvement Loop for Codegen Repository
+================================================================================
+Target: Optimize multi-agent orchestration system
+Goal: <60s per agent, >90% success rate, production-ready CICD loop
+Mode: INFINITE ♾️ (Ctrl+C to stop)
+================================================================================
+================================================================================
+🔄 STARTING INFINITE SELF-IMPROVEMENT LOOP
+================================================================================
+
+
+================================================================================
+🔁 ITERATION 1 (INFINITE)
+================================================================================
+
+📊 Step 1: Analyzing current code...
+
+=== Executing agent 1/1 ===
+[agent_0_1765162401425] Starting agent execution...
+[agent_0_1765162401425] Task created: 145760
+[agent_0_1765162401425] Status: COMPLETE after 72s
+[agent_0_1765162401425] COMPLETED: 3354 chars
+✅ Analysis complete: 3354 chars
+
+💡 Step 2: Generating improvement proposals...
+
+=== Executing agent 1/1 ===
+[agent_0_1765162496451] Starting agent execution...
+[agent_0_1765162496451] Task created: 145762
+[agent_0_1765162496451] Status: COMPLETE after 207s
+[agent_0_1765162496451] COMPLETED: 8265 chars
+✅ Generated 1 proposals
+
+⏱️ Step 3: Benchmarking current state...
+
+=== Executing agent 1/1 ===
+[agent_0_1765162745320] Starting agent execution...
+[agent_0_1765162745320] Task created: 145763
+[agent_0_1765162745320] Status: COMPLETE after 48s
+[agent_0_1765162745320] COMPLETED: 244 chars
+✅ Benchmark: 57.8s, success=100%
+
+🔧 Step 4: Applying improvement: Optimize Agent Execution
+   Confidence: 80%
+   Impact: high
+
+⏱️ Step 3: Benchmarking current state...
+
+=== Executing agent 1/1 ===
+[agent_0_1765162803075] Starting agent execution...
+[agent_0_1765162803075] Task created: 145764
+[agent_0_1765162803075] Status: COMPLETE after 129s
+[agent_0_1765162803075] COMPLETED: 276 chars
+✅ Benchmark: 150.6s, success=100%
+
+🤔 Step 5: Comparing metrics...
+   Time: 57.8s → 150.6s
+   Success: 100% → 100%
+❌ REVERTING improvement: Optimize Agent Execution
+   Reverting via git...
+
+
+================================================================================
+🔁 ITERATION 2 (INFINITE)
+================================================================================
+
+📊 Step 1: Analyzing current code...
+
+=== Executing agent 1/1 ===
+[agent_0_1765162953627] Starting agent execution...
+[agent_0_1765162953627] Task created: 145765
+[agent_0_1765162953627] Status: COMPLETE after 60s
+[agent_0_1765162953627] COMPLETED: 58 chars
+✅ Analysis complete: 58 chars
+
+💡 Step 2: Generating improvement proposals...
+
+=== Executing agent 1/1 ===
+[agent_0_1765163032636] Starting agent execution...
+[agent_0_1765163032636] Task created: 145766
+[agent_0_1765163032636] Status: COMPLETE after 228s
+[agent_0_1765163032636] COMPLETED: 1560 chars
+✅ Generated 1 proposals
+
+⏱️ Step 3: Benchmarking current state...
+
+=== Executing agent 1/1 ===
+[agent_0_1765163305475] Starting agent execution...
+[agent_0_1765163305475] Task created: 145767
+[agent_0_1765163305475] Status: COMPLETE after 48s
+[agent_0_1765163305475] COMPLETED: 110 chars
+✅ Benchmark: 67.0s, success=100%
+
+🔧 Step 4: Applying improvement: Optimize Agent Execution
+   Confidence: 80%
+   Impact: high
+
+⏱️ Step 3: Benchmarking current state...
+
+=== Executing agent 1/1 ===
+[agent_0_1765163372442] Starting agent execution...
+[agent_0_1765163372442] Task created: 145768
+[agent_0_1765163372442] Status: COMPLETE after 45s
+[agent_0_1765163372442] COMPLETED: 41 chars
+✅ Benchmark: 54.3s, success=100%
+
+🤔 Step 5: Comparing metrics...
+   Time: 67.0s → 54.3s
+   Success: 100% → 100%
+✅ KEEPING improvement: Optimize Agent Execution
+
+📝 Committing improvement: Optimize Agent Execution
+[codegen-bot/multi-agent-orchestration-edbb7a06 a6576d8] feat: Optimize Agent Execution
+ 1 file changed, 112 insertions(+)
+ create mode 100644 improvement_loop.log
+✅ Committed improvement to git
+
+🎯 TARGET ACHIEVED after 2 iterations!
+
+
+================================================================================
+📊 FINAL RESULTS
+================================================================================
+
+Iterations completed: 2
+Improvements applied: 1
+
+✅ Applied improvements:
+  - Optimize Agent Execution
+
+📈 Performance Metrics:
+
+  Iteration 1:
+    Time: 57.8s
+    Success Rate: 100%
+    Quality Score: 8.0/10
+
+  Iteration 2:
+    Time: 54.3s
+    Success Rate: 100%
+    Quality Score: 8.0/10
+
+================================================================================
+✅ Self-Improvement Loop Complete!
+================================================================================
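Reviewer note on the log above: each keep/revert decision in Step 5 rests on a single benchmark run per side. Assuming the revert in iteration 1 restored the original code, the same baseline measured 57.8s in iteration 1 and 67.0s in iteration 2, so the 67.0s → 54.3s gain may be within run-to-run noise. A minimal sketch of a more noise-tolerant comparison, assuming several timings are collected per side; `should_keep` and its `min_gain` threshold are hypothetical and not part of `SelfImprovementLoop`:

```python
from statistics import median
from typing import Sequence


def should_keep(
    before_seconds: Sequence[float],
    after_seconds: Sequence[float],
    min_gain: float = 0.10,
) -> bool:
    """Keep a change only if the median time improves by at least `min_gain` (a fraction)."""
    if not before_seconds or not after_seconds:
        raise ValueError("need at least one timing on each side")
    baseline = median(before_seconds)
    candidate = median(after_seconds)
    if baseline <= 0:
        raise ValueError("baseline timing must be positive")
    # Relative improvement: positive means the candidate is faster.
    gain = (baseline - candidate) / baseline
    return gain >= min_gain


# Single-sample timings taken from the log above.
print(should_keep([57.8], [150.6]))  # iteration 1
print(should_keep([67.0], [54.3]))   # iteration 2
```

With one sample per side this reduces to the comparison the log already makes; the benefit only appears once each side is benchmarked several times.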
diff --git a/run_self_improvement.py b/run_self_improvement.py
@@ -0,0 +1,66 @@
+#!/usr/bin/env python3
+"""
+Run Self-Improvement Loop on Codegen Repository
+
+This script continuously analyzes, improves, benchmarks, and integrates
+changes to the codebase using multi-agent orchestration.
+"""
+
+import asyncio
+import sys
+from pathlib import Path
+
+# Add src to path
+sys.path.insert(0, str(Path(__file__).parent / "src"))
+
+from codegen.orchestration import SelfImprovementLoop
+
+
+async def main():
+    """Run the self-improvement loop."""
+
+    # Infinite mode is opt-in via --infinite or -i (sys is already imported
+    # at module level, so no local import is needed).
+    infinite = "--infinite" in sys.argv or "-i" in sys.argv
+
+    print("🚀 Starting Self-Improvement Loop for Codegen Repository")
+    print("="*80)
+    print("Target: Optimize multi-agent orchestration system")
+    print("Goal: <60s per agent, >90% success rate, production-ready CI/CD loop")
+    print(f"Mode: {'INFINITE ♾️ (Ctrl+C to stop)' if infinite else 'LIMITED (3 iterations)'}")
+    print("="*80)
+
+    loop = SelfImprovementLoop(
+        repo_path=".",
+        target_files=["src/codegen/orchestration.py"]
+    )
+
+    # Run infinitely if --infinite flag, otherwise 3 iterations
+    results = await loop.run_improvement_cycle(max_iterations=None if infinite else 3)
+
+    print("\n\n" + "="*80)
+    print("📊 FINAL RESULTS")
+    print("="*80)
+
+    print(f"\nIterations completed: {len(results['iterations'])}")
+    print(f"Improvements applied: {len(results['improvements_applied'])}")
+
+    if results['improvements_applied']:
+        print("\n✅ Applied improvements:")
+        for improvement in results['improvements_applied']:
+            print(f"  - {improvement}")
+
+    print("\n📈 Performance Metrics:")
+    for metric in results['metrics']:
+        print(f"\n  Iteration {metric['iteration']}:")
+        print(f"    Time: {metric['execution_time_seconds']:.1f}s")
+        print(f"    Success Rate: {metric['agent_success_rate']:.0%}")
+        print(f"    Quality Score: {metric['response_quality_score']}/10")
+
+    print("\n" + "="*80)
+    print("✅ Self-Improvement Loop Complete!")
+    print("="*80)
+
+
+if __name__ == "__main__":
+    asyncio.run(main())
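Reviewer sketch for the flag handling in `main()`: the script tests `sys.argv` by membership, so a mistyped flag such as `--infinte` silently falls back to 3 iterations. A minimal alternative using the standard-library `argparse`, which rejects unknown flags; the `--max-iterations` option is a hypothetical addition, not something the script currently accepts:

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse command-line flags; argv=None means read from sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Run the self-improvement loop.")
    parser.add_argument(
        "-i", "--infinite", action="store_true",
        help="run until interrupted (Ctrl+C)",
    )
    parser.add_argument(
        "--max-iterations", type=int, default=3,
        help="iteration cap when --infinite is not set (hypothetical flag)",
    )
    return parser.parse_args(argv)


def resolve_max_iterations(args: argparse.Namespace) -> Optional[int]:
    """Mirror the script's convention: None means no iteration cap."""
    return None if args.infinite else args.max_iterations


print(resolve_max_iterations(parse_args([])))
print(resolve_max_iterations(parse_args(["-i"])))
print(resolve_max_iterations(parse_args(["--max-iterations", "5"])))
```

The resolved value could be passed straight to `run_improvement_cycle(max_iterations=...)`, which the script already calls with `None` or `3`.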