Add special codegen outline benchmark #16
Draft
Overview
This PR implements the special codegen benchmark requested in CG-18529. The benchmark uses the provided codegen.com Dockerfile as a base image and tests the outline repository workflow with E2B and Daytona providers.
Features Added
🚀 New CLI Command
🔧 Comprehensive Benchmark Workflow
https://github.com/codegen-sh/outline.git
📊 Rich Reporting
🐳 Custom Dockerfile
benchmarks/dockerfiles/codegen-base.dockerfile with the provided Dockerfile
📚 Documentation
benchmarks/CODEGEN_BENCHMARK.md
Usage Examples
Key Implementation Details
Files Added/Modified
New Files
grainchain/cli/codegen_benchmark.py - Core benchmark implementation
benchmarks/dockerfiles/codegen-base.dockerfile - Codegen base image
benchmarks/configs/codegen_outline.json - Benchmark configuration
benchmarks/CODEGEN_BENCHMARK.md - Comprehensive documentation
Modified Files
grainchain/cli/main.py - Added --codegen flag
grainchain/cli/benchmark.py - Added codegen benchmark routing
README.md - Updated with new benchmark examples
Testing
✅ CLI Integration: New --codegen flag works correctly
✅ Backward Compatibility: Existing benchmarks still work
✅ Error Handling: Graceful failure when API keys are missing
✅ Report Generation: JSON and Markdown reports generated correctly
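The last two checks can be illustrated with a minimal sketch. It is not the code in grainchain/cli/codegen_benchmark.py. The function names, the lowercase provider identifiers, and the result fields are all hypothetical, and only the two environment variable names come from this PR. The sketch assumes that a provider with a missing key is marked as skipped rather than raising, and that the same results are rendered as both JSON and Markdown.

```python
import json
import os

# Provider -> environment variable. The variable names come from this PR;
# the lowercase provider identifiers are assumptions for illustration.
REQUIRED_KEYS = {"e2b": "E2B_API_KEY", "daytona": "DAYTONA_API_KEY"}


def check_provider(provider, env=None):
    """Return a result dict instead of raising when the API key is missing."""
    env = os.environ if env is None else env
    var = REQUIRED_KEYS[provider]
    if not env.get(var):
        return {"provider": provider, "status": "skipped",
                "reason": f"{var} is not set"}
    return {"provider": provider, "status": "ready", "reason": ""}


def render_reports(results):
    """Render the same results as a JSON string and a Markdown table."""
    as_json = json.dumps({"results": results}, indent=2)
    rows = ["| Provider | Status | Reason |", "| --- | --- | --- |"]
    for r in results:
        rows.append(f"| {r['provider']} | {r['status']} | {r['reason']} |")
    return as_json, "\n".join(rows)


if __name__ == "__main__":
    results = [check_provider(p) for p in REQUIRED_KEYS]
    report_json, report_md = render_reports(results)
    print(report_md)
```

Passing `env` explicitly keeps the key check testable without touching the real environment. A real run would replace the "ready" placeholder with actual benchmark measurements.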
Requirements
E2B API key (E2B_API_KEY environment variable)
Daytona API key (DAYTONA_API_KEY environment variable)
Future Enhancements
This implementation provides a solid foundation for:
Resolves CG-18529