
Commit 89ee9ff

🏃‍♂️ Add auto-generated small benchmarks on merge

- Create merge-benchmark.yml workflow that triggers on PR merges
- Run small benchmarks with 10 iterations per provider
- Add merge-small.json config for lightweight benchmark settings
- Update comprehensive benchmark workflow naming for clarity
- Document dual benchmark system in README
- Distinguish between small (merge-triggered) and comprehensive (manual/scheduled) benchmarks

1 parent 6868182 commit 89ee9ff

4 files changed, +187 -7 lines changed


.github/workflows/benchmark.yml

Lines changed: 6 additions & 6 deletions
@@ -1,11 +1,11 @@
-name: Outline Benchmarks
+name: Comprehensive Benchmarks (Manual/Scheduled)
 
 on:
   schedule:
-    # Run benchmarks daily at 2 AM UTC
+    # Run comprehensive benchmarks daily at 2 AM UTC
     - cron: '0 2 * * *'
   workflow_dispatch:
-    # Allow manual triggering
+    # Allow manual triggering for high-iteration benchmarks
   push:
     branches:
       - main
@@ -14,7 +14,7 @@ on:
       - '.github/workflows/benchmark.yml'
 
 jobs:
-  benchmark:
+  comprehensive-benchmark:
     runs-on: ubuntu-latest
 
     steps:
@@ -59,7 +59,7 @@ jobs:
         run: |
           git add benchmarks/results/
           if ! git diff --cached --quiet; then
-            git commit -m "📊 Automated benchmark results - $(date '+%Y-%m-%d %H:%M:%S')"
+            git commit -m "📊 Comprehensive benchmark results (scheduled) - $(date '+%Y-%m-%d %H:%M:%S')"
             git push
           else
             echo "No new benchmark results to commit"
@@ -71,6 +71,6 @@ jobs:
         uses: actions/upload-artifact@v4
         if: always()
         with:
-          name: benchmark-results
+          name: comprehensive-benchmark-results
           path: benchmarks/results/
           retention-days: 30
Lines changed: 117 additions & 0 deletions
@@ -0,0 +1,117 @@
+name: Small Benchmarks (Auto on Merge)
+
+on:
+  pull_request:
+    types: [closed]
+    branches:
+      - main
+  workflow_dispatch:
+    # Allow manual triggering for testing
+
+jobs:
+  small-benchmark:
+    # Only run on merged PRs, not just closed ones
+    if: github.event.pull_request.merged == true || github.event_name == 'workflow_dispatch'
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v4
+        with:
+          token: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: '3.12'
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v3
+        with:
+          version: "latest"
+
+      - name: Install dependencies
+        run: |
+          uv sync --all-extras
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Configure Git
+        run: |
+          git config --global user.name "Benchmark Bot"
+          git config --global user.email "[email protected]"
+
+      - name: Run small benchmarks (10 iterations)
+        run: |
+          uv run python benchmarks/scripts/grainchain_benchmark.py --config benchmarks/configs/merge-small.json
+        env:
+          DOCKER_HOST: unix:///var/run/docker.sock
+
+      - name: Generate summary report
+        run: |
+          uv run python benchmarks/scripts/auto_publish.py --generate-summary
+        continue-on-error: true
+
+      - name: Commit and push results
+        run: |
+          git add benchmarks/results/
+          if ! git diff --cached --quiet; then
+            git commit -m "📊 Small benchmark results (merge-triggered) - $(date '+%Y-%m-%d %H:%M:%S')"
+            git push
+          else
+            echo "No new benchmark results to commit"
+          fi
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Upload benchmark artifacts
+        uses: actions/upload-artifact@v4
+        if: always()
+        with:
+          name: small-benchmark-results
+          path: benchmarks/results/
+          retention-days: 30
+
+      - name: Comment on PR with results
+        if: github.event_name == 'pull_request'
+        uses: actions/github-script@v7
+        with:
+          script: |
+            const fs = require('fs');
+            const path = require('path');
+
+            // Find the latest benchmark result file
+            const resultsDir = 'benchmarks/results';
+            if (fs.existsSync(resultsDir)) {
+              const files = fs.readdirSync(resultsDir)
+                .filter(f => f.endsWith('.json') && f.includes('grainchain_benchmark'))
+                .sort()
+                .reverse();
+
+              if (files.length > 0) {
+                const latestFile = files[0];
+                const results = JSON.parse(fs.readFileSync(path.join(resultsDir, latestFile), 'utf8'));
+
+                let comment = '## 🏃‍♂️ Small Benchmark Results (10 iterations)\n\n';
+                comment += `**Benchmark completed:** ${results.metadata?.timestamp || 'Unknown'}\n\n`;
+
+                if (results.summary?.provider_comparison) {
+                  comment += '### Provider Performance Summary\n\n';
+                  for (const [provider, metrics] of Object.entries(results.summary.provider_comparison)) {
+                    comment += `**${provider.toUpperCase()}:**\n`;
+                    comment += `- Success Rate: ${(metrics.success_rate || 0).toFixed(1)}%\n`;
+                    comment += `- Avg Duration: ${(metrics.avg_duration || 0).toFixed(2)}s\n\n`;
+                  }
+                }
+
+                comment += `\n📊 [View detailed results](${context.payload.repository.html_url}/actions/runs/${context.runId})`;
+
+                github.rest.issues.createComment({
+                  issue_number: context.issue.number,
+                  owner: context.repo.owner,
+                  repo: context.repo.repo,
+                  body: comment
+                });
+              }
+            }
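
Reviewer note: the comment-building logic in the `github-script` step can be exercised outside of Actions. The sketch below pulls it into two plain functions so it can be run locally with Node. The function names (`buildBenchmarkComment`, `pickLatestResult`), the `runUrl` parameter, and the sample data are hypothetical and not part of the commit; only the result fields (`metadata.timestamp`, `summary.provider_comparison`, `success_rate`, `avg_duration`) come from the workflow script. The emoji from the original headings are omitted for brevity.

```javascript
// Hypothetical helper (not in the commit): formats a parsed benchmark result
// object into a Markdown comment body, using the same fields as the workflow.
function buildBenchmarkComment(results, runUrl) {
  let comment = '## Small Benchmark Results (10 iterations)\n\n';
  comment += `**Benchmark completed:** ${results.metadata?.timestamp || 'Unknown'}\n\n`;

  const comparison = results.summary?.provider_comparison;
  if (comparison) {
    comment += '### Provider Performance Summary\n\n';
    for (const [provider, metrics] of Object.entries(comparison)) {
      comment += `**${provider.toUpperCase()}:**\n`;
      comment += `- Success Rate: ${(metrics.success_rate || 0).toFixed(1)}%\n`;
      comment += `- Avg Duration: ${(metrics.avg_duration || 0).toFixed(2)}s\n\n`;
    }
  }

  // runUrl is an assumed parameter standing in for the Actions run link.
  if (runUrl) {
    comment += `\n[View detailed results](${runUrl})`;
  }
  return comment;
}

// Hypothetical helper (not in the commit): same filter/sort/reverse chain as
// the workflow. The default sort is lexicographic, so the first element is
// the newest file only if file names embed a sortable timestamp.
function pickLatestResult(fileNames) {
  return fileNames
    .filter(f => f.endsWith('.json') && f.includes('grainchain_benchmark'))
    .sort()
    .reverse()[0];
}

// Example with made-up data.
const sample = {
  metadata: { timestamp: '2024-01-01T02:00:00' },
  summary: { provider_comparison: { local: { success_rate: 100, avg_duration: 1.234 } } },
};
console.log(buildBenchmarkComment(sample, 'https://example.com/actions/runs/1'));
```

Keeping the formatting in a pure function makes missing-field handling (for example, a result file with no `summary`) easy to check before the workflow runs on a real merge.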

README.md

Lines changed: 25 additions & 1 deletion
@@ -99,7 +99,19 @@ if not e2b_info.available:
 
 ## ⚡ Performance Benchmarks
 
-Compare sandbox providers with comprehensive performance testing:
+Grainchain features a dual benchmark system for comprehensive performance testing:
+
+### 🏃‍♂️ Automated Small Benchmarks
+- **Trigger**: Automatically run on every merge to `main`
+- **Iterations**: 10 per provider (fast execution)
+- **Purpose**: Quick regression detection and merge validation
+- **Providers**: `local`, `e2b`
+
+### 🔬 Comprehensive Benchmarks
+- **Trigger**: Manual execution or daily scheduled runs
+- **Iterations**: 3+ per provider (thorough analysis)
+- **Purpose**: Detailed performance analysis and provider comparison
+- **Providers**: `local`, `e2b`, `daytona`, `morph`
 
 ### Quick Performance Test
 
@@ -117,6 +129,18 @@ grainchain benchmark --provider local --output benchmarks/results/
 ./scripts/benchmark_status.sh
 ```
 
+### Manual Small Benchmark
+
+Run the same lightweight benchmarks that execute on merge:
+
+```bash
+# Run small benchmarks (10 iterations) manually
+grainchain benchmark --config benchmarks/configs/merge-small.json
+
+# Or with custom iterations
+grainchain benchmark --provider local e2b --iterations 10
+```
+
 ### Full Benchmark Suite
 
 Run comprehensive benchmarks across all providers:
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+{
+  "providers": ["local", "e2b"],
+  "iterations": 10,
+  "timeout": 30,
+  "parallel_tests": false,
+  "detailed_metrics": true,
+  "export_formats": ["json", "markdown"],
+  "test_scenarios": {
+    "basic_commands": {
+      "enabled": true,
+      "timeout": 10
+    },
+    "python_execution": {
+      "enabled": true,
+      "timeout": 15
+    },
+    "file_operations": {
+      "enabled": true,
+      "timeout": 20,
+      "test_files": [
+        { "name": "small.txt", "size": 100 },
+        { "name": "medium.txt", "size": 10000 }
+      ]
+    },
+    "computational_tasks": {
+      "enabled": true,
+      "timeout": 30
+    }
+  },
+  "environment": {
+    "E2B_API_KEY": "from_env",
+    "E2B_TEMPLATE": "base"
+  },
+  "reporting": {
+    "include_raw_data": false,
+    "generate_charts": false,
+    "auto_commit": true
+  }
+}
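
Reviewer note: a quick way to sanity-check a config of this shape before it runs on merge is to list the enabled scenarios and flag any whose timeout exceeds the top-level `timeout`. The sketch below is only an illustration. `summarizeConfig` is a hypothetical name, and falling back to the top-level timeout when a scenario has none is an assumption, not confirmed behaviour of `grainchain_benchmark.py`. The inline object copies the relevant keys from the config above.

```javascript
// Hypothetical helper (not in the commit): summarizes a benchmark config.
function summarizeConfig(config) {
  const scenarios = Object.entries(config.test_scenarios || {})
    .filter(([, scenario]) => scenario.enabled)
    // Assumption: a scenario without its own timeout inherits the global one.
    .map(([name, scenario]) => ({ name, timeout: scenario.timeout ?? config.timeout }));

  // Scenarios whose timeout is larger than the top-level timeout.
  const overBudget = scenarios
    .filter(s => s.timeout > config.timeout)
    .map(s => s.name);

  return {
    providers: config.providers,
    iterations: config.iterations,
    scenarios,
    overBudget,
  };
}

// Relevant keys copied from the config shown above.
const mergeSmall = {
  providers: ['local', 'e2b'],
  iterations: 10,
  timeout: 30,
  test_scenarios: {
    basic_commands: { enabled: true, timeout: 10 },
    python_execution: { enabled: true, timeout: 15 },
    file_operations: { enabled: true, timeout: 20 },
    computational_tasks: { enabled: true, timeout: 30 },
  },
};

console.log(summarizeConfig(mergeSmall));
```

For the values in this commit, all four scenarios are enabled and none exceeds the 30-second top-level timeout, so `overBudget` comes back empty.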
