Commit a5a7882

chore(evals): refresh personas CSV/MD after BoN=2, o3-mini critic run
1 parent ffcc715 commit a5a7882

2 files changed, with 20 additions and 10 deletions.

reports/personas.csv

Lines changed: 10 additions & 5 deletions
@@ -1,6 +1,11 @@
 scenario_id,count,avg_critic_score,avg_overlap,avg_latency_ms
-seed_ceo_focus,1,9.0,0.0,12843.744039535522
-series_a_ceo_retention,1,9.0,0.0,14227.832794189453
-consumer_app_activation,1,10.0,0.0,22965.211868286133
-enterprise_pipeline,1,9.0,0.0,11418.672800064087
-pricing_reframe,1,9.0,0.0,10158.666133880615
+seed_ceo_focus,1,9.0,0.0,60.81581115722656
+series_a_ceo_retention,1,9.0,0.0,13.698101043701172
+consumer_app_activation,1,10.0,0.0,14.04881477355957
+enterprise_pipeline,1,9.0,0.0,14.264106750488281
+pricing_reframe,1,9.0,0.0,13.95106315612793
+devtools_oss_adoption,1,9.0,0.0,12239.256143569946
+fintech_compliance_blocker,1,9.0,0.0,11795.646905899048
+healthcare_baa_go_to_market,1,9.0,0.0,10059.163093566895
+marketplace_cold_start,1,9.0,0.0,18546.04697227478
+ml_infra_pilot_to_contract,1,9.0,0.0,10304.81505393982

reports/personas.md

Lines changed: 10 additions & 5 deletions
@@ -1,7 +1,12 @@
 | Scenario | Count | Avg Score | Overlap | Latency (ms) |
 |---|---:|---:|---:|---:|
-| seed_ceo_focus | 1 | 9.00 | 0.00 | 12844 |
-| series_a_ceo_retention | 1 | 9.00 | 0.00 | 14228 |
-| consumer_app_activation | 1 | 10.00 | 0.00 | 22965 |
-| enterprise_pipeline | 1 | 9.00 | 0.00 | 11419 |
-| pricing_reframe | 1 | 9.00 | 0.00 | 10159 |
+| seed_ceo_focus | 1 | 9.00 | 0.00 | 61 |
+| series_a_ceo_retention | 1 | 9.00 | 0.00 | 14 |
+| consumer_app_activation | 1 | 10.00 | 0.00 | 14 |
+| enterprise_pipeline | 1 | 9.00 | 0.00 | 14 |
+| pricing_reframe | 1 | 9.00 | 0.00 | 14 |
+| devtools_oss_adoption | 1 | 9.00 | 0.00 | 12239 |
+| fintech_compliance_blocker | 1 | 9.00 | 0.00 | 11796 |
+| healthcare_baa_go_to_market | 1 | 9.00 | 0.00 | 10059 |
+| marketplace_cold_start | 1 | 9.00 | 0.00 | 18546 |
+| ml_infra_pilot_to_contract | 1 | 9.00 | 0.00 | 10305 |
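
The rows in reports/personas.md look like a formatted view of reports/personas.csv: scores and overlap at two decimals, latency rounded to whole milliseconds. The sketch below shows one way that mapping could be reproduced. It is an illustration only: the function name `csv_to_markdown` is hypothetical, and the repository's actual report generator is not shown in this commit.

```python
import csv
import io


def csv_to_markdown(csv_text: str) -> str:
    """Render personas-style CSV text as a Markdown table (hypothetical helper)."""
    rows = csv.DictReader(io.StringIO(csv_text))
    lines = [
        "| Scenario | Count | Avg Score | Overlap | Latency (ms) |",
        "|---|---:|---:|---:|---:|",
    ]
    for row in rows:
        lines.append(
            "| {} | {} | {:.2f} | {:.2f} | {:.0f} |".format(
                row["scenario_id"],
                int(row["count"]),
                float(row["avg_critic_score"]),
                float(row["avg_overlap"]),
                float(row["avg_latency_ms"]),  # rounded to whole milliseconds
            )
        )
    return "\n".join(lines)


# Two rows copied from the CSV diff in this commit.
sample = """scenario_id,count,avg_critic_score,avg_overlap,avg_latency_ms
seed_ceo_focus,1,9.0,0.0,60.81581115722656
devtools_oss_adoption,1,9.0,0.0,12239.256143569946
"""

print(csv_to_markdown(sample))
```

Running it on those two rows yields the same `61` and `12239` latency cells that appear in the Markdown diff.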
