Commit 592c289

add tasks.yml for microsoft/phi-4 models (#18)
* add tasks for phi-4 models
* fix task/metric names
* correct gsm8k metric name
1 parent 2507a89 commit 592c289

4 files changed, 120 additions and 0 deletions

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+tasks:
+- name: arc_challenge
+  metrics:
+  - name: acc_norm,none
+    value: 0.6425
+
+- name: gsm8k
+  metrics:
+  - name: exact_match,strict-match
+    value: 0.9067
+
+- name: hellaswag
+  metrics:
+  - name: acc_norm,none
+    value: 0.8419
+
+- name: mmlu
+  metrics:
+  - name: acc,none
+    value: 0.803
+
+- name: truthfulqa_mc2
+  metrics:
+  - name: acc,none
+    value: 0.5954
+
+- name: winogrande
+  metrics:
+  - name: acc,none
+    value: 0.7987

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+tasks:
+- name: arc_challenge
+  metrics:
+  - name: acc_norm,none
+    value: 0.6288
+
+- name: gsm8k
+  metrics:
+  - name: exact_match,strict-match
+    value: 0.8969
+
+- name: hellaswag
+  metrics:
+  - name: acc_norm,none
+    value: 0.8342
+
+- name: mmlu
+  metrics:
+  - name: acc,none
+    value: 0.7987
+
+- name: truthfulqa_mc2
+  metrics:
+  - name: acc,none
+    value: 0.5918
+
+- name: winogrande
+  metrics:
+  - name: acc,none
+    value: 0.8074

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+tasks:
+- name: arc_challenge
+  metrics:
+  - name: acc_norm,none
+    value: 0.6433
+
+- name: gsm8k
+  metrics:
+  - name: exact_match,strict-match
+    value: 0.903
+
+- name: hellaswag
+  metrics:
+  - name: acc_norm,none
+    value: 0.843
+
+- name: mmlu
+  metrics:
+  - name: acc,none
+    value: 0.8039
+
+- name: truthfulqa_mc2
+  metrics:
+  - name: acc,none
+    value: 0.5882
+
+- name: winogrande
+  metrics:
+  - name: acc,none
+    value: 0.7995

microsoft/phi-4/accuracy/tasks.yml

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+tasks:
+- name: arc_challenge
+  metrics:
+  - name: acc_norm,none
+    value: 0.6442
+
+- name: gsm8k
+  metrics:
+  - name: exact_match,strict-match
+    value: 0.9007
+
+- name: hellaswag
+  metrics:
+  - name: acc_norm,none
+    value: 0.8437
+
+- name: mmlu
+  metrics:
+  - name: acc,none
+    value: 0.803
+
+- name: truthfulqa_mc2
+  metrics:
+  - name: acc,none
+    value: 0.5937
+
+- name: winogrande
+  metrics:
+  - name: acc,none
+    value: 0.8058
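
The commit does not show how these `tasks.yml` files are consumed. As a rough sketch only, assuming they act as reference accuracy values that a fresh evaluation run is compared against, a check could look like the following. The function name `find_mismatches`, the 5% relative tolerance, and the `task -> metric -> value` shape of the measured results are all assumptions made for illustration, not something taken from the repository. `EXPECTED` mirrors the contents of `microsoft/phi-4/accuracy/tasks.yml` as a YAML parser would load it.

```python
import math
from typing import Dict, List, Optional, Tuple

# Values copied from microsoft/phi-4/accuracy/tasks.yml, written as the
# structure a YAML parser would produce for that file.
EXPECTED = {
    "tasks": [
        {"name": "arc_challenge", "metrics": [{"name": "acc_norm,none", "value": 0.6442}]},
        {"name": "gsm8k", "metrics": [{"name": "exact_match,strict-match", "value": 0.9007}]},
        {"name": "hellaswag", "metrics": [{"name": "acc_norm,none", "value": 0.8437}]},
        {"name": "mmlu", "metrics": [{"name": "acc,none", "value": 0.803}]},
        {"name": "truthfulqa_mc2", "metrics": [{"name": "acc,none", "value": 0.5937}]},
        {"name": "winogrande", "metrics": [{"name": "acc,none", "value": 0.8058}]},
    ]
}

# Hypothetical shape for measured results: task name -> metric name -> value.
Measured = Dict[str, Dict[str, float]]
Mismatch = Tuple[str, str, float, Optional[float]]


def find_mismatches(expected: dict, measured: Measured, rel_tol: float = 0.05) -> List[Mismatch]:
    """Return (task, metric, expected, measured) for each metric that is
    missing from `measured` or differs from the reference by more than rel_tol."""
    mismatches: List[Mismatch] = []
    for task in expected["tasks"]:
        for metric in task["metrics"]:
            got = measured.get(task["name"], {}).get(metric["name"])
            # The tolerance is an illustrative choice, not a value from the commit.
            if got is None or not math.isclose(got, metric["value"], rel_tol=rel_tol):
                mismatches.append((task["name"], metric["name"], metric["value"], got))
    return mismatches
```

Calling `find_mismatches(EXPECTED, measured)` with the results of a new run returns an empty list when every metric is present and within tolerance. A missing task or metric is reported with `None` as the measured value, so an incomplete run is not mistaken for a passing one.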
