Commit 239b09b

Updated the evaluation type from model-graded to includes for two of the four evals.
1 parent 6c76796 commit 239b09b

3 files changed: 9 additions and 10 deletions

evals/registry/data/quran_eval/gen_script/main.py

Lines changed: 2 additions & 1 deletion
@@ -84,9 +84,10 @@ def generate_bilingual_questions(ayas_df, question_type):
             ideal_answer_ar = [row['name'], row['transliteration'], row['translation']]
 
         elif question_type == "surah_type":
-            question_content_en = f"Determine if the Surah of the following Quranic aya text is meccan or madinan: {row['text']} answer only with either 'meccan' or 'madinan' (exactly in small case)."
+            question_content_en = f"Determine if the Surah of the following Quranic aya text is meccan or medinan: {row['text']} answer only with either 'meccan' or 'medinan' (exactly in small case)."
             question_content_ar = f"حدد إذا كانت السورة للنص القرآني التالي مكية أو مدنية: {row['text']} أجب فقط بـ 'مكية' أو 'مدنية' (بدون تشكيل)."
             answer_arabic_translations = ['مكية', 'مكي', 'مكة'] if row['type'] == 'meccan' else ['مدنية', 'مدني', 'المدينة']
+            answer_english_translations = ['meccan', 'meccan', 'mecca', "maccan"] if row['type'] == 'meccan' else ['madinan', 'medinan', 'madina']
             all_answers = [row['type']] + answer_arabic_translations
             ideal_answer = all_answers
             ideal_answer_ar = all_answers
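
In the hunk above, `answer_english_translations` is assigned, but `all_answers` is still built only from `row['type']` and the Arabic list, so as written the English variants never reach `ideal_answer`. The 'meccan' entry is also listed twice. A minimal sketch of one way to merge and deduplicate the variants, assuming that was the intent; `build_surah_type_answers` is a hypothetical helper and not part of this commit:

```python
from __future__ import annotations


def build_surah_type_answers(surah_type: str) -> list[str]:
    """Collect accepted answers for a surah-type question (hypothetical helper)."""
    if surah_type == "meccan":
        arabic = ["مكية", "مكي", "مكة"]
        english = ["meccan", "mecca", "maccan"]
    else:
        arabic = ["مدنية", "مدني", "المدينة"]
        english = ["madinan", "medinan", "madina"]
    # dict.fromkeys keeps first-seen order while dropping duplicates,
    # so the canonical label from the data stays first in the list.
    return list(dict.fromkeys([surah_type, *english, *arabic]))


if __name__ == "__main__":
    print(build_surah_type_answers("meccan"))
```

Keeping the label from the data first means the canonical answer still leads the list, with spelling variants and Arabic forms following it.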
Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:725dbd3afa688a7cedbc6c7a5b65755ae9206005a4f46f9370b43792620d33b7
+oid sha256:50c10be59d2b0766a577b82da112f1a0f088f5cdb6531d366bec88140931c45b
 size 195173

evals/registry/evals/quran_eval.yaml

Lines changed: 6 additions & 8 deletions
@@ -11,25 +11,23 @@ guess_quran_surah_name.dev.v0:
 
 guess_quran_surah_type:
   id: guess_quran_surah_type.dev.v0
-  description: Tests the model's ability to guess the type of a Quranic Surah (chapter) for a given verse (Aya) (e.g. Meccan or Medinan)
+  description: Tests the model's ability to guess the type of a Quranic Surah (chapter) for a given verse (Aya) (e.g., Meccan or Medinan)
   metrics: [accuracy]
 guess_quran_surah_type.dev.v0:
-  class: evals.elsuite.modelgraded.classify:ModelBasedClassify
+  class: evals.elsuite.basic.includes:Includes
   args:
     samples_jsonl: quran_eval/guess_quran_surah_type.jsonl
-    eval_type: cot_classify
-    modelgraded_spec: simple_fact
+    ignore_case: true
+
 
 guess_which_text_is_from_quran:
   id: guess_which_text_is_from_quran.dev.v0
   description: Tests the model's ability to guess which text is from the Quran.
   metrics: [accuracy]
 guess_which_text_is_from_quran.dev.v0:
-  class: evals.elsuite.modelgraded.classify:ModelBasedClassify
+  class: evals.elsuite.basic.includes:Includes
   args:
     samples_jsonl: quran_eval/guess_which_text_is_from_quran.jsonl
-    eval_type: cot_classify
-    modelgraded_spec: simple_fact
 
 masked_quranic_text:
   id: masked_quranic_text.dev.v0
@@ -40,4 +38,4 @@ masked_quranic_text.dev.v0:
   args:
     samples_jsonl: quran_eval/masked_quranic_text.jsonl
     eval_type: cot_classify
-    modelgraded_spec: simple_fact
+    modelgraded_spec: simple_fact
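
The YAML change swaps a model-graded classifier for a substring-style check on two evals, and only `guess_quran_surah_type.dev.v0` sets `ignore_case: true`. As a rough illustration of what that kind of check does, here is a minimal sketch. It is not the `evals` library's actual implementation; `includes_match` and its parameters are hypothetical names used only for this example:

```python
from __future__ import annotations


def includes_match(completion: str, ideals: list[str], ignore_case: bool = False) -> bool:
    """Return True if any accepted answer appears as a substring of the completion.

    Hypothetical stand-in for an includes-style grader, for illustration only.
    """
    if ignore_case:
        completion = completion.lower()
        ideals = [ideal.lower() for ideal in ideals]
    return any(ideal in completion for ideal in ideals)


if __name__ == "__main__":
    reply = "The surah is Meccan."
    print(includes_match(reply, ["meccan"], ignore_case=True))
    print(includes_match(reply, ["meccan"], ignore_case=False))
```

Under this sketch, a reply that capitalizes the label only passes when case is ignored, which may be why the surah-type eval sets the flag while asking for lowercase output.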
