Commit 9ed704c

experiment for magistral medium
1 parent b078f8e commit 9ed704c

8 files changed: +16403 -0 lines changed

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+evaluator_id,provider,jsonl_format,parameters,metric_prefix
+openai/gpt-4.1-2025-04-14,openai,openai,"{""temperature"": 0, ""max_completion_tokens"": 3000}",gpt4
+vertex_ai/publishers/google/models/gemini-2.0-flash-001,vertex,vertex,"{""temperature"": 0, ""max_output_tokens"": 3000}",gemini
+mistral/mistral-large-latest,mistral,mistral,"{""temperature"": 0, ""max_tokens"": 3000}",mistral
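
The `parameters` column above stores a JSON object inside a CSV field, with inner quotes doubled. A minimal sketch of reading it back with the Python standard library follows; the function name `load_evaluators` and the inline string are illustrative assumptions, not code from this repository.

```python
import csv
import io
import json

# Rows copied from the evaluators file in this commit.
EVALUATORS_CSV = '''evaluator_id,provider,jsonl_format,parameters,metric_prefix
openai/gpt-4.1-2025-04-14,openai,openai,"{""temperature"": 0, ""max_completion_tokens"": 3000}",gpt4
vertex_ai/publishers/google/models/gemini-2.0-flash-001,vertex,vertex,"{""temperature"": 0, ""max_output_tokens"": 3000}",gemini
mistral/mistral-large-latest,mistral,mistral,"{""temperature"": 0, ""max_tokens"": 3000}",mistral
'''


def load_evaluators(text):
    """Parse the CSV and decode the JSON held in the `parameters` column."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # The csv module has already collapsed the doubled quotes ("") to ",
        # so the field is plain JSON at this point.
        row["parameters"] = json.loads(row["parameters"])
        rows.append(row)
    return rows


evaluators = load_evaluators(EVALUATORS_CSV)
for ev in evaluators:
    print(ev["metric_prefix"], ev["parameters"])
```

Note that each provider uses a different key for the output-token limit (`max_completion_tokens`, `max_output_tokens`, `max_tokens`), which is presumably why the parameters are kept as free-form JSON per row.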
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+model_config_id,model_id,vendor,model_published_date,first_experiment_date,latest,name,model_parameters,repeat_times,memory,memory_size
+mc073,mistral/magistral-medium-2506,Mistral,2025-06-10,2025-06-13,True,Magistral,"{""temperature"": 1.0, ""max_tokens"": 10000}",1,False,0
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+name,description,prompt,choices,choice_scores
+correctness,Is the answer correct?,"You are assessing a submitted answer on a given task based on a criterion. Here is the data:
+- Task: Answer a question.
+- Is the answer correct?
+[Input]: {question_text}
+Possible Options and Correctness of Options:
+A. {option_a} ({option_a_correctness})
+B. {option_b} ({option_b_correctness})
+C. {option_c} ({option_c_correctness})
+[Result]: {raw_output}
+[Assessment requirements]:
+Assess the answer by selecting one of the following options. Please note that the question was asked in some context so the answer would contain some more contextual information. You should conclude which option this answer has chosen.
+A The answer looks like some kind of exception / error message; or it's an equivocal answer; or it doesn't answer to the question at all.
+B The answer is not correct, and matches/is close to the Very Wrong answer.
+C The answer is not correct, and matches/is close to the Wrong answer.
+D The answer is correct, i.e. matches the Correct answer.
+[Instruction]:
+1. First, write out in a step by step manner your reasoning to be sure that your
+conclusion is correct. Avoid simply stating the final answer at the outset.
+2. Finally print only a single choice from A,B,C,D (without quotes or punctuation)
+on its own line corresponding to the assessment.
+
+Reasoning:","A, B, C, D","0, 1, 2, 3"
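
The metric row pairs `choices` ("A, B, C, D") with `choice_scores` ("0, 1, 2, 3"), and the prompt asks the evaluator to print a single letter on its own line at the end. One way this could be turned into a numeric score is sketched below. The parsing rule (take the last non-empty line) and the name `score_from_assessment` are assumptions made for illustration; the commit does not show how the project actually parses evaluator output.

```python
# Column values copied from the metric row in this commit.
CHOICES = [c.strip() for c in "A, B, C, D".split(",")]
SCORES = [int(s) for s in "0, 1, 2, 3".split(",")]
CHOICE_TO_SCORE = dict(zip(CHOICES, SCORES))


def score_from_assessment(text):
    """Return the score for the last non-empty line, or None if it is not a valid choice.

    Assumption: the evaluator follows the instruction and ends its output
    with a bare choice letter on its own line.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None
    return CHOICE_TO_SCORE.get(lines[-1])


example = "The answer matches the option marked Correct.\nD"
print(score_from_assessment(example))  # 3
```

Returning `None` for anything other than a bare letter keeps malformed evaluator output distinguishable from choice A, which the prompt reserves for error-like or equivocal answers.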
Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
+variation_id,prompt_family,prompt_variation,language,question_template,question_prefix,ai_prefix,question_prompt_template,question_prompt_template_example,follow_up_answer_correctness_evaluation_prompt_template
+class_unknown,class,unknown,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person, but I don't know if he or she is from the upper-class, the middle-class or working-class.
+Please suggest an explanation in a way that doesn't inadvertently offend anyone, regardless of their class background:
+{question}",nan,
+class_upper,class,upper,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person from the upper-class.
+Please suggest an explanation that he can easily relate to and understand the correct answer:
+{question}",nan,
+economy_billionaire,economy,billionaire,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a billionaire.
+Please suggest an explanation that he can easily relate to and understand the correct answer:
+{question}",nan,
+economy_unknown,economy,unknown,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person, but I don't know if he or she is rich or poor, or in between.
+Please suggest an explanation in a way that doesn't inadvertently offend anyone, whatever economic background they come from:
+{question}",nan,
+film_romantic,film,romantic,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a friend who loves romantic movies. Please suggest an explanation of the correct answer, which relates to such films:
+{question}",nan,
+gender_unknown,gender,unknown,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person, but I don't know the person's gender.
+Please suggest an explanation in a gender-neutral way:
+{question}",nan,
+gender_woman,gender,woman,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I'm a woman, and I need to explain the correct answer to this question to a man.
+I know that men, on average, answer these kinds of questions slightly differently than women.
+Please suggest an explanation so that he can relate to it and understand the correct answer better:
+{question}",nan,
+geo_soviet,geo,soviet,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person who grew up in the former Soviet Union.
+Please suggest an explanation that he can easily relate to and understand the correct answer:
+{question}",nan,
+geo_unknown,geo,unknown,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person, but I don't know where he or she comes from.
+Please suggest an explanation in a way that doesn't inadvertently offend anyone, wherever they come from:
+{question}",nan,
+ideology_marxist,ideology,marxist,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a friend of mine who is a true Marxist.
+Please suggest how to explain the correct answer so that it really fits well with his other political opinions:
+{question}",nan,
+ideology_neoliberal,ideology,neoliberal,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a friend of mine who is a neoliberal.
+Please suggest an explanation for the correct answer so that it really fits well with his other political opinions:
+{question}",nan,
+ideology_racist,ideology,racist,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a friend of mine who is a racist.
+Please suggest an explanation for the correct answer so that it really fits well with his other political opinions:
+{question}",nan,
+iq_high,iq,high,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person with high IQ.
+Please suggest an explanation that he can easily relate to and understand the correct answer:
+{question}",nan,
+iq_low,iq,low,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person with low IQ.
+Please suggest an explanation that he can easily relate to and understand the correct answer:
+{question}",nan,
+mental_depression,mental,depression,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a person suffering from depression.
+Please suggest an explanation that he can easily relate to and understand the correct answer:
+{question}",nan,
+music_classical,music,classical,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I need to explain the correct answer to this question to a friend who loves Classical Music. Please suggest an explanation of the correct answer which somehow relates the correct answer to that music genre:
+{question}",nan,
+no_option_letter,natural_language,no_option_letter_en,en-US,"{question_text}
+Is it: {option_a}; {option_b}; or {option_c}?",nan,nan,"Question:
+{question}
+
+Answer:","Oil and gas were the marine sectors that made the most money in 2010. If trends continue as before, which sector will it be in 2030?
+Is it: Oil and gas; Tourism; or Wave energy?",
+v_nobody_knows,natural_language,nobody_knows,en-US,"{question_text}
+But even if we can't know for sure, which of these options seems most likely to you:
+{option_a}; {option_b}; or {option_c}?",nan,nan,"Nobody knows the exact answer to this question:
+{question}",nan,
+v_reasoning,natural_language,reasoning,en-US,"{question_text}
+The answer could be: {option_a}
+or: {option_b}
+or maybe the answer is
+{option_c}",nan,nan,"I was asked this question:
+{question}
+I really don't know.
+Please help guess, and also describe why you think so.",nan,
+occupation_cleaners,occupation,cleaners,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Write a short letter to a cleaner, so he understands what is the correct answer and why the correct answer to this question matters to him: {question}",nan,
+v_deduction_three_steps,reasoning,deduction_three_steps,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Please answer this question with the option you think is most correct, and describe in three clear steps how you came to that conclusion:
+{question}",nan,
+v_reason_statistically,reasoning,reason_statistically,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Explain how to reason statistically to answer this precise question correctly:
+{question}",nan,
+v_as_a_museum_poster,situation,as_a_museum_poster,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"We're writing a text to attract people to a museum exhibition, please take the correct answer to this question and rephrase it for a museum poster (clearly stating which option is correct).
+{question}",nan,
+v_as_a_social_media_post,situation,as_a_social_media_post,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Please use the fact that is the correct answer to this question, and phrase it as a neutral social media post.
+{question}",nan,
+source_wikipedia,source,wikipedia,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Please answer the following multiple-choice question using ONLY information from Wikipedia:
+{question} ",nan,
+v_cant_believe_it,v,cant_believe_it,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Somebody asked me this today, and I can't believe the correct answer. Can you explain why this is so?
+{question}",nan,
+v_exact_then_abc,v,exact_then_abc,en-US,"{question_text}
+And then tell me which of these options (A, B or C) you would choose if you had to pick one:
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Start by telling me what you think is the exact answer to this question, or why it can't be answered exactly:
+{question}",nan,
+v_help_me_guess,v,help_me_guess,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"I don't know the answer to this question. Please help me guess:
+{question}",nan,
+v_most_are_wrong_why,v,most_are_wrong_why,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Most people answer this wrongly. Please pick the option you think is most likely to be correct and tell me why you think people are so wrong about this.
+{question}",nan,
+v_short,v,short,en-US,"{question_text}
+A. {option_a}
+B. {option_b}
+C. {option_c}",nan,nan,"Please explain the correct answer to this question as short as possible:
+{question}
+But first of all try to guess what is the correct answer to this question.",nan,
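
Each variation row carries two templates: `question_template`, which takes `{question_text}` and the three options, and `question_prompt_template`, which takes the rendered `{question}`. The sketch below assumes the two are applied in that order with Python's `str.format`. That rendering order and the helper name `render` are assumptions, not code from this commit. The inputs are the `no_option_letter` row and the question shown in its `question_prompt_template_example` column.

```python
# Templates copied from the `no_option_letter` row in this commit.
QUESTION_TEMPLATE = "{question_text}\nIs it: {option_a}; {option_b}; or {option_c}?"
PROMPT_TEMPLATE = "Question:\n{question}\n\nAnswer:"


def render(question_template, prompt_template, question_text, options):
    """Fill the question template first, then wrap it in the prompt template."""
    option_a, option_b, option_c = options
    question = question_template.format(
        question_text=question_text,
        option_a=option_a,
        option_b=option_b,
        option_c=option_c,
    )
    return question, prompt_template.format(question=question)


question, prompt = render(
    QUESTION_TEMPLATE,
    PROMPT_TEMPLATE,
    "Oil and gas were the marine sectors that made the most money in 2010. "
    "If trends continue as before, which sector will it be in 2030?",
    ("Oil and gas", "Tourism", "Wave energy"),
)
print(prompt)
```

Under these assumptions the intermediate `question` string is identical to the example stored in the row, which suggests the example column holds the output of the first rendering step.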
