description: Grade School Math -- for every problem we generate a plan, then execute and evaluate it.
defs:
  problems:
    read: ./test.jsonl
    parser: jsonl

  MAX_ITERATIONS: 50

  planning:
    function:
      problem: str
    return:
      text:
      - >
        Please generate a high-level plan for solving the following question.
        As the first step, just say what method and idea you will use to solve the question.
        You can reorganize the information in the question. Do not do the actual calculation.
        Keep your response concise and within 80 words.
        Question:
      - ${ problem }
      - "\nThe plan is:\n"
      - model: ollama/granite3.2:8b

  solve:
    function:
      plan: str
    return:
      text:
      - ${ plan }
      - >
        The plan looks good! Now, use real numbers and do the calculation. Please solve the question
        step-by-step according to the high-level plan. Give me the final answer. Make your response short.
      - "\nThe answer is:\n"
      - model: ollama/granite3.2:8b

  extract_final_answer:
    function:
      solution: str
    return:
      lastOf:
      - ${ solution }
      - Extract the result from the above solution into a JSON object with field "result" and a float as value. Remove any dollar signs or other symbols.
      - model: ollama/granite3.2:8b
        parser: json
        def: result
        spec: { "result": float }
        fallback:
          data:
            result: 0

  compare_to_ground_truth:
    function:
      result: obj
      truth: str
    return:
      lastOf:
      - data: ${ truth }
        parser:
          regex: "(.|\n)*#### (?P<answer>([0-9])*)\n*"
          spec:
            answer: str
        def: ground_truth
      - if: ${ result.result|float == ground_truth.answer|float}
        then:
          1
        else:
          0
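
  # Hedged sketch, not part of the original change: a more tolerant variant of
  # compare_to_ground_truth. It assumes (unverified for ./test.jsonl) that some
  # reference answers carry a leading minus sign or thousands separators such
  # as "1,080", which the [0-9]* pattern above would not capture in full.
  # The name compare_to_ground_truth_lenient is hypothetical.
  compare_to_ground_truth_lenient:
    function:
      result: obj
      truth: str
    return:
      lastOf:
      - data: ${ truth }
        parser:
          # Accept an optional sign, then digits and commas.
          regex: "(.|\n)*#### (?P<answer>-?[0-9][0-9,]*)\n*"
          spec:
            answer: str
        def: ground_truth
      # Strip the commas before converting, using Jinja's built-in replace filter.
      - if: ${ result.result|float == ground_truth.answer|replace(",", "")|float }
        then:
          1
        else:
          0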

text:
- for:
    problem: ${ problems }
  repeat:
    call: ${ planning }
    args:
      pdl_context: []
      problem: ${ problem.question }
  max_iterations: ${ MAX_ITERATIONS }
  def: plans
  join:
    as: array

- for:
    plan: ${ plans }
  repeat:
    call: ${ solve }
    args:
      pdl_context: []
      plan: ${ plan }
  max_iterations: ${ MAX_ITERATIONS }
  def: solutions
  join:
    as: array

- for:
    solution: ${ solutions }
  repeat:
    call: ${ extract_final_answer }
    args:
      pdl_context: []
      solution: ${ solution }
  max_iterations: ${ MAX_ITERATIONS }
  def: results
  join:
    as: array

- for:
    result: ${ results }
    problem: ${ problems[:MAX_ITERATIONS] }
  repeat:
    call: ${ compare_to_ground_truth }
    args:
      pdl_context: []
      result: ${ result }
      truth: ${ problem.answer }
  max_iterations: ${ MAX_ITERATIONS }
  def: stats
  join:
    as: array

- "\nAccuracy: ${ stats|sum / MAX_ITERATIONS * 100}% "
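# Hedged alternative, not part of the original change: if fewer than
# MAX_ITERATIONS problems end up being evaluated, dividing by the number of
# collected scores avoids understating accuracy. Assumes stats is non-empty;
# length is a standard Jinja filter.
- "\nAccuracy over evaluated problems: ${ stats|sum / stats|length * 100 }% "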