description: Grade School Math -- for every problem we generate a plan, then execute and evaluate it.
defs:
  # Each line of test.jsonl is one problem with "question" and "answer" fields.
  problems:
    read: ./test.jsonl
    parser: jsonl

  # Upper bound on how many problems each loop below processes.
  MAX_ITERATIONS: 50

  # Step 1: ask the model for a short, calculation-free plan.
  planning:
    function:
      problem: str
    return:
      text:
      - >
        Please generate a high-level plan for solving the following question.
        As the first step, just say what method and idea you will use to solve the question.
        You can reorganize the information in the question. Do not do the actual calculation.
        Keep your response concise and within 80 words.
        Question:
      - ${ problem }
      - "\nThe plan is:\n"
      - model: ollama/granite3.2:8b

  # Step 2: carry out the plan with the actual numbers.
  solve:
    function:
      plan: str
    return:
      text:
      - ${ plan }
      - >
        The plan looks good! Now, use real numbers and do the calculation. Please solve the question
        step-by-step according to the high-level plan. Give me the final answer. Make your response short.
      - "\nThe answer is:\n"
      - model: ollama/granite3.2:8b

  # Step 3: have the model reduce the solution to {"result": <float>}.
  extract_final_answer:
    function:
      solution: str
    return:
      lastOf:
      - ${ solution }
      - Extract the result from the above solution into a JSON object with field "result" and a float as value. Remove any dollar signs or other symbols.
      - model: ollama/granite3.2:8b
        parser: json
        def: result
        # If the model output is not valid JSON, fall back to a result of 0.
        fallback:
          data:
            result: 0
      - if: ${ result[0] } # result was a list accidentally
        then:
          ${ result[0] }
        fallback:
          ${ result }

  # Step 4: score 1 if the extracted result equals the ground truth, else 0.
  compare_to_ground_truth:
    function:
      result: obj
      truth: str
    return:
      lastOf:
      # The ground-truth answer is the run of digits that follows "#### ".
      - data: ${ truth }
        parser:
          regex: "(.|\n)*#### (?P<answer>([0-9])*)\n*"
          spec:
            answer: str
        def: ground_truth
      - if: ${ result.result|float == ground_truth.answer|float}
        then:
          1
        else:
          0

text:
- for:
    problem: ${ problems }
  repeat:
    call: ${ planning }
    args:
      pdl_context: []
      problem: ${ problem.question }
  max_iterations: ${ MAX_ITERATIONS }
  def: plans
  join:
    as: array

- for:
    plan: ${ plans }
  repeat:
    call: ${ solve }
    args:
      pdl_context: []
      plan: ${ plan }
  max_iterations: ${ MAX_ITERATIONS }
  def: solutions
  join:
    as: array

- for:
    solution: ${ solutions }
  repeat:
    call: ${ extract_final_answer }
    args:
      pdl_context: []
      solution: ${ solution }
  max_iterations: ${ MAX_ITERATIONS }
  def: results
  join:
    as: array

- for:
    result: ${ results }
    problem: ${ problems[:MAX_ITERATIONS] }
  repeat:
    call: ${ compare_to_ground_truth }
    args:
      pdl_context: []
      result: ${ result }
      truth: ${ problem.answer }
  max_iterations: ${ MAX_ITERATIONS }
  def: stats
  join:
    as: array

- "\nAccuracy: ${ stats|sum / MAX_ITERATIONS * 100}% "
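
# The accuracy line above divides by MAX_ITERATIONS, which is only exact when
# at least MAX_ITERATIONS problems were graded. A possible variant, offered as a
# sketch and not part of the original program, divides by the number of graded
# problems instead. It assumes stats is non-empty (otherwise the division fails):
#
# - "\nAccuracy: ${ stats|sum / stats|length * 100 }% "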