You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/concepts/operators.md
+26-3Lines changed: 26 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -155,6 +155,7 @@ To enable gleaning, specify:
155
155
156
156
- `validation_prompt`: Instructions for the LLM to evaluate and improve the output.
157
157
- `num_rounds`: The maximum number of refinement iterations.
158
+
- `model` (optional): The model to use for the LLM executing the validation prompt. Defaults to the model specified for this operation. **Note that if the validator LLM determines the output needs to be improved, the final output will be generated by the model specified for this operation.**
158
159
159
160
Example:
160
161
@@ -170,6 +171,30 @@ gleaning:
170
171
171
172
This approach allows for _context-aware_ validation and refinement of LLM outputs. Note that it is expensive, since it at least doubles the number of LLM calls required for each operator.
172
173
174
+
Example map operation (with a different model for the validation prompt):
175
+
176
+
```yaml
177
+
- name: extract_insights
178
+
type: map
179
+
model: gpt-4o
180
+
prompt: |
181
+
From the user log below, list 2-3 concise insights (1-2 words each) and 1-2 supporting actions per insight.
182
+
Return as a list of dictionaries with 'insight' and 'supporting_actions'.
183
+
Log: {{ input.log }}
184
+
output:
185
+
schema:
186
+
insights_summary: "string"
187
+
gleaning:
188
+
num_rounds: 2 # Will refine the output up to 2 times, if the judge LLM (gpt-4o-mini) suggests improvements
189
+
model: gpt-4o-mini
190
+
validation_prompt: |
191
+
There should be at least 2 insights, and each insight should have at least 1 supporting action.
192
+
```
193
+
194
+
!!! tip "Choosing a Different Model for Validation"
195
+
196
+
In the example above, the `gpt-4o` model is used to generate the main outputs, while the `gpt-4o-mini` model is used only for the validation and refinement steps. This means the more powerful (and expensive) model produces the final output, but a less expensive model handles the iterative validation, helping to reduce costs without sacrificing output quality.
197
+
173
198
### How Gleaning Works
174
199
175
200
Gleaning is an iterative process that refines LLM outputs using context-aware validation. Here's how it works:
@@ -178,7 +203,7 @@ Gleaning is an iterative process that refines LLM outputs using context-aware va
178
203
179
204
2. **Validation**: The validation prompt is appended to the chat thread, along with the original operation prompt and output. This is submitted to the LLM. _Note that the validation prompt doesn't need any variables, since it's appended to the chat thread._
180
205
181
-
3. **Assessment**: The LLM responds with an assessment of the output according to the validation prompt.
206
+
3. **Assessment**: The LLM responds with an assessment of the output according to the validation prompt. The model used for this step is specified by the `model` field in the `gleaning` dictionary field, or defaults to the model specified for that operation.
182
207
183
208
4. **Decision**: The system interprets the assessment:
184
209
@@ -197,6 +222,4 @@ Gleaning is an iterative process that refines LLM outputs using context-aware va
197
222
198
223
7. **Final Output**: The last refined output is returned.
199
224
200
-
This process allows for nuanced, context-aware validation and refinement of LLM outputs. It's particularly useful for complex tasks where simple rule-based validation might miss subtleties or context-dependent aspects of the output.
201
-
202
225
Note that gleaning can significantly increase the number of LLM calls for each operator, potentially doubling it at minimum. While this increases cost and latency, it can lead to higher quality outputs for complex tasks.
0 commit comments