Commit 427e8cf

psriramsnc, vipul-mittal, and zephyrzilla authored
[Enhancement] Add Self Refinement Feature (#105)
* Added Self Refinement recipe and example task
* Updated Documentation for Self Refinement recipe

---------

Co-authored-by: Vipul Mittal <118464422+vipul-mittal@users.noreply.github.com>
Co-authored-by: Surajit Dasgupta <surajit.dasgupta@servicenow.com>
1 parent 5bbd733 commit 427e8cf

6 files changed, +461 -0 lines changed

docs/features/self_refinement.md

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
# Self Refinement

> **Iterative candidate improvement with judge scores and a reflection trajectory (as a reusable subgraph)**

## Overview

Self Refinement is a workflow pattern where SyGra:

- Generates a candidate output for a given input prompt
- Uses a separate *judge* step to score and critique the candidate
- Optionally refines the candidate using the critique
- Repeats until the candidate is accepted (meets a score threshold) or a maximum number of iterations is reached

This feature is implemented as a reusable **subgraph recipe**: `sygra.recipes.self_refinement`.
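
The pattern above can be sketched in plain Python, independent of SyGra. This is a minimal illustration rather than the recipe's implementation: `self_refine` and the three `demo_*` callables are hypothetical names, and it assumes `max_iterations` counts refinement passes (not judge calls).

```python
from typing import Callable


def self_refine(
    prompt: str,
    generate: Callable[[str], str],
    judge: Callable[[str, str], dict],
    refine: Callable[[str, str, str], str],
    max_iterations: int = 2,
    score_threshold: float = 4,
) -> dict:
    """Generate, judge, and refine until accepted or out of refinement budget."""
    candidate = generate(prompt)
    trajectory = []
    iteration = 0
    while True:
        verdict = judge(prompt, candidate)
        trajectory.append(
            {"iteration": iteration, "candidate": candidate, "judge": verdict}
        )
        accepted = verdict["score"] >= score_threshold
        # Assumption: max_iterations bounds the number of refine() calls.
        if accepted or iteration >= max_iterations:
            break
        candidate = refine(prompt, candidate, verdict["critique"])
        iteration += 1
    return {
        "candidate": candidate,
        "self_refine_judge": verdict,
        "reflection_trajectory": trajectory,
    }


# Hypothetical stand-ins for the three LLM calls.
def demo_generate(prompt: str) -> str:
    return "draft"


def demo_judge(prompt: str, candidate: str) -> dict:
    score = 4 if candidate.endswith("v2") else 2
    return {"score": score, "is_good": score >= 4, "critique": "add detail"}


def demo_refine(prompt: str, candidate: str, critique: str) -> str:
    return candidate + " v2"


result = self_refine("Explain X", demo_generate, demo_judge, demo_refine)
```

Keeping the judge separate from the generator is what lets the judge run at a different temperature (or on a different model) from the candidate generator.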

---

## What you get

When used in a task, the self-refinement subgraph produces:

- `candidate`: the final accepted (or last) candidate
- `self_refine_judge`: judge output containing score, pass/fail, critique, and raw parsed output
- `reflection_trajectory`: an ordered list capturing each iteration’s candidate + judge result (useful for training / analysis)

### Example output shape

```json
{
  "candidate": "...final candidate...",
  "self_refine_judge": {
    "score": 4,
    "is_good": true,
    "critique": "...",
    "raw": {"score": 4, "is_good": true, "critique": "..."}
  },
  "reflection_trajectory": [
    {
      "iteration": 0,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 0...",
      "judge": {"score": 2, "is_good": false, "critique": "...", "raw": {"...": "..."}}
    },
    {
      "iteration": 1,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 1...",
      "judge": {"score": 4, "is_good": true, "critique": "...", "raw": {"...": "..."}}
    }
  ]
}
```
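
For analysis, a record of this shape can be reduced to a few summary numbers. The helper below is a hypothetical sketch (`summarize_trajectory` is not part of SyGra) and assumes every trajectory entry carries a numeric `judge.score`, as in the example above.

```python
def summarize_trajectory(record: dict) -> dict:
    """Collapse one output record into per-iteration scores and a net gain."""
    scores = [step["judge"]["score"] for step in record["reflection_trajectory"]]
    return {
        "iterations": len(scores),
        "scores": scores,
        # Net change between the first and the last judged candidate.
        "improvement": scores[-1] - scores[0] if scores else 0,
        "accepted": bool(record["self_refine_judge"]["is_good"]),
    }


# Trimmed-down record mirroring the example output shape.
record = {
    "candidate": "...final candidate...",
    "self_refine_judge": {"score": 4, "is_good": True, "critique": "..."},
    "reflection_trajectory": [
        {"iteration": 0, "candidate": "...", "judge": {"score": 2, "is_good": False}},
        {"iteration": 1, "candidate": "...", "judge": {"score": 4, "is_good": True}},
    ],
}

summary = summarize_trajectory(record)
```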

---

## How it works (recipe)

The recipe lives at:

- `sygra/recipes/self_refinement/graph_config.yaml`
- `sygra/recipes/self_refinement/task_executor.py`

### Nodes

- `init_self_refine` (lambda)
  - Initializes internal loop state and `reflection_trajectory`
  - Uses internal state keys prefixed with `_self_refine_*` to minimize collision risk in host graphs
- `generate_candidate` (llm)
  - Creates the initial candidate from `{prompt}`
- `judge_candidate` (llm + post-process)
  - Returns JSON with `{score, is_good, critique}`
  - Post-processor normalizes and writes:
    - `self_refine_judge`
    - `self_refine_is_good`
    - `self_refine_critique`
- `update_trajectory` (lambda)
  - Appends `{candidate, judge, iteration}` to `reflection_trajectory`
- `refine_candidate` (llm)
  - Produces an improved candidate using `{self_refine_critique}`
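
The normalization step performed after `judge_candidate` could look roughly like the sketch below. The real `SelfRefinementJudgePostProcessor` is not shown in this document, so `normalize_judge_output`, its fallback rules, and its defaults are assumptions made for illustration only.

```python
import json


def normalize_judge_output(raw_text, score_threshold: float = 4) -> dict:
    """Turn the judge's JSON reply into the three state keys listed above."""
    try:
        parsed = json.loads(raw_text)
    except (json.JSONDecodeError, TypeError):
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}

    try:
        score = float(parsed.get("score", 0))
    except (TypeError, ValueError):
        score = 0.0

    # Assumption: fall back to the threshold when is_good is missing or malformed.
    is_good = parsed.get("is_good")
    if not isinstance(is_good, bool):
        is_good = score >= score_threshold

    critique = str(parsed.get("critique", ""))
    judge = {"score": score, "is_good": is_good, "critique": critique, "raw": parsed}
    return {
        "self_refine_judge": judge,
        "self_refine_is_good": is_good,
        "self_refine_critique": critique,
    }


good = normalize_judge_output('{"score": 4, "is_good": true, "critique": "ok"}')
bad = normalize_judge_output("not json at all")
```

Treating unparseable replies as a failing score means a malformed judge response triggers another refinement pass instead of silently accepting the candidate.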

### Looping behavior

A conditional edge (`SelfRefinementLoopCondition`) ends the subgraph when either:

- The judge accepts (`self_refine_is_good == True`), or
- The maximum number of refinement iterations is reached
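
As a rough sketch, the routing decision reduces to a function from state to a branch label. The `accept` and `refine` labels and the state keys come from the recipe's `graph_config.yaml`; the function itself is hypothetical, and it assumes some other node increments `_self_refine_iteration` after each refinement.

```python
def loop_condition(state: dict) -> str:
    """Return the path_map label for the next hop: 'accept' or 'refine'."""
    if state.get("self_refine_is_good") is True:
        return "accept"
    iteration = state.get("_self_refine_iteration", 0)
    max_iterations = state.get("_self_refine_max_iterations", 0)
    # Out of refinement budget: stop and keep the last candidate.
    if iteration >= max_iterations:
        return "accept"
    return "refine"


accepted = loop_condition(
    {"self_refine_is_good": True, "_self_refine_iteration": 0, "_self_refine_max_iterations": 2}
)
keep_going = loop_condition(
    {"self_refine_is_good": False, "_self_refine_iteration": 1, "_self_refine_max_iterations": 2}
)
exhausted = loop_condition(
    {"self_refine_is_good": False, "_self_refine_iteration": 2, "_self_refine_max_iterations": 2}
)
```

Note that an exhausted budget also routes to `accept`, which is why `candidate` is described as the final accepted (or last) candidate.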

---

## Using self refinement as a subgraph in a task

A typical pattern is to add a `subgraph` node in your task and point it at the recipe.

The example task is located at:

- `tasks/examples/self_refinement/graph_config.yaml`
- `tasks/examples/self_refinement/input.json`

### Example task config (high level)

In `tasks/examples/self_refinement/graph_config.yaml`, the task defines a single node:

- `self_refine`:
  - `node_type: subgraph`
  - `subgraph: sygra.recipes.self_refinement`
  - Overrides recipe node config via `node_config_map` (e.g., model selection, temperatures, and loop parameters)

It also maps the key outputs into the final dataset using `output_config.output_map`:

- `candidate` from `candidate`
- `self_refine_judge` from `self_refine_judge`
- `reflection_trajectory` from `reflection_trajectory`

---

## Running the example

From the repo root:

```bash
python main.py --task examples.self_refinement --num_records 2
```

Artifacts:

- Output records are written under `tasks/examples/self_refinement/` (typically `output_<timestamp>.json`)
- Metadata is written under `tasks/examples/self_refinement/metadata/` (typically `metadata_tasks_examples_self_refinement_<timestamp>.json`)

The metadata file includes per-node execution statistics (e.g., `self_refine.generate_candidate`, `self_refine.judge_candidate`) and aggregate token/cost summaries.
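
To inspect a run, something like the following may help. It is a sketch under two assumptions that this document does not confirm: the output file is a JSON array of records, and the `<timestamp>` portion sorts chronologically as plain text. Both helper names are hypothetical.

```python
import json
from pathlib import Path


def latest_output(task_dir):
    """Return the lexicographically last output_*.json in task_dir, or None."""
    files = sorted(Path(task_dir).glob("output_*.json"))
    return files[-1] if files else None


def load_scores(path) -> list:
    """Read one output file and collect the final judge score of each record."""
    # Assumption: the file holds a JSON array of record objects.
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return [record["self_refine_judge"]["score"] for record in records]
```

For the example task, `latest_output("tasks/examples/self_refinement")` would be the natural starting point.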

---

## Customization

You can tune behavior either in the recipe or (recommended) from the task via `node_config_map`:

- `init_self_refine.params.max_iterations`
- `init_self_refine.params.score_threshold`
- `judge_candidate.model.parameters.temperature` (usually `0` for consistent scoring)
- Prompts for `generate_candidate`, `judge_candidate`, and `refine_candidate`

If you need the recipe to write outputs under different keys (to avoid collisions in a larger host graph), you can override:

- `update_trajectory.params.candidate_key`
- `update_trajectory.params.judge_key`

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ nav:
     - Structured Output: concepts/structured_output/README.md
   - Features:
     - Metadata Tracking: features/metadata_tracking.md
+    - Self Refinement: features/self_refinement.md
   - Tutorials:
     - Agent Simulation: tutorials/agent_simulation_tutorial.md
     - Agent Simulation with Tools: tutorials/agent_tool_simulation_tutorial.md
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
graph_config:
  nodes:
    init_self_refine:
      node_type: lambda
      lambda: sygra.recipes.self_refinement.task_executor.InitSelfRefinementState
      params:
        max_iterations: 2
        score_threshold: 4
      output_keys:
        - _self_refine_iteration
        - _self_refine_max_iterations
        - _self_refine_score_threshold
        - reflection_trajectory

    generate_candidate:
      node_type: llm
      output_keys: candidate
      prompt:
        - system: |
            You are a helpful assistant.
        - user: |
            {prompt}
      model:
        name: gpt-4o-mini
        parameters:
          temperature: 0.7

    judge_candidate:
      node_type: llm
      post_process: sygra.recipes.self_refinement.task_executor.SelfRefinementJudgePostProcessor
      output_keys:
        - self_refine_judge
        - self_refine_is_good
        - self_refine_critique
      prompt:
        - system: |
            You are a strict judge. Evaluate the candidate against the user's request.
            Return JSON only.
        - user: |
            User request:
            {prompt}

            Candidate:
            {candidate}

            Score the candidate from 1-5.
            If score >= 4 it is good.

            OUTPUT JSON:
            {{"score": <int>, "is_good": <true/false>, "critique": "..."}}
      model:
        name: gpt-4o-mini
        parameters:
          temperature: 0
        structured_output:
          schema:
            fields:
              score:
                type: float
                description: "judge quality score of the candidate"
              is_good:
                type: bool
                description: "Boolean value whether the candidate is good or bad"
              critique:
                type: str
                description: "Judge's critique description of the candidate"

    update_trajectory:
      node_type: lambda
      lambda: sygra.recipes.self_refinement.task_executor.UpdateSelfRefinementTrajectory
      params:
        candidate_key: candidate
        judge_key: self_refine_judge

    refine_candidate:
      node_type: llm
      output_keys: candidate
      prompt:
        - system: |
            You improve the candidate based on critique. Return only the improved candidate.
        - user: |
            User request:
            {prompt}

            Previous candidate:
            {candidate}

            Critique:
            {self_refine_critique}

            Produce an improved candidate.
      model:
        name: gpt-4o-mini
        parameters:
          temperature: 0.7

  edges:
    - from: START
      to: init_self_refine
    - from: init_self_refine
      to: generate_candidate
    - from: generate_candidate
      to: judge_candidate
    - from: judge_candidate
      to: update_trajectory
    - from: update_trajectory
      condition: sygra.recipes.self_refinement.task_executor.SelfRefinementLoopCondition
      path_map:
        accept: END
        refine: refine_candidate
    - from: refine_candidate
      to: judge_candidate
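
A note on the doubled braces in the `judge_candidate` prompt: assuming the prompt is rendered with Python `str.format`-style placeholder semantics (an assumption, since the templating engine is not shown here), `{candidate}` is substituted while `{{` and `}}` come out as literal braces. A minimal sketch with a shortened, hypothetical template:

```python
# Shortened version of the judge prompt; only the brace handling matters here.
template = (
    "Candidate:\n{candidate}\n\n"
    'OUTPUT JSON:\n{{"score": <int>, "is_good": <true/false>, "critique": "..."}}'
)

# Single braces are placeholders; doubled braces render as literal { and }.
rendered = template.format(candidate="Paris is the capital of France.")
print(rendered)
```

Without the doubling, the JSON example in the prompt would be parsed as a placeholder and rendering would fail.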
