145 changes: 145 additions & 0 deletions docs/features/self_refinement.md
@@ -0,0 +1,145 @@
# Self Refinement

> **Iterative candidate improvement with judge scores and a reflection trajectory (as a reusable subgraph)**

## Overview

Self Refinement is a workflow pattern where SyGra:

- Generates a candidate output for a given input prompt
- Uses a separate *judge* step to score and critique the candidate
- Optionally refines the candidate using the critique
- Repeats until the candidate is accepted (meets a score threshold) or a maximum number of iterations is reached

This feature is implemented as a reusable **subgraph recipe**: `sygra.recipes.self_refinement`.

---

## What you get

When used in a task, the self-refinement subgraph produces:

- `candidate`: the final accepted (or last) candidate
- `self_refine_judge`: judge output containing score, pass/fail, critique, and raw parsed output
- `reflection_trajectory`: an ordered list capturing each iteration's candidate and judge result (useful for training or analysis)

### Example output shape

```json
{
  "candidate": "...final candidate...",
  "self_refine_judge": {
    "score": 4,
    "is_good": true,
    "critique": "...",
    "raw": {"score": 4, "is_good": true, "critique": "..."}
  },
  "reflection_trajectory": [
    {
      "iteration": 0,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 0...",
      "judge": {"score": 2, "is_good": false, "critique": "...", "raw": {"...": "..."}}
    },
    {
      "iteration": 1,
      "candidate_key": "candidate",
      "candidate": "...candidate at iteration 1...",
      "judge": {"score": 4, "is_good": true, "critique": "...", "raw": {"...": "..."}}
    }
  ]
}
```

---

## How it works (recipe)

The recipe lives at:

- `sygra/recipes/self_refinement/graph_config.yaml`
- `sygra/recipes/self_refinement/task_executor.py`

### Nodes

- `init_self_refine` (lambda)
  - Initializes internal loop state and `reflection_trajectory`
  - Uses internal state keys prefixed with `_self_refine_*` to minimize collision risk in host graphs
- `generate_candidate` (llm)
  - Creates the initial candidate from `{prompt}`
- `judge_candidate` (llm + post-process)
  - Returns JSON with `{score, is_good, critique}` (see the excerpt after this list)
  - Post-processor normalizes and writes:
    - `self_refine_judge`
    - `self_refine_is_good`
    - `self_refine_critique`
- `update_trajectory` (lambda)
  - Appends `{candidate, judge, iteration}` to `reflection_trajectory`
- `refine_candidate` (llm)
  - Produces an improved candidate using `{self_refine_critique}`
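
For reference, here is how `judge_candidate` wires the post-processor and its output keys in the recipe's `graph_config.yaml` (excerpt; prompt, model, and schema omitted):

```yaml
judge_candidate:
  node_type: llm
  post_process: sygra.recipes.self_refinement.task_executor.SelfRefinementJudgePostProcessor
  output_keys:
    - self_refine_judge
    - self_refine_is_good
    - self_refine_critique
```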

### Looping behavior

A conditional edge (`SelfRefinementLoopCondition`), excerpted after this list, ends the subgraph when either:

- The judge accepts (`self_refine_is_good == True`), or
- The maximum number of refinement iterations is reached
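
In the recipe's `graph_config.yaml`, the loop is expressed as a conditional edge with a `path_map` (excerpt):

```yaml
edges:
  - from: update_trajectory
    condition: sygra.recipes.self_refinement.task_executor.SelfRefinementLoopCondition
    path_map:
      accept: END
      refine: refine_candidate
  - from: refine_candidate
    to: judge_candidate
```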

---

## Using self refinement as a subgraph in a task

A typical pattern is to add a `subgraph` node in your task and point it at the recipe.

The example task is located at:

- `tasks/examples/self_refinement/graph_config.yaml`
- `tasks/examples/self_refinement/input.json`

### Example task config (high level)

In `tasks/examples/self_refinement/graph_config.yaml`, the task defines a single node:

- `self_refine`:
  - `node_type: subgraph`
  - `subgraph: sygra.recipes.self_refinement`
  - Overrides recipe node config via `node_config_map` (e.g., model selection, temperatures, and loop parameters)

It also maps the key outputs into the final dataset using `output_config.output_map` (a sketch of the full task file follows this list):

- `candidate` from `candidate`
- `self_refine_judge` from `self_refine_judge`
- `reflection_trajectory` from `reflection_trajectory`
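
Put together, the task file looks roughly like the sketch below. The edge syntax matches the recipe's own, but the exact `output_map` entry shape (`from:`) is illustrative; defer to the shipped example file:

```yaml
graph_config:
  nodes:
    self_refine:
      node_type: subgraph
      subgraph: sygra.recipes.self_refinement
  edges:
    - from: START
      to: self_refine
    - from: self_refine
      to: END

output_config:
  output_map:
    candidate:
      from: candidate
    self_refine_judge:
      from: self_refine_judge
    reflection_trajectory:
      from: reflection_trajectory
```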

---

## Running the example

From the repo root:

```bash
python main.py --task examples.self_refinement --num_records 2
```

Artifacts:

- Output records are written under `tasks/examples/self_refinement/` (typically `output_<timestamp>.json`)
- Metadata is written under `tasks/examples/self_refinement/metadata/` (typically `metadata_tasks_examples_self_refinement_<timestamp>.json`)

The metadata file includes per-node execution statistics (e.g., `self_refine.generate_candidate`, `self_refine.judge_candidate`) and aggregate token/cost summaries.

---

## Customization

You can tune behavior either in the recipe or (recommended) from the task via `node_config_map` (a sketch follows this list):

- `init_self_refine.params.max_iterations`
- `init_self_refine.params.score_threshold`
- `judge_candidate.model.parameters.temperature` (usually `0` for consistent scoring)
- Prompts for `generate_candidate`, `judge_candidate`, and `refine_candidate`
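
A sketch of those overrides from the host task (how the dotted paths above nest under `node_config_map` is illustrative):

```yaml
self_refine:
  node_type: subgraph
  subgraph: sygra.recipes.self_refinement
  node_config_map:
    init_self_refine:
      params:
        max_iterations: 3
        score_threshold: 4
    judge_candidate:
      model:
        parameters:
          temperature: 0
```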

If you need the recipe to write outputs under different keys (to avoid collisions in a larger host graph), you can override the following, as sketched below:

- `update_trajectory.params.candidate_key`
- `update_trajectory.params.judge_key`
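
For example (the `draft*` key names are hypothetical):

```yaml
node_config_map:
  update_trajectory:
    params:
      candidate_key: draft
      judge_key: draft_judge
```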
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -83,6 +83,7 @@ nav:
- Structured Output: concepts/structured_output/README.md
- Features:
- Metadata Tracking: features/metadata_tracking.md
- Self Refinement: features/self_refinement.md
- Tutorials:
- Agent Simulation: tutorials/agent_simulation_tutorial.md
- Agent Simulation with Tools: tutorials/agent_tool_simulation_tutorial.md
112 changes: 112 additions & 0 deletions sygra/recipes/self_refinement/graph_config.yaml
@@ -0,0 +1,112 @@
graph_config:
  nodes:
    init_self_refine:
      node_type: lambda
      lambda: sygra.recipes.self_refinement.task_executor.InitSelfRefinementState
      # Loop controls: accept once the judge score meets score_threshold,
      # or stop after max_iterations refinement rounds.
      params:
        max_iterations: 2
        score_threshold: 4
      output_keys:
        - _self_refine_iteration
        - _self_refine_max_iterations
        - _self_refine_score_threshold
        - reflection_trajectory

    generate_candidate:
      node_type: llm
      output_keys: candidate
      prompt:
        - system: |
            You are a helpful assistant.
        - user: |
            {prompt}
      model:
        name: gpt-4o-mini
        parameters:
          temperature: 0.7

    judge_candidate:
      node_type: llm
      post_process: sygra.recipes.self_refinement.task_executor.SelfRefinementJudgePostProcessor
      output_keys:
        - self_refine_judge
        - self_refine_is_good
        - self_refine_critique
      prompt:
        - system: |
            You are a strict judge. Evaluate the candidate against the user's request.
            Return JSON only.
        - user: |
            User request:
            {prompt}

            Candidate:
            {candidate}

            Score the candidate from 1-5.
            If score >= 4 it is good.

            OUTPUT JSON:
            {{"score": <int>, "is_good": <true/false>, "critique": "..."}}
      model:
        name: gpt-4o-mini
        parameters:
          temperature: 0
      structured_output:
        schema:
          fields:
            score:
              type: float
              description: "Judge's quality score for the candidate"
            is_good:
              type: bool
              description: "Boolean value indicating whether the candidate is good or bad"
            critique:
              type: str
              description: "Judge's critique of the candidate"

    update_trajectory:
      node_type: lambda
      lambda: sygra.recipes.self_refinement.task_executor.UpdateSelfRefinementTrajectory
      params:
        candidate_key: candidate
        judge_key: self_refine_judge

    refine_candidate:
      node_type: llm
      output_keys: candidate
      prompt:
        - system: |
            You improve the candidate based on critique. Return only the improved candidate.
        - user: |
            User request:
            {prompt}

            Previous candidate:
            {candidate}

            Critique:
            {self_refine_critique}

            Produce an improved candidate.
      model:
        name: gpt-4o-mini
        parameters:
          temperature: 0.7

  edges:
    - from: START
      to: init_self_refine
    - from: init_self_refine
      to: generate_candidate
    - from: generate_candidate
      to: judge_candidate
    - from: judge_candidate
      to: update_trajectory
    # Loop: "accept" ends the subgraph; "refine" routes back through
    # refine_candidate, whose output is then re-judged.
    - from: update_trajectory
      condition: sygra.recipes.self_refinement.task_executor.SelfRefinementLoopCondition
      path_map:
        accept: END
        refine: refine_candidate
    - from: refine_candidate
      to: judge_candidate