-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Add RALPH-loop recipes to Copilot SDK cookbook #696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
aaronpowell
merged 13 commits into
github:main
from
tonybaloney:cookbook/ralph-loop-recipe
Feb 11, 2026
+1,405
−2
Merged
Changes from 12 commits
Commits
Show all changes
13 commits
Select commit
Hold shift + click to select a range
d8fc473
Add RALPH-loop recipe to Copilot SDK cookbook
tonybaloney 7e39d55
Apply suggestions from code review
tonybaloney bb9f63a
Update README to remove RALPH-loop reference
tonybaloney ab82acc
Address review feedback: fix event handler leak, error handling, mode…
tonybaloney 952372c
Rewrite Ralph loop recipes: split into simple vs ideal versions
tonybaloney 92df16d
Remove git commands from Ralph loop recipes
tonybaloney 1074e34
Add SDK features to all Ralph loop recipes
tonybaloney 0e61670
Remove package-lock.json from tracking
tonybaloney 3eb7efc
Use gpt-5.1-codex-mini as default model in Ralph loop recipes
tonybaloney 3b4d601
Remove package-lock.json from tracking
tonybaloney 84486c2
Apply suggestions from code review
tonybaloney ff69b80
renaming
tonybaloney 717c012
PR feedback
tonybaloney File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,260 @@ | ||
| # Ralph Loop: Autonomous AI Task Loops | ||
|
|
||
| Build autonomous coding loops where an AI agent picks tasks, implements them, validates against backpressure (tests, builds), commits, and repeats — each iteration in a fresh context window. | ||
|
|
||
| > **Runnable example:** [recipe/ralph-loop.cs](recipe/ralph-loop.cs) | ||
| > | ||
| > ```bash | ||
| > cd dotnet | ||
| > dotnet run recipe/ralph-loop.cs | ||
| > ``` | ||
|
|
||
| ## What is a Ralph Loop? | ||
|
|
||
| A [Ralph loop](https://ghuntley.com/ralph/) is an autonomous development workflow where an AI agent iterates through tasks in isolated context windows. The key insight: **state lives on disk, not in the model's context**. Each iteration starts fresh, reads the current state from files, does one task, writes results back to disk, and exits. | ||
|
|
||
| ``` | ||
| ┌─────────────────────────────────────────────────┐ | ||
| │ loop.sh │ | ||
| │ while true: │ | ||
| │ ┌─────────────────────────────────────────┐ │ | ||
| │ │ Fresh session (isolated context) │ │ | ||
| │ │ │ │ | ||
| │ │ 1. Read PROMPT.md + AGENTS.md │ │ | ||
| │ │ 2. Study specs/* and code │ │ | ||
| │ │ 3. Pick next task from plan │ │ | ||
| │ │ 4. Implement + run tests │ │ | ||
| │ │ 5. Update plan, commit, exit │ │ | ||
| │ └─────────────────────────────────────────┘ │ | ||
| │ ↻ next iteration (fresh context) │ | ||
| └─────────────────────────────────────────────────┘ | ||
| ``` | ||
|
|
||
| **Core principles:** | ||
|
|
||
| - **Fresh context per iteration**: Each loop creates a new session — no context accumulation, always in the "smart zone" | ||
| - **Disk as shared state**: `IMPLEMENTATION_PLAN.md` persists between iterations and acts as the coordination mechanism | ||
| - **Backpressure steers quality**: Tests, builds, and lints reject bad work — the agent must fix issues before committing | ||
| - **Two modes**: PLANNING (gap analysis → generate plan) and BUILDING (implement from plan) | ||
|
|
||
| ## Simple Version | ||
|
|
||
| The minimal Ralph loop — the SDK equivalent of `while :; do cat PROMPT.md | copilot ; done`: | ||
|
|
||
| ```csharp | ||
| using GitHub.Copilot.SDK; | ||
|
|
||
| var client = new CopilotClient(); | ||
| await client.StartAsync(); | ||
|
|
||
| try | ||
| { | ||
| var prompt = await File.ReadAllTextAsync("PROMPT.md"); | ||
| var maxIterations = 50; | ||
|
|
||
| for (var i = 1; i <= maxIterations; i++) | ||
| { | ||
| Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ==="); | ||
|
|
||
| // Fresh session each iteration — context isolation is the point | ||
| var session = await client.CreateSessionAsync( | ||
| new SessionConfig { Model = "gpt-5.1-codex-mini" }); | ||
| try | ||
| { | ||
| var done = new TaskCompletionSource<string>(); | ||
| session.On(evt => | ||
| { | ||
| if (evt is AssistantMessageEvent msg) | ||
| done.TrySetResult(msg.Data.Content); | ||
| }); | ||
|
|
||
| await session.SendAsync(new MessageOptions { Prompt = prompt }); | ||
| await done.Task; | ||
| } | ||
| finally | ||
| { | ||
| await session.DisposeAsync(); | ||
| } | ||
|
|
||
| Console.WriteLine($"Iteration {i} complete."); | ||
| } | ||
| } | ||
| finally | ||
| { | ||
| await client.StopAsync(); | ||
| } | ||
| ``` | ||
|
|
||
| This is all you need to get started. The prompt file tells the agent what to do; the agent reads project files, does work, commits, and exits. The loop restarts with a clean slate. | ||
|
|
||
| ## Ideal Version | ||
|
|
||
| The full Ralph pattern with planning and building modes, matching the [Ralph Playbook](https://github.com/ClaytonFarr/ralph-playbook) architecture: | ||
|
|
||
| ```csharp | ||
| using GitHub.Copilot.SDK; | ||
|
|
||
| // Parse args: dotnet run [plan] [max_iterations] | ||
| var mode = args.Contains("plan") ? "plan" : "build"; | ||
| var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _)); | ||
| var maxIterations = maxArg != null ? int.Parse(maxArg) : 50; | ||
| var promptFile = mode == "plan" ? "PROMPT_plan.md" : "PROMPT_build.md"; | ||
|
|
||
| var client = new CopilotClient(); | ||
| await client.StartAsync(); | ||
|
|
||
| Console.WriteLine(new string('━', 40)); | ||
| Console.WriteLine($"Mode: {mode}"); | ||
| Console.WriteLine($"Prompt: {promptFile}"); | ||
| Console.WriteLine($"Max: {maxIterations} iterations"); | ||
| Console.WriteLine(new string('━', 40)); | ||
|
|
||
| try | ||
| { | ||
| var prompt = await File.ReadAllTextAsync(promptFile); | ||
|
|
||
| for (var i = 1; i <= maxIterations; i++) | ||
| { | ||
| Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ==="); | ||
|
|
||
| // Fresh session — each task gets full context budget | ||
| var session = await client.CreateSessionAsync( | ||
| new SessionConfig | ||
| { | ||
| Model = "gpt-5.1-codex-mini", | ||
| // Pin the agent to the project directory | ||
| WorkingDirectory = Environment.CurrentDirectory, | ||
| // Auto-approve tool calls for unattended operation | ||
| OnPermissionRequest = (_, _) => Task.FromResult( | ||
| new PermissionRequestResult { Kind = "approved" }), | ||
| }); | ||
| try | ||
| { | ||
| var done = new TaskCompletionSource<string>(); | ||
| session.On(evt => | ||
| { | ||
| // Log tool usage for visibility | ||
| if (evt is ToolExecutionStartEvent toolStart) | ||
| Console.WriteLine($" ⚙ {toolStart.Data.ToolName}"); | ||
| else if (evt is AssistantMessageEvent msg) | ||
| done.TrySetResult(msg.Data.Content); | ||
| }); | ||
|
|
||
| await session.SendAsync(new MessageOptions { Prompt = prompt }); | ||
| await done.Task; | ||
| } | ||
| finally | ||
| { | ||
| await session.DisposeAsync(); | ||
| } | ||
|
|
||
| Console.WriteLine($"\nIteration {i} complete."); | ||
| } | ||
|
|
||
| Console.WriteLine($"\nReached max iterations: {maxIterations}"); | ||
| } | ||
| finally | ||
| { | ||
| await client.StopAsync(); | ||
| } | ||
| ``` | ||
|
|
||
| ### Required Project Files | ||
|
|
||
| The ideal version expects this file structure in your project: | ||
|
|
||
| ``` | ||
| project-root/ | ||
| ├── PROMPT_plan.md # Planning mode instructions | ||
| ├── PROMPT_build.md # Building mode instructions | ||
| ├── AGENTS.md # Operational guide (build/test commands) | ||
| ├── IMPLEMENTATION_PLAN.md # Task list (generated by planning mode) | ||
| ├── specs/ # Requirement specs (one per topic) | ||
| │ ├── auth.md | ||
| │ └── data-pipeline.md | ||
| └── src/ # Your source code | ||
| ``` | ||
|
|
||
| ### Example `PROMPT_plan.md` | ||
|
|
||
| ```markdown | ||
| 0a. Study `specs/*` to learn the application specifications. | ||
| 0b. Study IMPLEMENTATION_PLAN.md (if present) to understand the plan so far. | ||
| 0c. Study `src/` to understand existing code and shared utilities. | ||
|
|
||
| 1. Compare specs against code (gap analysis). Create or update | ||
| IMPLEMENTATION_PLAN.md as a prioritized bullet-point list of tasks | ||
| yet to be implemented. Do NOT implement anything. | ||
|
|
||
| IMPORTANT: Do NOT assume functionality is missing — search the | ||
| codebase first to confirm. Prefer updating existing utilities over | ||
| creating ad-hoc copies. | ||
| ``` | ||
|
|
||
| ### Example `PROMPT_build.md` | ||
|
|
||
| ```markdown | ||
| 0a. Study `specs/*` to learn the application specifications. | ||
| 0b. Study IMPLEMENTATION_PLAN.md. | ||
| 0c. Study `src/` for reference. | ||
|
|
||
| 1. Choose the most important item from IMPLEMENTATION_PLAN.md. Before | ||
| making changes, search the codebase (don't assume not implemented). | ||
| 2. After implementing, run the tests. If functionality is missing, add it. | ||
| 3. When you discover issues, update IMPLEMENTATION_PLAN.md immediately. | ||
| 4. When tests pass, update IMPLEMENTATION_PLAN.md, then `git add -A` | ||
| then `git commit` with a descriptive message. | ||
|
|
||
| 5. When authoring documentation, capture the why. | ||
| 6. Implement completely. No placeholders or stubs. | ||
| 7. Keep IMPLEMENTATION_PLAN.md current — future iterations depend on it. | ||
| ``` | ||
|
|
||
| ### Example `AGENTS.md` | ||
|
|
||
| Keep this brief (~60 lines). It's loaded every iteration, so bloat wastes context. | ||
|
|
||
| ```markdown | ||
| ## Build & Run | ||
|
|
||
| dotnet build | ||
|
|
||
| ## Validation | ||
|
|
||
| - Tests: `dotnet test` | ||
| - Build: `dotnet build --no-restore` | ||
| ``` | ||
|
|
||
| ## Best Practices | ||
|
|
||
| 1. **Fresh context per iteration**: Never accumulate context across iterations — that's the whole point | ||
| 2. **Disk is your database**: `IMPLEMENTATION_PLAN.md` is shared state between isolated sessions | ||
| 3. **Backpressure is essential**: Tests, builds, lints in `AGENTS.md` — the agent must pass them before committing | ||
| 4. **Start with PLANNING mode**: Generate the plan first, then switch to BUILDING | ||
| 5. **Observe and tune**: Watch early iterations, add guardrails to prompts when the agent fails in specific ways | ||
| 6. **The plan is disposable**: If the agent goes off track, delete `IMPLEMENTATION_PLAN.md` and re-plan | ||
| 7. **Keep `AGENTS.md` brief**: It's loaded every iteration — operational info only, no progress notes | ||
| 8. **Use a sandbox**: The agent runs autonomously with full tool access — isolate it | ||
| 9. **Set `WorkingDirectory`**: Pin the session to your project root so tool operations resolve paths correctly | ||
| 10. **Auto-approve permissions**: Use `OnPermissionRequest` to allow tool calls without interrupting the loop | ||
|
|
||
| ## When to Use a Ralph Loop | ||
|
|
||
| **Good for:** | ||
|
|
||
| - Implementing features from specs with test-driven validation | ||
| - Large refactors broken into many small tasks | ||
| - Unattended, long-running development with clear requirements | ||
| - Any work where backpressure (tests/builds) can verify correctness | ||
|
|
||
| **Not good for:** | ||
|
|
||
| - Tasks requiring human judgment mid-loop | ||
| - One-shot operations that don't benefit from iteration | ||
| - Vague requirements without testable acceptance criteria | ||
| - Exploratory prototyping where direction isn't clear | ||
|
|
||
| ## See Also | ||
|
|
||
| - [Error Handling](error-handling.md) — timeout patterns and graceful shutdown for long-running sessions | ||
| - [Persisting Sessions](persisting-sessions.md) — save and resume sessions across restarts | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| #:package GitHub.Copilot.SDK@* | ||
| #:property PublishAot=false | ||
tonybaloney marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
|
||
| using GitHub.Copilot.SDK; | ||
|
|
||
| // Ralph loop: autonomous AI task loop with fresh context per iteration. | ||
| // | ||
| // Two modes: | ||
| // - "plan": reads PROMPT_plan.md, generates/updates IMPLEMENTATION_PLAN.md | ||
| // - "build": reads PROMPT_build.md, implements tasks, runs tests, commits | ||
| // | ||
| // Each iteration creates a fresh session so the agent always operates in | ||
| // the "smart zone" of its context window. State is shared between | ||
| // iterations via files on disk (IMPLEMENTATION_PLAN.md, AGENTS.md, specs/*). | ||
| // | ||
| // Usage: | ||
| // dotnet run # build mode, 50 iterations | ||
| // dotnet run plan # planning mode | ||
| // dotnet run 20 # build mode, 20 iterations | ||
| // dotnet run plan 5 # planning mode, 5 iterations | ||
|
|
||
| var mode = args.Contains("plan") ? "plan" : "build"; | ||
| var maxArg = args.FirstOrDefault(a => int.TryParse(a, out _)); | ||
| var maxIterations = maxArg != null ? int.Parse(maxArg) : 50; | ||
| var promptFile = mode == "plan" ? "PROMPT_plan.md" : "PROMPT_build.md"; | ||
|
|
||
| var client = new CopilotClient(); | ||
| await client.StartAsync(); | ||
|
|
||
| Console.WriteLine(new string('━', 40)); | ||
| Console.WriteLine($"Mode: {mode}"); | ||
| Console.WriteLine($"Prompt: {promptFile}"); | ||
| Console.WriteLine($"Max: {maxIterations} iterations"); | ||
| Console.WriteLine(new string('━', 40)); | ||
|
|
||
| try | ||
| { | ||
| var prompt = await File.ReadAllTextAsync(promptFile); | ||
|
|
||
| for (var i = 1; i <= maxIterations; i++) | ||
| { | ||
| Console.WriteLine($"\n=== Iteration {i}/{maxIterations} ==="); | ||
|
|
||
| // Fresh session — each task gets full context budget | ||
| var session = await client.CreateSessionAsync( | ||
| new SessionConfig | ||
| { | ||
| Model = "gpt-5.1-codex-mini", | ||
| // Pin the agent to the project directory | ||
| WorkingDirectory = Environment.CurrentDirectory, | ||
| // Auto-approve tool calls for unattended operation | ||
| OnPermissionRequest = (_, _) => Task.FromResult( | ||
| new PermissionRequestResult { Kind = "approved" }), | ||
| }); | ||
|
|
||
| try | ||
| { | ||
| var done = new TaskCompletionSource<string>(); | ||
| session.On(evt => | ||
| { | ||
| // Log tool usage for visibility | ||
| if (evt is ToolExecutionStartEvent toolStart) | ||
| Console.WriteLine($" ⚙ {toolStart.Data.ToolName}"); | ||
| else if (evt is AssistantMessageEvent msg) | ||
| done.TrySetResult(msg.Data.Content); | ||
| }); | ||
|
|
||
| await session.SendAsync(new MessageOptions { Prompt = prompt }); | ||
| await done.Task; | ||
| } | ||
| finally | ||
| { | ||
| await session.DisposeAsync(); | ||
| } | ||
|
|
||
| Console.WriteLine($"\nIteration {i} complete."); | ||
| } | ||
|
|
||
| Console.WriteLine($"\nReached max iterations: {maxIterations}"); | ||
| } | ||
| finally | ||
| { | ||
| await client.StopAsync(); | ||
| } | ||
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.