Commit 0e8dead

feat: add helper agents

A helper agent is intended to run (and not retry) and sit one level lower than a step. This allows getting back large error outputs and streamlining the path to an issue.

Signed-off-by: vsoch <[email protected]>

1 parent d233656 · commit 0e8dead
File tree: 17 files changed, +276 −140 lines changed

README.md

20 additions, 3 deletions

````diff
@@ -12,12 +12,21 @@ This library is primarily being used for development for the descriptive thrust
 ### Agents
 
 The `fractale agent` command provides means to run build, job generation, and deployment agents.
-This part of the library is under development.
+This part of the library is under development. There are three kinds of agents:
+
+- `step` agents are experts at doing specific tasks (they do hold state)
+- `manager` agents know how to orchestrate step agents and choose between them (they don't hold state, but could)
+- `helper` agents are used by step agents to do small tasks (e.g., suggest a fix for an error)
+
+The design is simple in that each agent responds to a state of error vs. success. In the case of a step agent, the return code determines whether to continue or try again. In the case of a helper, the input is typically an erroneous response (or something that needs changing) with respect to a goal.
+For a manager, we are making a choice based on a previous erroneous step.
 
 See [examples/agent](examples/agent) for an example.
 
 #### To do items
 
+- refactor the manager to not handle the prompt, just get the step when retries come back.
+  - then we need to decide how to handle the kubernetes job creating additional structures.
 - Get basic runner working
 - Add in ability to get log and optimize - the manager will need to use goal
 - We likely want the manager to be able to edit the prompt.
@@ -28,21 +37,29 @@ See [examples/agent](examples/agent) for an example.
 
 **And experiment ideas**
 
-- How do we define stability?
+- How do we define stability?
 - What are the increments of change (e.g., "adding a library")? We should be able to keep track of times for each stage and what changed, and an analyzer LLM can look at the result and understand (categorize) the most salient contributions to change.
 - We can also time how long subsequent changes take, when relevant. For example, if we are building, we should be able to use cached layers (and the build times speed up) if the LLM is changing content later in the Dockerfile.
 - We can also save the successful results (Dockerfile builds, for example) and compare for similarity. How consistent is the LLM?
 - How does specificity of the prompt influence the result?
 - For an experiment, we would want to do a build -> deploy and successful run for a series of apps and get distributions of attempts, reasons for failure, and a general sense of similarity / differences.
 - For the optimization experiment, we'd want to do the same, but understand gradients of change that led to improvement.
 
-## Observations
+#### Observations
 
 - Specifying cpu seems important - if you don't, it wants to do GPU
 - If you ask for a specific example, it sometimes tries to download data (tell it where data is)
 - Always include common issues in the initial prompt
 - If you are too specific about instance types, it adds node selectors/affinity, and that often doesn't work.
 
+#### Ideas
+
+- The manager agent is currently generating an updated prompt AND choosing the step.
+  - Arguably we should have a separation of responsibility so a step can ask to fix an error without a manager.
+- I think we need one more level of agent - a step agent should have helper agents that can:
+  - take an error message and analyze it to get a fix.
+
 ### Job Specifications
 
 #### Simple
````
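To make the three-agent design described in the README changes concrete, here is a minimal sketch of the step/helper relationship under its error-vs-success contract. All names here are hypothetical illustrations, not fractale's actual classes: a helper runs once (no retries) on a small task, while a step holds state and retries up to a limit, after which a manager would choose the next step.

```python
# Minimal sketch of the step/helper design described above.
# HelperAgent and StepAgent are invented names for illustration,
# not part of fractale's API.


class HelperAgent:
    """Runs once (no retries) on a small task, e.g. analyzing an error."""

    def run(self, error_message, goal):
        # A real helper would ask an LLM to distill large error output
        # into a short, actionable suggestion with respect to the goal.
        return f"Toward '{goal}', first address: {error_message[:80]}"


class StepAgent:
    """Holds state across attempts and retries until success or a limit."""

    def __init__(self, max_attempts=3):
        self.max_attempts = max_attempts
        self.helper = HelperAgent()

    def attempt(self, context):
        """Do the actual work (e.g. a build); returns (return_code, output)."""
        raise NotImplementedError

    def run(self, context):
        for _ in range(self.max_attempts):
            return_code, output = self.attempt(context)
            if return_code == 0:
                return output
            # On error, the helper condenses raw output into guidance
            # that gets folded into the next attempt's prompt.
            context["error_message"] = self.helper.run(output, context["goal"])
        # Past the limit, a manager would choose what step to run next.
        raise RuntimeError("step failed; deferring to a manager agent")
```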

examples/agent/README.md

2 additions, 2 deletions

````diff
@@ -10,7 +10,7 @@ The build agent will use the Gemini API to generate a Dockerfile and then build
 Here is how to first ask the build agent to generate a lammps container for Google cloud.
 
 ```bash
-fractale agent build lammps --environment "google cloud" --outfile dockerfile
+fractale agent build lammps --environment "google cloud CPU" --outfile Dockerfile.lammps
 ```
 
 That might generate the [Dockerfile](Dockerfile) here, and a container that defaults to the application name "lammps"
@@ -27,7 +27,7 @@ kind load docker-image lammps
 To start, we will assume a kind cluster running and tell the agent the image is loaded into it (and so the pull policy will be never).
 
 ```bash
-fractale agent kubernetes-job lammps --environment "google cloud CPU" --context-file ./Dockerfile --no-pull
+fractale agent kubernetes-job lammps --environment "google cloud CPU" --context-file ./Dockerfile --no-pull
 ```
 
 ## Manager
````

examples/agent/plans/run-lammps.yaml

2 additions, 3 deletions

````diff
@@ -22,7 +22,6 @@ plan:
     environment: "google cloud CPU instance in Kubernetes"
     max_attempts: 1
     details: |
-      Please execute the reaxff HNS example, and assume the data in the PWD,
+      Please execute the in.reaxff.hns example, and assume the data in the PWD,
       Run lammpss with params -v x 2 -v y 2 -v z 2 -in ./in.reaxff.hns
-      and with -nocite flag for CPU. Do not try to generate configmap data.
-      Do not add any nodeSelector or affinity rules since we are testing.
+      and with -nocite flag for CPU.
````

fractale/agent/base.py

44 additions, 15 deletions

````diff
@@ -1,5 +1,12 @@
+import os
+import sys
+
+import google.generativeai as genai
+
+import fractale.agent.defaults as defaults
 import fractale.utils as utils
 
+
 class Agent:
     """
     A base for an agent. Each agent should:
@@ -64,7 +71,7 @@ def write_file(self, context, content, add_comment=True):
         content += f"\n# Generated by fractale {self.name} agent"
         utils.write_file(content, outfile)
 
-    def get_code_block(self, content, code_type):
+    def get_code_block(self, content, code_type):
         """
         Parse a code block from the response
         """
@@ -76,22 +83,13 @@ def get_code_block(self, content, code_type):
         content = content[: -len("```")]
         return content
 
-    def ask_gemini(self, prompt, with_history=True):
+    def get_result(self, context):
         """
-        Ask gemini adds a wrapper with some error handling.
+        Return either the entire context or single result.
         """
-        try:
-            if with_history:
-                response = self.chat.send_message(prompt)
-            else:
-                response = self.model.generate_content(prompt)
-
-            # This line can fail. If it succeeds, return entire response
-            return response.text.strip()
-
-        except ValueError as e:
-            print(f"[Error] The API response was blocked and contained no text: {str(e)}")
-            return "GEMINI ERROR: The API returned an error (or stop) and we need to try again."
+        if context.is_managed:
+            return context
+        return context.result
 
     def run(self, context):
         """
@@ -115,3 +113,34 @@ def get_prompt(self, context):
         """
         assert context
         raise NotImplementedError(f"The {self.name} agent is missing a 'get_prompt' function")
+
+
+class GeminiAgent(Agent):
+    """
+    A base for an agent that uses the Gemini API.
+    """
+
+    def init(self):
+        self.model = genai.GenerativeModel(defaults.gemini_model)
+        self.chat = self.model.start_chat()
+        try:
+            genai.configure(api_key=os.environ["GEMINI_API_KEY"])
+        except KeyError:
+            sys.exit("ERROR: GEMINI_API_KEY environment variable not set.")
+
+    def ask_gemini(self, prompt, with_history=True):
+        """
+        Ask gemini adds a wrapper with some error handling.
+        """
+        try:
+            if with_history:
+                response = self.chat.send_message(prompt)
+            else:
+                response = self.model.generate_content(prompt)
+
+            # This line can fail. If it succeeds, return entire response
+            return response.text.strip()
+
+        except ValueError as e:
+            print(f"[Error] The API response was blocked and contained no text: {str(e)}")
+            return "GEMINI ERROR: The API returned an error (or stop) and we need to try again."
````

fractale/agent/build/agent.py

37 additions, 38 deletions

````diff
@@ -1,7 +1,7 @@
-from fractale.agent.base import Agent
+from fractale.agent.base import GeminiAgent
 import fractale.agent.build.prompts as prompts
 from fractale.agent.context import get_context
-import fractale.agent.defaults as defaults
+from fractale.agent.errors import DebugAgent
 
 import fractale.utils as utils
 import argparse
@@ -18,29 +18,23 @@
 import subprocess
 import textwrap
 
-import google.generativeai as genai
 
 # regular expression in case LLM does not follow my instructions!
 dockerfile_pattern = r"```(?:dockerfile)?\n(.*?)```"
 
 
-class BuildAgent(Agent):
+class BuildAgent(GeminiAgent):
     """
     Builder agent.
+
+    Observations from v:
+    1. Holding the context (chat) seems to take longer.
+    2. Don't forget to ask for CPU - GPU will take a lot longer.
     """
 
     name = "build"
     description = "builder agent"
 
-    def init(self):
-        """
-        Custom initialization. I want to try using the same model
-        agent across requests. I'm not sure if that means it's the same
-        context (I don't think so).
-        """
-        model = genai.GenerativeModel("gemini-2.5-pro")
-        self.chat = model.start_chat()
-
     def add_arguments(self, subparser):
         """
         Add arguments for the plugin to show up in argparse
@@ -102,6 +96,8 @@ def run(self, context):
         1. Populate a context.
         2. Call supporting functions with the context.
         3. Parse the result and update context, taking appropriate action.
+        4. The current object to generate should be put into result.
+        5. The current issue or error goes into error_message.
         """
         # Create or get global context
         context = get_context(context)
@@ -110,15 +106,14 @@ def run(self, context):
         # Start at 1 since we are showing to a user.
         self.attempts = self.attempts or 1
 
-        try:
-            genai.configure(api_key=os.environ["GEMINI_API_KEY"])
-        except KeyError:
-            sys.exit("ERROR: GEMINI_API_KEY environment variable not set.")
-
         # This will either generate fresh or rebuild erroneous Dockerfile
         # We don't return the dockerfile because it is updated in the context
         self.generate_dockerfile(context)
-        print(Panel(context.dockerfile, title="[green]Dockerfile or Response[/green]", border_style="green"))
+        print(
+            Panel(
+                context.result, title="[green]Dockerfile or Response[/green]", border_style="green"
+            )
+        )
 
         # Set the container on the context for a next step to use it...
         container = context.get("container") or self.generate_name(context.application)
@@ -127,50 +122,53 @@ def run(self, context):
         # Build it! We might want to only allow a certain number of retries or incremental changes.
         return_code, output = self.build(context)
         if return_code == 0:
+            self.print_dockerfile(context.result)
             print(
                 Panel(
                     f"[bold green]✅ Build complete in {self.attempts} attempts[/bold green]",
                     title="Success",
                     border_style="green",
                 )
             )
+
         else:
             print(
                 Panel(
                     "[bold red]❌ Build failed[/bold red]", title="Build Status", border_style="red"
                 )
             )
+            # Ask the debug agent to better instruct the error message
+            # This becomes a more guided output
+            context.error_message = output
+            agent = DebugAgent()
+            # This updates the error message to be the output
+            context = agent.run(context, requires=prompts.requires)
+
+            # TODO: test this idea extending to manager
+            # manager should not be deciding what to do on failure,
+            # but deciding what to do (step) AFTER reaching the limit
            # If we are returning a failure:
            # 1. Set context.return_code
            # 2. error message is the result
-            if self.return_on_failure():
-                context.return_code = -1
-                context.result = output
-                return self.get_result(context)
+            # if self.return_on_failure():
+            #     context.return_code = -1
+            #     # TODO we should not have the manager parse error...
+            #     context.result = context.error_message
+            #     return self.get_result(context)
 
             self.attempts += 1
             print("\n[bold cyan] Requesting Correction from Build Agent[/bold cyan]")
 
             # Update the context with error message
-            context.error_message = output
             return self.run(context)
 
         # Add generation line
-        self.write_file(context, context.dockerfile)
-        self.print_dockerfile(context.dockerfile)
+        self.write_file(context, context.result)
 
         # Assume being called by a human that wants Dockerfile back,
         # unless we are being managed
         return self.get_result(context)
 
-    def get_result(self, context):
-        """
-        Return either the entire context or single result.
-        """
-        if context.is_managed:
-            return context
-        return context.dockerfile
-
     def print_dockerfile(self, dockerfile):
         """
         Print Dockerfile with highlighted Syntax
@@ -223,7 +221,7 @@ def build(self, context):
 
         # Write the Dockerfile to the temporary directory
         utils.write_file(dockerfile, os.path.join(build_dir, "Dockerfile"))
-
+
         # If only one max attempt, don't print here, not important to show.
         if self.max_attempts is not None and self.max_attempts > 1:
             print(
@@ -252,15 +250,15 @@ def generate_dockerfile(self, context):
         """
         prompt = self.get_prompt(context)
         print("Sending build prompt to Gemini...")
-        print(textwrap.indent(prompt, "> ", predicate=lambda _: True))
+        print(textwrap.indent(prompt[0:1000], "> ", predicate=lambda _: True))
 
         # The API can error and not return a response.text.
         content = self.ask_gemini(prompt)
         print("Received Dockerfile response from Gemini...")
 
         # Try to remove Dockerfile from code block
         try:
-            content = self.get_code_block(content, 'dockerfile')
+            content = self.get_code_block(content, "dockerfile")
 
             # If we are getting commentary...
             match = re.search(dockerfile_pattern, content, re.DOTALL)
@@ -270,7 +268,8 @@ def generate_dockerfile(self, context):
             dockerfile = content.strip()
 
             # The result is saved as a build step
-            context.dockerfile = dockerfile
+            # The dockerfile is the argument used internally
             context.result = dockerfile
+            context.dockerfile = dockerfile
         except Exception as e:
             sys.exit(f"Error parsing response from Gemini: {e}\n{content}")
````
