Commit fd8fd95

Added chain summarizer prompt
1 parent 048a622 commit fd8fd95

1 file changed: +71 -3 lines changed

src/agentlab/analyze/error_analysis.py

Lines changed: 71 additions & 3 deletions
@@ -1,6 +1,59 @@
 from dataclasses import dataclass
 from bgym import StepInfo
 
+CHANGE_SUMMARIZER_PROMPT = """
+You are a specialized 'change summarizer' model. At a given step in the agent's interaction with the website,
+you will receive the following pieces of information:
+
+1. The user's MAIN GOAL (e.g., "Open a GitLab issue with label 'help wanted'").
+2. The AGENT'S PREVIOUS OBSERVATION (HTML or AX Tree snippet) or a 'DIFF' that shows what changed since the last step, and the corresponding change summaries.
+3. The AGENT'S CURRENT OBSERVATION (HTML or AX Tree snippet).
+4. The ACTION the agent just took (e.g., "Clicked the button labeled 'Show report'").
+5. (Optionally) The agent's CHAIN OF THOUGHT or short planning notes for this single step, if available.
+
+YOUR TASK (each step):
+A) SUMMARIZE THE CHANGE
+- Describe what visibly changed between the previous observation (or diff) and the current observation.
+For example, did a new panel open, did the form reset, did nothing happen, etc.?
+
+B) ASSESS THE ACTION
+- Decide whether the agent's action seems helpful or correct given the user's main goal,
+or if it appears incorrect/unhelpful.
+- Briefly explain why.
+
+OUTPUT FORMAT (per step):
+Return your analysis as a JSON-like structure, for example:
+
+{{
+"changeSummary": "A new search results panel appeared on the right side.",
+"actionAssessment": "Correct",
+"explanation": "Clicking 'Search' was appropriate to display the results."
+}}
+
+Or for an incorrect action:
+
+{{
+"changeSummary": "The page reloaded but the date fields were reset to defaults.",
+"actionAssessment": "Incorrect",
+"explanation": "The agent should have fixed the date format first instead of re-clicking 'Show report'.",
+"suggestion": "Correct the date format or check for error messages."
+}}
+
+Please follow this structure at every step. Keep your responses concise and clear. Below are the details.
+
+Goal: {goal}
+
+LLM Plan: {plan}
+
+Previous Observation: {past_observation}
+
+Current Observation: {current_observation}
+
+Past summaries: {past_summaries}
+
+Action: {action}
+"""
+
 
 def _diff(past_obs, current_obs):
     """TODO: Implement the diff function.
@@ -23,14 +76,29 @@ def summarize(
         """Produces a summary of the effect of an action."""
         past_obs_message = self.obs_formatter(past_obs)
         current_obs_message = self.obs_formatter(current_obs)
+        goal = past_obs["goal"]
+        plan = past_obs["plan"]
         if self.use_diff:
             current_obs_message = _diff(past_obs_message, current_obs_message)
 
-        return self.llm(self.make_prompt(past_obs_message, current_obs_message, action))
+        return self.llm(
+            self.make_prompt(
+                past_obs_message, action, current_obs_message, past_summaries, goal, plan
+            )
+        )
 
-    def make_prompt(self, past_obs_message, action, current_obs_message, past_summaries):
+    def make_prompt(
+        self, past_obs_message, action, current_obs_message, past_summaries, goal, plan
+    ):
         """TODO: Implement the prompt."""
-        return CHANGE_SUMMARIZER_PROMPT_PLACEHOLDER
+        return CHANGE_SUMMARIZER_PROMPT.format(
+            goal=goal,
+            plan=plan,
+            past_observation=past_obs_message,
+            current_observation=current_obs_message,
+            past_summaries=past_summaries,
+            action=action,
+        )
 
 
 @dataclass
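
Note on the template: `make_prompt` fills `CHANGE_SUMMARIZER_PROMPT` with `str.format`, and the prompt also carries literal JSON examples. With `str.format`, literal braces have to be doubled (`{{` and `}}`); single braces are parsed as replacement fields. The sketch below illustrates this with two hypothetical, shortened templates (`ESCAPED`, `UNESCAPED`) and helper names (`fill`, `fails_to_format`) that are not part of the commit.

```python
# Hypothetical, shortened stand-ins for the real prompt template.
ESCAPED = (
    'Example: {{"actionAssessment": "Correct"}}\n'
    "Goal: {goal}\n"
    "Action: {action}\n"
)
UNESCAPED = (
    'Example: {"actionAssessment": "Correct"}\n'
    "Goal: {goal}\n"
    "Action: {action}\n"
)


def fill(template: str, goal: str, action: str) -> str:
    """Fill the named placeholders the same way make_prompt does."""
    return template.format(goal=goal, action=action)


def fails_to_format(template: str) -> bool:
    """Return True if str.format rejects the template."""
    try:
        fill(template, goal="Open a GitLab issue", action="click('42')")
    except (KeyError, ValueError, IndexError):
        # Single braces around the JSON example are read as a field name.
        return True
    return False


if __name__ == "__main__":
    print(fails_to_format(UNESCAPED))
    print(fails_to_format(ESCAPED))
    print(fill(ESCAPED, goal="Open a GitLab issue", action="click('42')"))
```

An alternative, if doubling braces in a long prompt feels error-prone, would be `string.Template` with `$goal`-style placeholders, which leaves braces untouched.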

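Note on `_diff`: the function is still a TODO in this commit. One possible way to fill it in, assuming the formatted observations are plain multi-line strings, is a line-based unified diff from the standard library's `difflib`. The function name `diff_observations`, its `context_lines` parameter, and the "no visible change" placeholder are illustrative assumptions, not the repository's implementation.

```python
import difflib


def diff_observations(past_obs: str, current_obs: str, context_lines: int = 2) -> str:
    """Return a unified diff between two formatted observations.

    Assumes both observations are plain text with one element per line.
    """
    diff_lines = difflib.unified_diff(
        past_obs.splitlines(),
        current_obs.splitlines(),
        fromfile="previous",
        tofile="current",
        n=context_lines,
        lineterm="",  # inputs have no trailing newlines, so emit none
    )
    text = "\n".join(diff_lines)
    # An empty diff means the two observations are identical line by line.
    return text or "(no visible change)"


if __name__ == "__main__":
    before = "button 'Search'\ntextbox 'Query'"
    after = "button 'Search'\ntextbox 'Query'\nlist 'Results'"
    print(diff_observations(before, after))
```

A line-based diff keeps the prompt short when little changes between steps, but it is sensitive to how the observation formatter orders and wraps elements.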