src/agentlab/analyze/error_analysis.py
+71 -3 (71 additions, 3 deletions)
@@ -1,6 +1,59 @@
 from dataclasses import dataclass
 from bgym import StepInfo
 
+CHANGE_SUMMARIZER_PROMPT = """
+You are a specialized 'change summarizer' model. At a given step in the agent's interaction with the website,
+you will receive the following pieces of information:
+
+1. The user's MAIN GOAL (e.g., "Open a GitLab issue with label 'help wanted'").
+2. The AGENT'S PREVIOUS OBSERVATION (HTML or AX Tree snippet) or a 'DIFF' that shows what changed since the last step, and the corresponding change summaries.
+3. The AGENT'S CURRENT OBSERVATION (HTML or AX Tree snippet).
+4. The ACTION the agent just took (e.g., "Clicked the button labeled 'Show report'").
+5. (Optionally) The agent's CHAIN OF THOUGHT or short planning notes for this single step, if available.
+
+YOUR TASK (each step):
+A) SUMMARIZE THE CHANGE
+- Describe what visibly changed between the previous observation (or diff) and the current observation.
+  For example, did a new panel open, did the form reset, did nothing happen, etc.?
+
+B) ASSESS THE ACTION
+- Decide whether the agent's action seems helpful or correct given the user's main goal,
+  or if it appears incorrect/unhelpful.
+- Briefly explain why.
+
+OUTPUT FORMAT (per step):
+Return your analysis as a JSON-like structure, for example:
+
+{
+  "changeSummary": "A new search results panel appeared on the right side.",
+  "actionAssessment": "Correct",
+  "explanation": "Clicking 'Search' was appropriate to display the results."
+}
+
+Or for an incorrect action:
+
+{
+  "changeSummary": "The page reloaded but the date fields were reset to defaults.",
+  "actionAssessment": "Incorrect",
+  "explanation": "The agent should have fixed the date format first instead of re-clicking 'Show report'.",
+  "suggestion": "Correct the date format or check for error messages."
+}
+
+Please follow this structure at every step. Keep your responses concise and clear. Below are the details.
+
+Goal: {goal}
+
+LLM Plan: {plan}
+
+Previous Observation: {past_observation}
+
+Current Observation: {current_observation}
+
+Past summaries: {past_summaries}
+
+Action: {action}
+"""
+
 
 def _diff(past_obs, current_obs):
     """TODO: Implement the diff function.
@@ -23,14 +76,29 @@ def summarize(
     """Produces a summary of the effect of an action."""
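The visible hunks leave two things open: `_diff` is still a TODO, and the diff does not show how the prompt's placeholders are filled or how the model's reply is read back. The sketch below shows one way the pieces could fit together for a single step. It is not the project's implementation. All three helpers (`diff_observations`, `fill_prompt`, `parse_assessment`) are hypothetical names, observations are assumed to be plain strings, and `TEMPLATE` is a shortened stand-in for `CHANGE_SUMMARIZER_PROMPT`. The prompt mixes `{goal}`-style placeholders with literal JSON braces, so calling `str.format` on it unchanged would fail on the example objects; the sketch therefore substitutes only the known placeholder names.

```python
import difflib
import json
import re

# Shortened stand-in for CHANGE_SUMMARIZER_PROMPT (assumption: same placeholder style).
TEMPLATE = """Return your analysis as a JSON-like structure, for example:

{
  "changeSummary": "A new search results panel appeared on the right side.",
  "actionAssessment": "Correct",
  "explanation": "Clicking 'Search' was appropriate to display the results."
}

Goal: {goal}
Previous Observation: {past_observation}
Current Observation: {current_observation}
Action: {action}
"""

# Placeholder names copied from the prompt in the diff.
PLACEHOLDERS = (
    "goal", "plan", "past_observation",
    "current_observation", "past_summaries", "action",
)
_FIELD_RE = re.compile(r"\{(" + "|".join(PLACEHOLDERS) + r")\}")


def diff_observations(past_obs: str, current_obs: str, context: int = 1) -> str:
    """Hypothetical stand-in for the TODO `_diff`: line-based unified diff of two texts."""
    lines = difflib.unified_diff(
        past_obs.splitlines(),
        current_obs.splitlines(),
        fromfile="past_observation",
        tofile="current_observation",
        n=context,
        lineterm="",
    )
    return "\n".join(lines)


def fill_prompt(template: str, **values: str) -> str:
    """Replace only known {name} fields; literal JSON braces are left untouched.

    Fields without a supplied value are replaced by an empty string.
    """
    return _FIELD_RE.sub(lambda m: str(values.get(m.group(1), "")), template)


def parse_assessment(reply: str) -> dict:
    """Parse the span from the first '{' to the last '}' in a model reply as JSON."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(reply[start : end + 1])
    missing = {"changeSummary", "actionAssessment", "explanation"} - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


if __name__ == "__main__":
    past = "button 'Show report'\ntextbox 'Start date'"
    current = past + "\nalert 'Invalid date format'"
    prompt = fill_prompt(
        TEMPLATE,
        goal="Generate the sales report",
        past_observation=diff_observations(past, current),
        current_observation=current,
        action="click('Show report')",
    )
    print(prompt)
```

Since the prompt only asks for a "JSON-like" structure, a real reply may not be strict JSON; `parse_assessment` raises in that case instead of guessing, and a caller could fall back to keeping the raw text.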