`examples/agent/README.md`
We haven't hit the case yet where the manager needs to take over - that needs fu…
#### To do items

- Figure out optimization agent (with some goal)
- Right now when we restart, we do so with a fresh slate (no log memory) - should there be?
- We likely want some way to quantify the amount of change between prompts, and the difficulty of the task.
- I think when we return to the manager, the last response (which might say why it is returning) should inform step selection. But not just step selection - also the updated prompt to the step, which might be missing something.
- Right now we rely on random sampling of the space to avoid whatever the issue might be.
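One minimal way to start quantifying the change between prompts is a text-similarity ratio. This is only a sketch of the idea, not part of the agent; the `prompt_change` helper is hypothetical, using only the standard library:

```python
import difflib

def prompt_change(old: str, new: str) -> float:
    """Fraction of the prompt that changed: 0.0 (identical) to 1.0 (disjoint)."""
    return 1.0 - difflib.SequenceMatcher(None, old, new).ratio()

# A small edit to a long prompt yields a small change score.
print(prompt_change("Build the app with make", "Build the app with cmake"))
```

A character-level ratio is crude (it ignores semantics), but logged per restart it would at least let us correlate prompt drift with task difficulty and outcome.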

#### Research Questions

**And experiment ideas**

- Why does it make the same mistakes? E.g., always forgetting ca-certificates. Did it learn from data where that was OK to do, so that errors result from inconsistencies between the way things used to work and the way they work now?
- Insight: if I don't know how to run an app, it's unlikely the LLM can do it, because I can't give any guidance (and it guesses).
- How do we define stability?
- What are the increments of change (e.g., "adding a library")? We should be able to keep track of times for each stage and what changed, and an analyzer LLM can look at the result and categorize the most salient contributions to change.
- We can also time how long subsequent changes take, when relevant. For example, if we are building, we should be able to use cached layers (so build times speed up) if the LLM is changing content later in the Dockerfile.
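The layer-caching point can be illustrated with a minimal Dockerfile sketch (the package list and build command are hypothetical): edits below the `COPY` line leave the earlier layers cached, so subsequent builds only re-run the later steps and rebuild times drop.

```dockerfile
FROM ubuntu:22.04

# Early layers change rarely, so Docker reuses their cache across builds.
# Installing ca-certificates here also guards against the recurring mistake above.
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential ca-certificates && rm -rf /var/lib/apt/lists/*

# If the LLM only edits content below this point, the layers above stay
# cached and only these steps are re-executed on rebuild.
COPY . /app
WORKDIR /app
RUN make
```

Comparing timestamps of cached versus uncached rebuilds would give one concrete measurement of "increments of change."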