Commit 33a96c3

Update prompts such as SYSTEM_PROMPT_STANDARD
1 parent 9de1fc0 commit 33a96c3

1 file changed, +27 -4 lines changed

operate/models/prompts.py

Lines changed: 27 additions & 4 deletions
@@ -16,34 +16,46 @@
 You have 4 possible operation actions available to you. The `pyautogui` library will be used to execute your decision. Your output will be used in a `json.loads` loads statement.
 
 1. click - Move mouse and click
+```
 [{{ "thought": "write a thought here", "operation": "click", "x": "x percent (e.g. 0.10)", "y": "y percent (e.g. 0.13)" }}] # "percent" refers to the percentage of the screen's dimensions in decimal format
+```
 
 2. write - Write with your keyboard
+```
 [{{ "thought": "write a thought here", "operation": "write", "content": "text to write here" }}]
+```
 
 3. press - Use a hotkey or press key to operate the computer
+```
 [{{ "thought": "write a thought here", "operation": "press", "keys": ["keys to use"] }}]
+```
 
 4. done - The objective is completed
+```
 [{{ "thought": "write a thought here", "operation": "done", "summary": "summary of what was completed" }}]
+```
 
 Return the actions in array format `[]`. You can take just one action or multiple actions.
 
 Here are some helpful combinations:
 
 # Opens Spotlight Search on Mac
+```
 [
 {{ "thought": "Searching the operating system to find Google Chrome because it appears I am currently in terminal", "operation": "press", "keys": {os_search_str} }},
 {{ "thought": "Now I need to write 'Google Chrome' as a next step", "operation": "write", "content": "Google Chrome" }},
 {{ "thought": "Finally I'll press enter to open Google Chrome assuming it is available", "operation": "press", "keys": ["enter"] }}
 ]
+```
 
 # Focuses on the address bar in a browser before typing a website
+```
 [
 {{ "thought": "I'll focus on the address bar in the browser. I can see the browser is open so this should be safe to try", "operation": "press", "keys": [{cmd_string}, "l"] }},
 {{ "thought": "Now that the address bar is in focus I can type the URL", "operation": "write", "content": "https://news.ycombinator.com/" }},
 {{ "thought": "I'll need to press enter to go the URL now", "operation": "press", "keys": ["enter"] }}
 ]
+```
 
 A few important notes:
 
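The prompt in the hunk above says the reply is passed to `json.loads` and that click coordinates are decimal fractions of the screen size. The following is a minimal sketch of how such a reply might be parsed and a click converted to pixels. `parse_actions`, `percent_to_pixels`, and `ALLOWED_OPERATIONS` are hypothetical names that do not appear in this commit, and the fence-stripping step only assumes that a model could echo the fences this commit adds around the examples.

```python
import json

ALLOWED_OPERATIONS = {"click", "write", "press", "done"}
FENCE = "`" * 3  # three backticks, built this way to avoid a literal fence here


def parse_actions(raw):
    """Parse a model reply into a list of action dicts (hypothetical helper)."""
    lines = raw.strip().splitlines()
    # Assumption: the model may echo the fences shown around the prompt examples.
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
    actions = json.loads("\n".join(lines))  # raises ValueError on invalid JSON
    if not isinstance(actions, list):
        raise ValueError("expected a JSON array of actions")
    for action in actions:
        if not isinstance(action, dict) or action.get("operation") not in ALLOWED_OPERATIONS:
            raise ValueError("unsupported action: %r" % (action,))
    return actions


def percent_to_pixels(x_percent, y_percent, width, height):
    """Convert decimal screen fractions such as "0.10" to pixel coordinates."""
    x = float(x_percent)
    y = float(y_percent)
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        raise ValueError("percent values must lie in [0, 1]")
    return round(x * width), round(y * height)


reply = '[{"thought": "open menu", "operation": "click", "x": "0.10", "y": "0.13"}]'
for action in parse_actions(reply):
    if action["operation"] == "click":
        # In the real tool the dimensions would presumably come from pyautogui.size().
        print(percent_to_pixels(action["x"], action["y"], 1920, 1080))
```

Because `json.loads` rejects a trailing comma before `]`, a reply that ends its last action with `},` fails in this sketch with a `ValueError` rather than being silently accepted.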
@@ -62,39 +74,50 @@
 You have 4 possible operation actions available to you. The `pyautogui` library will be used to execute your decision. Your output will be used in a `json.loads` loads statement.
 
 1. click - Move mouse and click - We labeled the clickable elements with red bounding boxes and IDs. Label IDs are in the following format with `x` being a number: `~x`
+```
 [{{ "thought": "write a thought here", "operation": "click", "label": "~x" }}] # 'percent' refers to the percentage of the screen's dimensions in decimal format
-
+```
 2. write - Write with your keyboard
+```
 [{{ "thought": "write a thought here", "operation": "write", "content": "text to write here" }}]
-
+```
 3. press - Use a hotkey or press key to operate the computer
+```
 [{{ "thought": "write a thought here", "operation": "press", "keys": ["keys to use"] }}]
+```
 
 4. done - The objective is completed
+```
 [{{ "thought": "write a thought here", "operation": "done", "summary": "summary of what was completed" }}]
-
+```
 Return the actions in array format `[]`. You can take just one action or multiple actions.
 
 Here are some helpful combinations:
 
 # Opens Spotlight Search on Mac
+```
 [
 {{ "thought": "Searching the operating system to find Google Chrome because it appears I am currently in terminal", "operation": "press", "keys": {os_search_str} }},
 {{ "thought": "Now I need to write 'Google Chrome' as a next step", "operation": "write", "content": "Google Chrome" }},
 ]
+```
 
 # Focuses on the address bar in a browser before typing a website
+```
 [
 {{ "thought": "I'll focus on the address bar in the browser. I can see the browser is open so this should be safe to try", "operation": "press", "keys": [{cmd_string}, "l"] }},
 {{ "thought": "Now that the address bar is in focus I can type the URL", "operation": "write", "content": "https://news.ycombinator.com/" }},
 {{ "thought": "I'll need to press enter to go the URL now", "operation": "press", "keys": ["enter"] }}
 ]
+```
 
 # Send a "Hello World" message in the chat
+```
 [
 {{ "thought": "I see a messsage field on this page near the button. It looks like it has a label", "operation": "click", "label": "~34" }},
 {{ "thought": "Now that I am focused on the message field, I'll go ahead and write ", "operation": "write", "content": "Hello World" }},
 ]
+```
 
 A few important notes:
 
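The doubled braces `{{ }}` and the `{os_search_str}` / `{cmd_string}` placeholders in the prompt text suggest it is rendered with `str.format`, where `{{` and `}}` escape literal braces. The sketch below illustrates that mechanism on a shortened, hypothetical template. The substituted values (`"command"`, `["command", "space"]`) are assumptions chosen for a Mac and are not shown in this diff.

```python
import json

# Hypothetical, shortened stand-in for the prompt examples in the diff.
TEMPLATE = (
    "[\n"
    '  {{ "thought": "open search", "operation": "press", "keys": {os_search_str} }},\n'
    '  {{ "thought": "focus the address bar", "operation": "press", "keys": [{cmd_string}, "l"] }}\n'
    "]"
)

# Assumed values: each is a JSON fragment, so the rendered example stays valid JSON.
rendered = TEMPLATE.format(
    cmd_string='"command"',
    os_search_str='["command", "space"]',
)

# After formatting, "{{" and "}}" have become single braces.
actions = json.loads(rendered)
print(actions[0]["keys"])
print(actions[1]["keys"])
```

One consequence of this design is that any literal brace added to the prompt text has to be doubled, otherwise `str.format` treats it as a placeholder and raises an error.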
@@ -226,7 +249,7 @@ def get_system_prompt(model, objective):
     # Optional verbose output
     if config.verbose:
         print("[get_system_prompt] model:", model)
-        print("[get_system_prompt] prompt:", prompt)
+        # print("[get_system_prompt] prompt:", prompt)
 
     return prompt
 