|
16 | 16 | You have 4 possible operation actions available to you. The `pyautogui` library will be used to execute your decision. Your output will be used in a `json.loads` statement. |
17 | 17 |
|
18 | 18 | 1. click - Move mouse and click |
| 19 | +``` |
19 | 20 | [{{ "thought": "write a thought here", "operation": "click", "x": "x percent (e.g. 0.10)", "y": "y percent (e.g. 0.13)" }}] # "percent" refers to the percentage of the screen's dimensions in decimal format |
| 21 | +``` |
20 | 22 |
|
21 | 23 | 2. write - Write with your keyboard |
| 24 | +``` |
22 | 25 | [{{ "thought": "write a thought here", "operation": "write", "content": "text to write here" }}] |
| 26 | +``` |
23 | 27 |
|
24 | 28 | 3. press - Use a hotkey or press key to operate the computer |
| 29 | +``` |
25 | 30 | [{{ "thought": "write a thought here", "operation": "press", "keys": ["keys to use"] }}] |
| 31 | +``` |
26 | 32 |
|
27 | 33 | 4. done - The objective is completed |
| 34 | +``` |
28 | 35 | [{{ "thought": "write a thought here", "operation": "done", "summary": "summary of what was completed" }}] |
| 36 | +``` |
29 | 37 |
|
30 | 38 | Return the actions in array format `[]`. You can take just one action or multiple actions. |
31 | 39 |
|
32 | 40 | Here are some helpful combinations: |
33 | 41 |
|
34 | 42 | # Opens Spotlight Search on Mac |
| 43 | +``` |
35 | 44 | [ |
36 | 45 | {{ "thought": "Searching the operating system to find Google Chrome because it appears I am currently in terminal", "operation": "press", "keys": {os_search_str} }}, |
37 | 46 | {{ "thought": "Now I need to write 'Google Chrome' as a next step", "operation": "write", "content": "Google Chrome" }}, |
38 | 47 | {{ "thought": "Finally I'll press enter to open Google Chrome assuming it is available", "operation": "press", "keys": ["enter"] }} |
39 | 48 | ] |
| 49 | +``` |
40 | 50 |
|
41 | 51 | # Focuses on the address bar in a browser before typing a website |
| 52 | +``` |
42 | 53 | [ |
43 | 54 | {{ "thought": "I'll focus on the address bar in the browser. I can see the browser is open so this should be safe to try", "operation": "press", "keys": [{cmd_string}, "l"] }}, |
44 | 55 | {{ "thought": "Now that the address bar is in focus I can type the URL", "operation": "write", "content": "https://news.ycombinator.com/" }}, |
45 | 56 | {{ "thought": "I'll need to press enter to go to the URL now", "operation": "press", "keys": ["enter"] }} |
46 | 57 | ] |
| 58 | +``` |
47 | 59 |
|
48 | 60 | A few important notes: |
49 | 61 |
|
|
62 | 74 | You have 4 possible operation actions available to you. The `pyautogui` library will be used to execute your decision. Your output will be used in a `json.loads` statement. |
63 | 75 |
|
64 | 76 | 1. click - Move mouse and click - We labeled the clickable elements with red bounding boxes and IDs. Label IDs are in the following format with `x` being a number: `~x` |
| 77 | +``` |
65 | 78 | [{{ "thought": "write a thought here", "operation": "click", "label": "~x" }}] # "~x" is the label ID of the element to click |
66 | | -
|
| 79 | +``` |
67 | 80 | 2. write - Write with your keyboard |
| 81 | +``` |
68 | 82 | [{{ "thought": "write a thought here", "operation": "write", "content": "text to write here" }}] |
69 | | -
|
| 83 | +``` |
70 | 84 | 3. press - Use a hotkey or press key to operate the computer |
| 85 | +``` |
71 | 86 | [{{ "thought": "write a thought here", "operation": "press", "keys": ["keys to use"] }}] |
| 87 | +``` |
72 | 88 |
|
73 | 89 | 4. done - The objective is completed |
| 90 | +``` |
74 | 91 | [{{ "thought": "write a thought here", "operation": "done", "summary": "summary of what was completed" }}] |
75 | | -
|
| 92 | +``` |
76 | 93 | Return the actions in array format `[]`. You can take just one action or multiple actions. |
77 | 94 |
|
78 | 95 | Here are some helpful combinations: |
79 | 96 |
|
80 | 97 | # Opens Spotlight Search on Mac |
| 98 | +``` |
81 | 99 | [ |
82 | 100 | {{ "thought": "Searching the operating system to find Google Chrome because it appears I am currently in terminal", "operation": "press", "keys": {os_search_str} }}, |
83 | 101 | {{ "thought": "Now I need to write 'Google Chrome' as a next step", "operation": "write", "content": "Google Chrome" }} |
84 | 102 | ] |
| 103 | +``` |
85 | 104 |
|
86 | 105 | # Focuses on the address bar in a browser before typing a website |
| 106 | +``` |
87 | 107 | [ |
88 | 108 | {{ "thought": "I'll focus on the address bar in the browser. I can see the browser is open so this should be safe to try", "operation": "press", "keys": [{cmd_string}, "l"] }}, |
89 | 109 | {{ "thought": "Now that the address bar is in focus I can type the URL", "operation": "write", "content": "https://news.ycombinator.com/" }}, |
90 | 110 | {{ "thought": "I'll need to press enter to go to the URL now", "operation": "press", "keys": ["enter"] }} |
91 | 111 | ] |
| 112 | +``` |
92 | 113 |
|
93 | 114 | # Send a "Hello World" message in the chat |
| 115 | +``` |
94 | 116 | [ |
95 | 117 | {{ "thought": "I see a message field on this page near the button. It looks like it has a label", "operation": "click", "label": "~34" }}, |
96 | 118 | {{ "thought": "Now that I am focused on the message field, I'll go ahead and write", "operation": "write", "content": "Hello World" }} |
97 | 119 | ] |
| 120 | +``` |
98 | 121 |
|
99 | 122 | A few important notes: |
100 | 123 |
|
@@ -226,7 +249,7 @@ def get_system_prompt(model, objective): |
226 | 249 | # Optional verbose output |
227 | 250 | if config.verbose: |
228 | 251 | print("[get_system_prompt] model:", model) |
229 | | - print("[get_system_prompt] prompt:", prompt) |
| 252 | + # print("[get_system_prompt] prompt:", prompt) |
230 | 253 |
|
231 | 254 | return prompt |
232 | 255 |
|
|
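The prompt text above states that the model's output is passed to `json.loads`, and the examples use doubled braces plus placeholders such as `{cmd_string}`. The following is a minimal sketch of how such an example could be checked, not the project's actual code. It assumes the prompt is filled in with `str.format` and that `cmd_string` is supplied as an already-quoted JSON string (for example `'"command"'`); the names `TEMPLATE`, `render_example`, `is_invalid_json`, and `percent_to_pixels` are hypothetical.

```python
import json

# Hypothetical excerpt that mirrors the prompt's escaping: doubled braces become
# literal braces after str.format, and {cmd_string} is a placeholder.
TEMPLATE = """[
    {{ "thought": "focus the address bar", "operation": "press", "keys": [{cmd_string}, "l"] }},
    {{ "thought": "open the page", "operation": "press", "keys": ["enter"] }}
]"""


def render_example(template: str, cmd_string: str) -> list:
    """Fill the placeholder, then parse the result with json.loads."""
    # Assumption: cmd_string already carries its JSON quotes, e.g. '"command"'.
    return json.loads(template.format(cmd_string=cmd_string))


def is_invalid_json(text: str) -> bool:
    """Return True when json.loads rejects the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return True
    return False


def percent_to_pixels(x_percent: str, y_percent: str, width: int, height: int) -> tuple:
    """Convert decimal screen fractions (given as strings, as in the click
    example) into pixel coordinates for a screen of the given size."""
    return round(float(x_percent) * width), round(float(y_percent) * height)


if __name__ == "__main__":
    actions = render_example(TEMPLATE, '"command"')
    print(actions[0]["keys"])

    # A comma before the closing bracket is not valid JSON.
    print(is_invalid_json('[{"operation": "write", "content": "Hello World"},]'))

    print(percent_to_pixels("0.10", "0.13", 1920, 1080))
```

Running a check like this over each example in the prompt would flag any array that ends with a comma before `]`, since the model is likely to imitate the examples it is shown.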