Commit 7431709

Merge pull request #1318 from MikeBirdTech/update-vision
Update vision model to gpt-4o
2 parents c065618 + 39012e8 commit 7431709

6 files changed (+14, -14 lines)

docs/guides/profiles.mdx

Lines changed: 2 additions & 2 deletions
@@ -18,9 +18,9 @@ from interpreter import interpreter
 interpreter.os = True
 interpreter.llm.supports_vision = True
 
-interpreter.llm.model = "gpt-4-vision-preview"
+interpreter.llm.model = "gpt-4o"
 
-interpreter.llm.supports_functions = False
+interpreter.llm.supports_functions = True
 interpreter.llm.context_window = 110000
 interpreter.llm.max_tokens = 4096
 interpreter.auto_run = True

docs/settings/all-settings.mdx

Lines changed: 5 additions & 5 deletions
@@ -280,17 +280,17 @@ llm:
 
 ### Vision Mode
 
-Enables vision mode, which adds some special instructions to the prompt and switches to `gpt-4-vision-preview`.
+Enables vision mode, which adds some special instructions to the prompt and switches to `gpt-4o`.
 
 <CodeGroup>
 ```bash Terminal
 interpreter --vision
 ```
 
 ```python Python
-interpreter.llm.model = "gpt-4-vision-preview" # Any vision supporting model
+interpreter.llm.model = "gpt-4o" # Any vision supporting model
 interpreter.llm.supports_vision = True
-interpreter.llm.supports_functions = False # If model doesn't support functions, which is the case with gpt-4-vision.
+interpreter.llm.supports_functions = True
 
 interpreter.custom_instructions = """The user will show you an image of the code you write. You can view images directly.
 For HTML: This will be run STATELESSLY. You may NEVER write '<!-- previous code here... --!>' or `<!-- header will go here -->` or anything like that. It is CRITICAL TO NEVER WRITE PLACEHOLDERS. Placeholders will BREAK it. You must write the FULL HTML CODE EVERY TIME. Therefore you cannot write HTML piecemeal—write all the HTML, CSS, and possibly Javascript **in one step, in one code block**. The user will help you review it visually.
@@ -302,10 +302,10 @@ If you use `plt.show()`, the resulting image will be sent to you. However, if yo
 loop: True
 
 llm:
-model: "gpt-4-vision-preview"
+model: "gpt-4o"
 temperature: 0
 supports_vision: True
-supports_functions: False
+supports_functions: True
 context_window: 110000
 max_tokens: 4096
 custom_instructions: >

docs/usage/terminal/vision.mdx

Lines changed: 1 addition & 1 deletion
@@ -8,4 +8,4 @@ To use vision (highly experimental), run the following command:
 interpreter --vision
 ```
 
-If a file path to an image is found in your input, it will be loaded into the vision model (`gpt-4-vision-preview` for now).
+If a file path to an image is found in your input, it will be loaded into the vision model (`gpt-4o` for now).
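
Reviewer note: the sentence changed above says an image file path found in the input is loaded into the vision model. The sketch below illustrates that idea only; it is not the project's implementation. The helper name `find_image_paths` and the extension list are assumptions made for illustration, and paths containing spaces are not handled.

```python
from pathlib import Path

# Assumed set of image extensions; the real project may accept a different list.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def find_image_paths(text: str) -> list[str]:
    """Return whitespace-delimited tokens in `text` that name an existing image file."""
    found = []
    for token in text.split():
        # Drop surrounding quotes, which users often add around paths.
        candidate = token.strip("'\"")
        path = Path(candidate).expanduser()
        # Keep the token only if it has an image extension and exists on disk.
        if path.suffix.lower() in IMAGE_EXTENSIONS and path.is_file():
            found.append(str(path))
    return found
```

Calling `find_image_paths("describe ./screenshot.png")` returns the path only when that file exists, so ordinary words that happen to end in `.png` are ignored.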

interpreter/terminal_interface/profiles/defaults/os.py

Lines changed: 2 additions & 2 deletions
@@ -6,11 +6,11 @@
 interpreter.llm.supports_vision = True
 # interpreter.shrink_images = True # Faster but less accurate
 
-interpreter.llm.model = "gpt-4-vision-preview"
+interpreter.llm.model = "gpt-4o"
 
 interpreter.computer.import_computer_api = True
 
-interpreter.llm.supports_functions = False
+interpreter.llm.supports_functions = True
 interpreter.llm.context_window = 110000
 interpreter.llm.max_tokens = 4096
 interpreter.auto_run = True

interpreter/terminal_interface/profiles/defaults/vision.yaml

Lines changed: 2 additions & 2 deletions
@@ -3,10 +3,10 @@
 loop: True
 
 llm:
-model: "gpt-4-vision-preview"
+model: "gpt-4o"
 temperature: 0
 supports_vision: True
-supports_functions: False
+supports_functions: True
 context_window: 110000
 max_tokens: 4096
 custom_instructions: >

tests/test_interpreter.py

Lines changed: 2 additions & 2 deletions
@@ -662,9 +662,9 @@ def test_vision():
 ]
 
 interpreter.llm.supports_vision = True
-interpreter.llm.model = "gpt-4-vision-preview"
+interpreter.llm.model = "gpt-4o"
 interpreter.system_message += "\nThe user will show you an image of the code you write. You can view images directly.\n\nFor HTML: This will be run STATELESSLY. You may NEVER write '<!-- previous code here... --!>' or `<!-- header will go here -->` or anything like that. It is CRITICAL TO NEVER WRITE PLACEHOLDERS. Placeholders will BREAK it. You must write the FULL HTML CODE EVERY TIME. Therefore you cannot write HTML piecemeal—write all the HTML, CSS, and possibly Javascript **in one step, in one code block**. The user will help you review it visually.\nIf the user submits a filepath, you will also see the image. The filepath and user image will both be in the user's message.\n\nIf you use `plt.show()`, the resulting image will be sent to you. However, if you use `PIL.Image.show()`, the resulting image will NOT be sent to you."
-interpreter.llm.supports_functions = False
+interpreter.llm.supports_functions = True
 interpreter.llm.context_window = 110000
 interpreter.llm.max_tokens = 4096
 interpreter.loop = True
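
Reviewer note: this commit repeats the same model swap across `.mdx`, `.py`, and `.yaml` files, so a leftover reference to the old name is easy to miss. The sketch below shows one way to scan a checkout for stale mentions. The helper `find_stale_references` is hypothetical and not part of the repository; the suffix list is an assumption based on the file types touched here.

```python
from pathlib import Path

OLD_MODEL = "gpt-4-vision-preview"  # the model name this commit replaces


def find_stale_references(root, needle=OLD_MODEL, suffixes=(".py", ".yaml", ".mdx")):
    """Return (path, line_number) pairs for every line under `root` containing `needle`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        # Only inspect regular files with one of the expected extensions.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((str(path), number))
    return hits
```

Running `find_stale_references(".")` from the repository root returns an empty list once every reference has been updated.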
