Skip to content

Commit 4539e0c

Browse files
authored
Merge pull request #19 from AugmentedCamel/patch-1
typo :)
2 parents 5f2a023 + ff753a3 commit 4539e0c

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -48,7 +48,7 @@ QuestCameraKit is a collection of template and reference projects demonstrating
4848
## 5. 🧠 OpenAI vision model
4949

5050
- **Purpose:** Ask OpenAI's vision model (or any other multi-modal LLM) for context of your current scene.
51-
- **Description:** We use a the OpenAI Speech to text API to create a coommand. We then send this command together with a screenshot to the Vision model. Lastly, we get the response back and use the Text to speech API to turn the response text into an audio file in Unity to speak the response. The user can select different speakers, models, and speed. For the command we can add additional instructions for the model, as well as select an image, image & text, or just a text mode. The whole loop takes anywhere from `2-6 seconds`, depending on the internet connection.
51+
- **Description:** We use a the OpenAI Speech to text API to create a command. We then send this command together with a screenshot to the Vision model. Lastly, we get the response back and use the Text to speech API to turn the response text into an audio file in Unity to speak the response. The user can select different speakers, models, and speed. For the command we can add additional instructions for the model, as well as select an image, image & text, or just a text mode. The whole loop takes anywhere from `2-6 seconds`, depending on the internet connection.
5252

5353
https://github.com/user-attachments/assets/a4cfbfc2-0306-40dc-a9a3-cdccffa7afea
5454

0 commit comments

Comments
 (0)