- **Compatibility**: Designed for various multimodal models.
- **Integration**: Currently integrated with **GPT-4V** as the default model.
- **Future Plans**: Support for additional models.
- **Accessibility**: Voice control, thanks to [Whisper](https://github.com/mallorbc/whisper_mic) & [younesbram](https://github.com/younesbram).

## Current Challenges

> **Note:** GPT-4V's error rate in estimating XY mouse click locations is currently quite high. This framework aims to track the progress of multimodal models over time, aspiring to achieve human-level performance in computer operation.
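
To make the click-location challenge concrete, here is a minimal sketch of how a model's predicted screen position might be scored against a known target. The helper names and the normalized-coordinate convention are illustrative assumptions for this sketch, not the framework's actual API:

```python
def to_pixels(pred, screen_w, screen_h):
    """Convert a normalized (0-1) XY prediction to integer pixel coordinates."""
    x_frac, y_frac = pred
    return round(x_frac * screen_w), round(y_frac * screen_h)

def click_error(pred, target, screen_w, screen_h):
    """Euclidean distance in pixels between the predicted and true click points."""
    px, py = to_pixels(pred, screen_w, screen_h)
    tx, ty = target
    return ((px - tx) ** 2 + (py - ty) ** 2) ** 0.5

# Hypothetical example: the model predicts (50%, 40%) on a 1920x1080 screen,
# while the true button center is at pixel (980, 410).
print(to_pixels((0.5, 0.4), 1920, 1080))                        # (960, 432)
print(round(click_error((0.5, 0.4), (980, 410), 1920, 1080)))   # 30
```

Tracking a metric like this over time is one way to measure whether a multimodal model is approaching human-level accuracy in computer operation.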

## Ongoing Development

At [HyperwriteAI](https://www.hyperwriteai.com/), we are developing Agent-1-Vision, a multimodal model with more accurate click-location predictions.

## Agent-1-Vision Model API Access

We will soon be offering API access to our Agent-1-Vision model.
If you're interested in gaining access to this API, sign up [here](https://othersideai.typeform.com/to/FszaJ1k8?typeform-source=www.hyperwriteai.com).