Releases: CursorTouch/Web-Use
v0.2
v0.1
Key Features & Updates
- Dual Agent Modes: Supports both non-vision and vision-based agent operation, so the agent can run on either an LLM or a VLM.
- Scrollable vs. Interactive Elements: A clear separation improves DOM recognition and interaction.
- Scrolling Logic: Enables scrolling through distinct webpage sections, including nested containers.
- HTML → Markdown: Upgraded to `markdownify` in the `Scrape Tool` for better content conversion.
- Tab Management: Tracks the number of open tabs, active tab, and supports basic tab control.
- Extensible Tools: Add custom tools to the agent via the `additional_tools` parameter.
- Iframe & Shadow DOM Access: Enhanced ability to interact with embedded or encapsulated elements.
- Structured Output: Returns well-defined BaseModel outputs using the `structured_output` parameter.
- Human-in-the-Loop: Add manual checkpoints in the workflow via the `include_human_in_loop` parameter (thanks @tanmaysk001!)
- Inference Wrapper: Fixed the bug in the `open router` implementation (thanks @thecoderwithHat)
- Navigation Fixes: Improved handling of edge-case navigations across complex sites.
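
The release notes name the `additional_tools` and `include_human_in_loop` parameters but do not show how they are used. The sketch below illustrates the general pattern only, using the standard library. `ToyAgent`, `word_count`, `ask_on_console`, and the `approve` callback are hypothetical stand-ins invented for this example; they are not Web-Use's classes or signatures, so check the repository for the real API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Tool = Callable[[str], str]


def word_count(text: str) -> str:
    """Example custom tool: count whitespace-separated words."""
    return str(len(text.split()))


def ask_on_console(prompt: str) -> bool:
    """Default checkpoint: ask a human on the terminal."""
    return input(prompt + " [y/N] ").strip().lower() == "y"


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent; NOT the Web-Use class."""

    # Parameter names mirror the release notes; the behaviour is assumed.
    additional_tools: List[Tool] = field(default_factory=list)
    include_human_in_loop: bool = False
    # Assumption: approval is injectable so the checkpoint can be scripted.
    approve: Callable[[str], bool] = ask_on_console

    def registry(self) -> Dict[str, Tool]:
        """Map each custom tool's function name to the callable."""
        return {tool.__name__: tool for tool in self.additional_tools}

    def call(self, tool_name: str, argument: str) -> str:
        """Run a registered tool, pausing for approval when enabled."""
        tools = self.registry()
        if tool_name not in tools:
            raise KeyError(f"unknown tool: {tool_name}")
        if self.include_human_in_loop:
            if not self.approve(f"Run {tool_name}({argument!r})?"):
                return "skipped by human"
        return tools[tool_name](argument)


# Usage: register one custom tool and script the human checkpoint.
agent = ToyAgent(
    additional_tools=[word_count],
    include_human_in_loop=True,
    approve=lambda prompt: True,  # stand-in for a real reviewer
)
result = agent.call("word_count", "a b c")
```

Keeping the approval step behind a callback means the same checkpoint can be driven by a console prompt in interactive runs and by a scripted function in automated ones.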