Releases: AmrDab/clawdcursor
v0.7.5 — Provider-Agnostic Computer Use, App Guides, Adaptive Learning
What's New in v0.7.5
🧠 Provider-Agnostic Refactor
Eliminated all hardcoded model/provider checks. Uses declarative capability flags (reasoningVisionModel,
computerUse, openaiCompat). Works with any provider out of the box.
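A minimal sketch of how declarative capability flags could replace hardcoded provider checks. Only the three flag names come from the notes above. The meaning attached to each flag, the `ProviderProfile` shape, the `chooseStrategy` helper, and the strategy names are assumptions for illustration, not clawdcursor's actual types.

```typescript
// Flag names are from the release notes; their exact semantics are assumed here.
interface ProviderCapabilities {
  reasoningVisionModel: boolean; // assumed: the model can reason over screenshots
  computerUse: boolean;          // assumed: the provider offers a native computer-use API
  openaiCompat: boolean;         // assumed: the provider speaks an OpenAI-compatible API
}

// Hypothetical profile: a provider is described by data, not by name checks.
interface ProviderProfile {
  name: string;
  capabilities: ProviderCapabilities;
}

type Strategy = "native-computer-use" | "openai-compatible-vision" | "text-only";

// Branch on what the provider can do, never on who the provider is.
function chooseStrategy(profile: ProviderProfile): Strategy {
  const c = profile.capabilities;
  if (c.computerUse) return "native-computer-use";
  if (c.openaiCompat && c.reasoningVisionModel) return "openai-compatible-vision";
  return "text-only";
}

// Usage: a new provider only needs a profile, no code changes.
const exampleProfile: ProviderProfile = {
  name: "some-provider", // placeholder name
  capabilities: { reasoningVisionModel: true, computerUse: false, openaiCompat: true },
};
const exampleStrategy = chooseStrategy(exampleProfile);
```

The point of the design is that adding a provider becomes a data change: the selection logic never mentions a provider by name.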
📚 App Guide System
Community-contributed JSON instruction manuals for 86+ apps. Install with clawdcursor guides install excel.
Teaches the AI keyboard shortcuts, workflows, and UI layout. Loaded automatically at runtime.
🔁 Adaptive Learning
Successful tasks save their action sequences to the app guide JSON. The next time the same app is used, Stage 2 (the
text-LLM stage) reads the learned workflow and executes it natively, with no vision model needed.
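The save-then-reuse loop described above might look roughly like this. The guide schema is not given in these notes, so the `AppGuide` and `Action` shapes, the `learnedWorkflows` field, the helper names, and the `key_press` tool name are all hypothetical. The sketch works on in-memory objects and leaves file I/O out.

```typescript
// Hypothetical action record: one tool call plus its arguments.
interface Action {
  tool: string;
  args: Record<string, unknown>;
}

// Hypothetical guide shape; the real JSON schema may differ.
interface AppGuide {
  app: string;
  shortcuts: Record<string, string>;
  learnedWorkflows: Record<string, Action[]>;
}

// Normalize task text so small wording differences map to the same key.
function normalizeTask(task: string): string {
  return task.trim().toLowerCase().replace(/\s+/g, " ");
}

// After a successful run, return a new guide with the action sequence stored.
function recordSuccess(guide: AppGuide, task: string, actions: Action[]): AppGuide {
  return {
    ...guide,
    learnedWorkflows: { ...guide.learnedWorkflows, [normalizeTask(task)]: actions },
  };
}

// On a later run, look up a stored workflow before asking any model.
function findLearnedWorkflow(guide: AppGuide, task: string): Action[] | undefined {
  return guide.learnedWorkflows[normalizeTask(task)];
}

// Usage with placeholder data.
const emptyGuide: AppGuide = { app: "excel", shortcuts: {}, learnedWorkflows: {} };
const learnedGuide = recordSuccess(emptyGuide, "Bold the header row", [
  { tool: "key_press", args: { keys: "ctrl+b" } }, // hypothetical tool and args
]);
const replay = findLearnedWorkflow(learnedGuide, "bold the  header row");
```

Returning a new guide object instead of mutating the old one keeps the lookup easy to test. A real implementation would also need to write the result back to the guide file.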
⚡ 3-Stage Pipeline
Stage 1 (deterministic, free) → Stage 2 (text LLM, cheap) → Stage 3 (vision LLM, expensive). Most tasks complete at
Stage 1–2.
🤝 returnPartial Mode
External agents (OpenClaw, Claude Code) send {"returnPartial": true}. If Stage 2 fails, control returns to the
calling agent instead of burning tokens on Stage 3.
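The escalation order and the returnPartial short-circuit can be sketched as follows. Only the three stages and the `returnPartial` field come from the notes above. The type names, the stage function signature, and the outcome shape are assumptions, and the stages are synchronous here for brevity, whereas real LLM calls would be asynchronous.

```typescript
type StageName = "stage1-deterministic" | "stage2-text-llm" | "stage3-vision-llm";

interface StageResult {
  done: boolean;
  detail: string;
}

// Hypothetical stage signature; real stages would be async.
type Stage = (task: string) => StageResult;

interface TaskRequest {
  task: string;
  returnPartial?: boolean; // field name from the release notes
}

interface PipelineOutcome {
  status: "completed" | "partial" | "failed";
  stoppedAt: StageName;
  detail: string;
}

// Try the cheapest stage first; escalate only when a stage does not finish the task.
function runPipeline(req: TaskRequest, stages: Record<StageName, Stage>): PipelineOutcome {
  const order: StageName[] = ["stage1-deterministic", "stage2-text-llm", "stage3-vision-llm"];
  let last: StageResult = { done: false, detail: "no stage ran" };
  for (const name of order) {
    // With returnPartial set, hand control back instead of entering the vision stage.
    if (name === "stage3-vision-llm" && req.returnPartial) {
      return { status: "partial", stoppedAt: "stage2-text-llm", detail: last.detail };
    }
    last = stages[name](req.task);
    if (last.done) return { status: "completed", stoppedAt: name, detail: last.detail };
  }
  return { status: "failed", stoppedAt: "stage3-vision-llm", detail: last.detail };
}
```

In this sketch a caller that sets `returnPartial` never pays for Stage 3: it receives a `partial` outcome carrying the detail from Stage 2 and can decide what to do next.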
🔑 Per-Layer API Keys
Mixed-provider pipelines (e.g. Kimi text + Anthropic vision) use separate API keys per layer.
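One way per-layer keys could be resolved is shown below. The config shape, the field names, and the fallback to a shared default key are assumptions, since the real configuration format is not shown in these notes.

```typescript
type Layer = "text" | "vision";

// Hypothetical per-layer settings; apiKey is optional so a shared key can apply.
interface LayerConfig {
  provider: string;
  apiKey?: string;
}

interface PipelineConfig {
  defaultApiKey?: string; // assumed fallback, not confirmed by the release notes
  layers: Record<Layer, LayerConfig>;
}

// Prefer the layer's own key, then the shared default; fail loudly if neither exists.
function resolveApiKey(cfg: PipelineConfig, layer: Layer): string {
  const key = cfg.layers[layer].apiKey ?? cfg.defaultApiKey;
  if (!key) throw new Error(`No API key configured for the ${layer} layer`);
  return key;
}

// Usage: a mixed pipeline with placeholder keys.
const mixedConfig: PipelineConfig = {
  layers: {
    text: { provider: "kimi", apiKey: "text-layer-key" },
    vision: { provider: "anthropic", apiKey: "vision-layer-key" },
  },
};
const textKey = resolveApiKey(mixedConfig, "text");
const visionKey = resolveApiKey(mixedConfig, "vision");
```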
🌐 New Provider: Google Gemini 2.5 Flash
Auto-detected from GEMINI_API_KEY or GOOGLE_API_KEY. Budget-friendly at ~$0.15/1M tokens.
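A sketch of environment-based detection plus a rough cost estimate at the quoted rate. The two variable names and the ~$0.15 per 1M tokens figure come from the line above. The precedence between the variables, the function names, and the return shape are assumptions. The environment is passed in as a plain object so the sketch does not depend on a runtime. In Node.js one would pass `process.env`.

```typescript
type Env = Record<string, string | undefined>;

interface DetectedProvider {
  provider: "gemini";
  apiKey: string;
  source: string; // which variable supplied the key
}

// Assumption: GEMINI_API_KEY is checked first; the release notes do not state an order.
function detectGemini(env: Env): DetectedProvider | undefined {
  for (const name of ["GEMINI_API_KEY", "GOOGLE_API_KEY"]) {
    const value = env[name]?.trim();
    if (value) return { provider: "gemini", apiKey: value, source: name };
  }
  return undefined;
}

// Rough estimate at a flat per-million-token rate (approximate figure from the notes).
function estimateCostUsd(tokens: number, usdPerMillionTokens = 0.15): number {
  return (tokens / 1_000_000) * usdPerMillionTokens;
}

// Usage with a placeholder key.
const detected = detectGemini({ GOOGLE_API_KEY: "placeholder-key" });
const roughCost = estimateCostUsd(2_000_000); // about 0.30 USD at the quoted rate
```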
🛠 Other
- 42 tools (added minimize_window, smart_read)
- 172 tests pass
- CDP auto-init in serve mode
- smart_click 10s timeout
- focus_window phantom cleanup
- Spatial layout analysis for text-only LLMs
- Memory leak fixes, Linux GPU detection, PID file locking
Full docs: https://clawdcursor.com
Quick start: git clone https://github.com/AmrDab/clawdcursor && cd clawdcursor && npm install && npm run setup