Releases: AmrDab/clawdcursor

v0.7.5 — Provider-Agnostic Computer Use, App Guides, Adaptive Learning

06 Apr 04:04

What's New in v0.7.5

🧠 Provider-Agnostic Refactor

Eliminated all hardcoded model/provider checks. Uses declarative capability flags (reasoningVisionModel,
computerUse, openaiCompat). Works with any provider out of the box.
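
As a rough illustration of capability-driven dispatch, the sketch below selects a strategy from the flags alone and never inspects a provider or model name. Only the three flag names come from these notes. Treating them as booleans, the `Strategy` labels, and the `pickStrategy` helper are all assumptions made for the example, not the project's actual types.

```typescript
// Hypothetical shape: only the three flag names appear in the release notes.
// Modelling them as booleans is an assumption.
interface ProviderCapabilities {
  reasoningVisionModel: boolean;
  computerUse: boolean;
  openaiCompat: boolean;
}

// Illustrative strategy labels, not names used by the project.
type Strategy = "native-computer-use" | "vision-screenshots" | "text-only";

// Choose a strategy from declared capabilities only; no provider-name checks.
function pickStrategy(caps: ProviderCapabilities): Strategy {
  if (caps.computerUse) return "native-computer-use";
  if (caps.reasoningVisionModel) return "vision-screenshots";
  return "text-only";
}
```

With this shape, supporting a new provider means declaring its flags rather than adding another branch keyed on its name.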

📚 App Guide System

Community-contributed JSON instruction manuals for 86+ apps. Install one with: clawdcursor guides install excel
Each guide teaches the AI an app's keyboard shortcuts, workflows, and UI layout, and is loaded automatically at runtime.

🔁 Adaptive Learning

Successful tasks save their action sequences to the app's guide JSON. The next time the same app is used, Stage 2
(the text LLM stage) reads the learned workflow and executes it natively, with no vision needed. Each successful task adds to what the guide knows.
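
The record-and-recall idea can be sketched as follows. The guide schema is not shown in these notes, so the `AppGuide` and `Action` shapes, the `learned` field, the task-normalization rule, and the `key_press` tool name in the usage note are all hypothetical.

```typescript
// Hypothetical shapes: the real guide JSON schema is not given in the notes.
interface Action {
  tool: string;
  args: Record<string, string>;
}

interface AppGuide {
  app: string;
  learned: Record<string, Action[]>; // task description -> action sequence
}

// Normalize a task description so near-identical phrasings share one key.
function normalize(task: string): string {
  return task.trim().toLowerCase().replace(/\s+/g, " ");
}

// Return a new guide that also contains the successful action sequence.
function recordSuccess(guide: AppGuide, task: string, actions: Action[]): AppGuide {
  return { ...guide, learned: { ...guide.learned, [normalize(task)]: actions } };
}

// Look up a previously learned sequence; undefined means nothing was learned.
function recall(guide: AppGuide, task: string): Action[] | undefined {
  return guide.learned[normalize(task)];
}
```

For example, after `recordSuccess(guide, "Save  File", [{ tool: "key_press", args: { keys: "ctrl+s" } }])`, a later `recall(updated, "save file")` returns the stored sequence, which a text-only stage could replay without taking a screenshot.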

⚡ 3-Stage Pipeline

Stage 1 (deterministic, free) → Stage 2 (text LLM, cheap) → Stage 3 (vision LLM, expensive). Most tasks complete at
Stage 1–2.

🤝 returnPartial Mode

External agents (OpenClaw, Claude Code) send {"returnPartial": true}. If Stage 2 fails, control returns to the
calling agent instead of burning tokens on Stage 3.
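
The escalation order and the returnPartial short-circuit can be sketched as below. This is a minimal model under stated assumptions: the stages are synchronous stubs (real stages would presumably be asynchronous), and the `StageResult`, `PipelineOutcome`, and `runPipeline` names are hypothetical. Only the stage order and the `returnPartial` flag come from these notes.

```typescript
// Hypothetical types for illustration only.
type StageResult = { done: boolean; detail: string };
type Stage = (task: string) => StageResult;

interface PipelineOptions {
  returnPartial?: boolean; // flag name taken from the release notes
}

interface PipelineOutcome {
  status: "completed" | "partial" | "failed";
  stage: 1 | 2 | 3;
  detail: string;
}

// Try the cheapest stage first and escalate only on failure.
function runPipeline(
  task: string,
  stages: [Stage, Stage, Stage],
  opts: PipelineOptions = {},
): PipelineOutcome {
  const [deterministic, textLlm, visionLlm] = stages;

  const first = deterministic(task);
  if (first.done) return { status: "completed", stage: 1, detail: first.detail };

  const second = textLlm(task);
  if (second.done) return { status: "completed", stage: 2, detail: second.detail };

  // With returnPartial set, hand control back instead of invoking the vision stage.
  if (opts.returnPartial) return { status: "partial", stage: 2, detail: second.detail };

  const third = visionLlm(task);
  return { status: third.done ? "completed" : "failed", stage: 3, detail: third.detail };
}
```

The point of the early return is cost control: a calling agent that has its own vision model can decide what to do next without this pipeline spending Stage 3 tokens.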

🔑 Per-Layer API Keys

Mixed-provider pipelines (e.g. Kimi text + Anthropic vision) use separate API keys per layer.

🌐 New Provider: Google Gemini 2.5 Flash

Auto-detected from GEMINI_API_KEY or GOOGLE_API_KEY. Budget-friendly at ~$0.15/1M tokens.
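
A minimal sketch of environment-based detection follows. The two variable names come from these notes. The precedence (`GEMINI_API_KEY` checked first), the whitespace trimming, and the `detectGeminiKey` helper are assumptions for illustration.

```typescript
type Env = Record<string, string | undefined>;

// Return the first non-empty key found, or undefined if neither is set.
// Checking GEMINI_API_KEY before GOOGLE_API_KEY is an assumed precedence.
function detectGeminiKey(env: Env): string | undefined {
  for (const name of ["GEMINI_API_KEY", "GOOGLE_API_KEY"]) {
    const value = env[name]?.trim();
    if (value) return value;
  }
  return undefined;
}
```

In Node.js this would typically be called as `detectGeminiKey(process.env)`. Taking the environment as a parameter keeps the function easy to test.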

🛠 Other

  • 42 tools (added minimize_window, smart_read)
  • 172 tests pass
  • CDP auto-init in serve mode
  • smart_click 10s timeout
  • focus_window phantom cleanup
  • Spatial layout analysis for text-only LLMs
  • Memory leak fixes, Linux GPU detection, PID file locking

Full docs: https://clawdcursor.com
Quick start: git clone https://github.com/AmrDab/clawdcursor && cd clawdcursor && npm install && npm run setup