Claude computer use, by Claude Code#10

Closed
mbijon wants to merge 4 commits into main from dat-cc

@mbijon (Owner) commented Nov 4, 2025

Note

Introduces a Claude 4.5 computer-use agent to autonomously play Heatseeker, with full docs, tests, packaging, and Docker support; removes obsolete claude.md.

  • Agent Implementation
    • Add ComputerUseAgent in models/claude/src/agent.py using Anthropic Claude 4.5 + Playwright to screenshot, analyze, and act (screenshot, click, type, key, scroll) on the Heatseeker game.
    • Configurable constants (MODEL, COMPUTER_USE_BETA, DISPLAY_WIDTH/HEIGHT, max iterations) and CLI entry via python -m src.agent.
  • Documentation
    • Add detailed guide docs/computer_use/claude-4_5-api.md and models/claude/README.md with setup, config, API reference, troubleshooting, and Docker usage.
  • Packaging & Tooling
    • Add models/claude/pyproject.toml, pytest.ini, and Dockerfile for local/dev installs and containerized runs.
  • Testing
    • Add unit tests in models/claude/tests/ covering initialization, tool actions, loop flow, and config.
  • Cleanup
    • Remove contents of legacy claude.md.
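The action set listed above (screenshot, click, type, key, scroll) could map onto Playwright roughly as follows. This is a minimal sketch, not the PR's actual code: the function name `apply_action` and the input field names `coordinate`, `text`, and `delta_y` are assumptions made for illustration, and the real tool input schema may differ.

```python
from typing import Any, Dict


def apply_action(page: Any, tool_input: Dict[str, Any]) -> str:
    """Apply one action to a Playwright-style page and describe what was done.

    Field names ("coordinate", "text", "delta_y") are hypothetical.
    """
    action = tool_input.get("action")
    if action == "screenshot":
        # Nothing to apply; the caller captures the image itself.
        return "screenshot requested"
    if action == "click":
        x, y = tool_input["coordinate"]
        page.mouse.click(x, y)
        return f"clicked at ({x}, {y})"
    if action == "type":
        page.keyboard.type(tool_input["text"])
        return "typed text"
    if action == "key":
        page.keyboard.press(tool_input["text"])
        return f"pressed {tool_input['text']}"
    if action == "scroll":
        delta_y = tool_input.get("delta_y", 0)
        page.mouse.wheel(0, delta_y)  # vertical scroll only in this sketch
        return f"scrolled by {delta_y}"
    raise ValueError(f"unsupported action: {action!r}")
```

Taking the page as a parameter, rather than creating it inside the function, keeps the dispatcher independent of how the browser is launched and makes it testable with a stand-in object.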

Written by Cursor Bugbot for commit a64a985. This will update automatically on new commits.

vercel bot commented Nov 4, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| heatseeker | Ready | Preview | Comment | Nov 4, 2025 2:50am |

cursor[bot]

This comment was marked as outdated.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +123 to +128
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page(
        viewport={"width": DISPLAY_WIDTH, "height": DISPLAY_HEIGHT}
    )
    page.goto(url, wait_until="networkidle")

P1: Maintain browser session across tool iterations

Every call to process_tool_use launches a new Playwright browser and page for the requested action (lines 123‑128) and then disposes them when the function returns. This means each click, key press, or scroll happens on a fresh instance of the game and is immediately discarded, so subsequent screenshots never reflect the effects of earlier tool actions. The agent therefore cannot make observable progress in the Heatseeker game. The tool handler should reuse a persistent browser/page across iterations and only capture new screenshots after actions have been applied.
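One way to reuse a persistent browser and page across iterations is a small session object that opens the page once and hands back the same instance on every access. The sketch below is an assumption-laden illustration, not the PR's code: the class name `BrowserSession`, the `open_page` hook, and the default viewport size are hypothetical. The Playwright import is deferred so the class can be exercised without a browser installed.

```python
from typing import Any, Callable, Optional


class BrowserSession:
    """Keeps one page open for the lifetime of the agent loop."""

    def __init__(
        self,
        url: str,
        width: int = 1024,   # hypothetical default, stands in for DISPLAY_WIDTH
        height: int = 768,   # hypothetical default, stands in for DISPLAY_HEIGHT
        open_page: Optional[Callable[[str], Any]] = None,
    ) -> None:
        self._url = url
        self._width = width
        self._height = height
        self._open_page = open_page or self._open_with_playwright
        self._playwright = None
        self._browser = None
        self._page = None

    def _open_with_playwright(self, url: str) -> Any:
        # Imported lazily so the module loads even without Playwright installed.
        from playwright.sync_api import sync_playwright

        self._playwright = sync_playwright().start()
        self._browser = self._playwright.chromium.launch()
        page = self._browser.new_page(
            viewport={"width": self._width, "height": self._height}
        )
        page.goto(url, wait_until="networkidle")
        return page

    @property
    def page(self) -> Any:
        # Open on first use, then return the same page for every later action.
        if self._page is None:
            self._page = self._open_page(self._url)
        return self._page

    def close(self) -> None:
        # Shut down once, after the whole agent loop has finished.
        if self._browser is not None:
            self._browser.close()
        if self._playwright is not None:
            self._playwright.stop()
        self._playwright = self._browser = self._page = None
```

With this shape, the tool handler would receive `session.page` on each iteration and `session.close()` would run once when the loop ends, so clicks and key presses accumulate on the same game instance.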

Comment on lines +135 to +136
if action == "screenshot":
    result = "Screenshot taken"

P1: Screenshot tool result omits actual image data

When the model requests a screenshot action, the handler returns the literal string "Screenshot taken" (lines 135‑136) and this string is forwarded as the tool result. Claude’s computer‑use API expects the tool result for a screenshot to contain the captured image so the model can view the updated screen. Without providing that image payload, the model cannot see the game state after actions and the conversation will stall. The handler should capture the screenshot and return it as an image block in the tool result.
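A tool result carrying the image could be built along these lines. This is a sketch under assumptions: the helper name `screenshot_tool_result` is hypothetical, and `png_bytes` is assumed to come from Playwright's `page.screenshot()`, which returns PNG bytes by default. The block layout follows the base64 image content block form used in Anthropic Messages API tool results.

```python
import base64
from typing import Any, Dict


def screenshot_tool_result(tool_use_id: str, png_bytes: bytes) -> Dict[str, Any]:
    """Wrap raw PNG bytes in a tool_result block containing an image."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,  # must match the id of the tool_use block
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": encoded,
                },
            }
        ],
    }
```

In the handler, the `"screenshot"` branch would then call something like `screenshot_tool_result(tool_use.id, page.screenshot())` instead of returning a fixed string, where `tool_use` stands for whatever object the loop uses to hold the model's tool request.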

Comment on lines +171 to +172
browser.close()
page.wait_for_timeout(1000)

P1: Call to Playwright API after browser shutdown

process_tool_use closes the browser and then calls page.wait_for_timeout(1000) on the same page object (lines 171‑172). After browser.close() the page’s channel is destroyed, and Playwright raises a Target closed error when any method is invoked, so every tool invocation will fail. The delay should run before the browser is closed, or be implemented with time.sleep outside Playwright.
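The corrected ordering might look like the following sketch, which waits first and closes second. The helper name `settle_then_close` and the `settle_ms` parameter are hypothetical; `page` and `browser` are assumed to be the Playwright objects from the handler.

```python
from typing import Any


def settle_then_close(page: Any, browser: Any, settle_ms: int = 1000) -> None:
    """Let the page settle while it is still alive, then close the browser."""
    try:
        # The wait must run while the page's connection is still open.
        page.wait_for_timeout(settle_ms)
    finally:
        # Close even if the wait raised, so the browser process is not leaked.
        browser.close()
```

If the browser is kept open across iterations, as the first review comment suggests, only the wait is needed per action and the close moves to the end of the agent loop.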

@mbijon changed the title from "Dat cc" to "Claude computer use, by Claude Code" on Nov 4, 2025
@mbijon closed this on Nov 21, 2025
@mbijon deleted the dat-cc branch on Nov 21, 2025 17:53