
Add structured logging, action timeout, and LLM retry dedup #7

Merged
kut merged 9 commits into main from develop
Feb 23, 2026

@kut kut commented Feb 23, 2026

Summary

  • Add per-module structured logging with file (shots.log) + stderr output
  • Separate action_timeout_ms (clicks/typing, fail fast) from navigation timeout
  • Detect and re-query when LLM repeats a previously failed action
  • Feed failure history into LLM prompt to prevent loops
  • Tighten stability wait caps for faster runs
  • Update default model to gpt-5.2
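The per-module logging item could be wired up roughly as follows. This is a minimal sketch using only the standard library; the function name `setup_logging`, the logger name `shots`, the format string, and the handler levels are assumptions for illustration, not the code merged in this PR.

```python
import logging
import sys


def setup_logging(log_path: str = "shots.log",
                  stderr_level: int = logging.INFO) -> logging.Logger:
    """Attach a debug-level file handler and a quieter stderr handler
    to a package-level logger (hypothetical name: 'shots')."""
    root = logging.getLogger("shots")
    root.setLevel(logging.DEBUG)
    root.handlers.clear()  # avoid stacking handlers on repeated calls

    fmt = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setLevel(logging.DEBUG)      # everything goes to the file
    file_handler.setFormatter(fmt)

    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setLevel(stderr_level)     # only INFO and above on stderr
    stderr_handler.setFormatter(fmt)

    root.addHandler(file_handler)
    root.addHandler(stderr_handler)
    return root


# Each module would request its own child logger, so records carry the module name.
log = logging.getLogger("shots.runner")
```

Child loggers such as `shots.runner` propagate to the `shots` handlers, so a single setup call at startup is enough for every module.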

Test plan

  • Run `shots run` with `--use-llm` and verify that `shots.log` is created with debug output
  • Confirm the `--action-timeout-ms` flag works and element interactions fail fast
  • Trigger an LLM action failure and verify the retry dedup kicks in (check logs)
  • Verify stability waits are noticeably faster on server-rendered pages
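To illustrate the split between a short action timeout and a longer navigation timeout, here is a small sketch. The `Timeouts` class, its default values, the field `navigation_timeout_ms`, and the action names in `NAVIGATION_ACTIONS` are hypothetical; only `action_timeout_ms` appears in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timeouts:
    # Defaults are illustrative assumptions, not the values used in this PR.
    action_timeout_ms: int = 5_000        # clicks/typing: fail fast
    navigation_timeout_ms: int = 30_000   # page loads: allow more time


# Hypothetical set of action names that count as navigation.
NAVIGATION_ACTIONS = {"goto", "reload", "go_back"}


def timeout_for(action: str, timeouts: Timeouts) -> int:
    """Pick the timeout (in ms) that applies to a given action type."""
    if action in NAVIGATION_ACTIONS:
        return timeouts.navigation_timeout_ms
    return timeouts.action_timeout_ms
```

With Playwright, values like these could be passed as the `timeout=` argument of individual calls, or applied once through `page.set_default_timeout()` and `page.set_default_navigation_timeout()`.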

🤖 Generated with Claude Code

Kut Akdogan and others added 9 commits February 19, 2026 11:26
- Add hatchling build-system and tool.uv.package so the `shots` entry
  point installs correctly
- Move shots.yaml to shots.yaml.example and gitignore the real config
- Add .env.example for OPENAI_API_KEY
- Support non-interactive login via Playwright inspector (page.pause)
  when stdin is not a TTY
- Update README with setup instructions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add per-module logging with file + stderr output (shots.log)
- Separate action_timeout_ms (clicks/typing, fail fast) from nav timeout
- Detect and re-query when LLM repeats a previously failed action
- Feed failure history into LLM prompt to prevent loops
- Tighten stability wait caps for faster runs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
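The retry dedup and failure-history behavior described in this commit might look something like the sketch below. All names here (`action_key`, `next_action`, `query_llm`, `max_requeries`) and the history line format are assumptions made for illustration; the actual implementation may differ.

```python
import json
from typing import Callable, List, Optional, Tuple

Action = dict  # e.g. {"type": "click", "selector": "#submit"} (hypothetical shape)


def action_key(action: Action) -> str:
    """Canonical string for an action, so key order does not matter."""
    return json.dumps(action, sort_keys=True)


def next_action(query_llm: Callable[[List[str]], Action],
                failures: List[Tuple[Action, str]],
                max_requeries: int = 2) -> Optional[Action]:
    """Ask the LLM for an action, re-querying if it repeats a failed one.

    `failures` holds (action, error message) pairs from earlier steps and is
    fed to the LLM as history lines so it can avoid looping.
    """
    failed_keys = {action_key(a) for a, _ in failures}
    history = [f"FAILED {action_key(a)}: {err}" for a, err in failures]

    for _ in range(max_requeries + 1):
        action = query_llm(history)
        if action_key(action) not in failed_keys:
            return action
        # Tell the model explicitly that its suggestion was a repeat.
        history.append(
            f"REJECTED repeat of failed action {action_key(action)}; "
            "choose a different action"
        )
    return None  # give up after the re-query budget is spent
```

Returning `None` instead of raising leaves it to the caller to decide whether to abort the shot or fall back to another strategy.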
Stop downscaling screenshots to 1000px before sending to the vision
model — the squished text caused garbled OCR (wrong emails, wrong URLs).
Add accessibility tree + link href extraction so the LLM sees actual
clickable elements instead of guessing from pixels. Also add PAGE
ELEMENTS rule to system prompt and reduce history window to last 5 steps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass shot description to crop LLM so it knows what content to include.
Add 8% padding around crop coordinates to prevent clipping field labels
and content at edges.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
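The 8% padding can be sketched as simple box arithmetic. This example assumes the padding is measured relative to the crop box's own width and height and is clamped to the image bounds; the commit message does not say which reference dimension is used, and `pad_crop` is a hypothetical name.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def pad_crop(box: Box, image_size: Tuple[int, int], pad_frac: float = 0.08) -> Box:
    """Grow a crop box by pad_frac of its size on every side, clamped to the image."""
    left, top, right, bottom = box
    width, height = image_size
    pad_x = round((right - left) * pad_frac)
    pad_y = round((bottom - top) * pad_frac)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )
```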
Replace the 8% padding hack with a self-correcting crop validation loop:
after cropping, send the result back to the LLM to verify all content is
visible. On failure, re-crop with the rejection reason (max 2 retries).

Restructure output into per-shot folders: {out_dir}/{shot_id}/output.png
with sources/ and previous/ subdirectories. Add out_dir and per-shot
folder override to shots.yaml config.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
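A self-correcting crop loop of the kind described here could be structured as below. The callables `propose_crop` and `validate_crop` stand in for the LLM calls and are hypothetical, as is the function name; the only detail taken from the commit message is the limit of two retries with the rejection reason passed back.

```python
from typing import Callable, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def crop_with_validation(
    propose_crop: Callable[[Optional[str]], Box],
    validate_crop: Callable[[Box], Tuple[bool, str]],
    max_retries: int = 2,
) -> Tuple[Box, bool]:
    """Propose a crop, validate it, and re-crop with the rejection reason.

    Returns the last box and whether it passed validation.
    """
    box = propose_crop(None)  # first attempt has no rejection reason
    for attempt in range(max_retries + 1):
        ok, reason = validate_crop(box)
        if ok:
            return box, True
        if attempt < max_retries:
            box = propose_crop(reason)  # retry, informed by the rejection
    return box, False
```

Returning the last box together with a flag lets the caller keep a best-effort crop while still logging that validation never passed.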
Support grouping screenshots in config with per-group output mode
(png/pdf). Labels render as white banners below screenshots with
template variables ({url}, {id}, {title}) and URL desensitization.
Backward-compatible: flat shots lists auto-wrap into single-shot groups.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
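Label templating with URL desensitization might be sketched like this. The commit message does not define what desensitization removes, so this example assumes it means dropping credentials, the query string, and the fragment; `desensitize_url` and `render_label` are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit


def desensitize_url(url: str) -> str:
    """Strip userinfo, query string, and fragment (assumed meaning of 'desensitization')."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


def render_label(template: str, *, url: str, shot_id: str, title: str) -> str:
    """Fill the {url}, {id}, and {title} template variables of a label."""
    return template.format(url=desensitize_url(url), id=shot_id, title=title)
```

For example, `render_label("{title} ({id}) {url}", url="https://example.com/a?x=1", shot_id="home", title="Home")` yields a label with the query string removed.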
Add label_date boolean to group config for appending UTC timestamp.
Draw black separator line between screenshot and label banner.
Increase padding and line spacing in label for better readability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
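The `label_date` option could append a timestamp along these lines. The timestamp format and the helper name `with_label_date` are assumptions; the commit message only states that a UTC timestamp is appended when the boolean is set.

```python
from datetime import datetime, timezone
from typing import Optional


def with_label_date(label: str, label_date: bool,
                    now: Optional[datetime] = None) -> str:
    """Append a UTC timestamp to the label when label_date is true."""
    if not label_date:
        return label
    now = now or datetime.now(timezone.utc)
    # Format is an illustrative choice, not taken from the PR.
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return f"{label} {stamp}"
```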
@kut kut merged commit 8f2e4e1 into main Feb 23, 2026
0 of 10 checks passed