Commit 4a9535b

docs: expand web-ctl section to match ecosystem detail level
Add architecture flow, session lifecycle, full action table, click-wait explanation, error code table, security summary, and selector syntax - matching the depth of other plugin sections.
1 parent 79c3810 commit 4a9535b

1 file changed

+51
-14
lines changed

README.md

Lines changed: 51 additions & 14 deletions
@@ -692,27 +692,64 @@ agent-knowledge/
692 692

693 693
### /web-ctl
694 694

695-
**Purpose:** Browser automation for AI agents navigate, authenticate, and interact with web pages.
695+
**Purpose:** Browser automation for AI agents - navigate, authenticate, and interact with web pages.
696 696

697-
**What it does:**
697+
**How it works:**
698+
699+
Each invocation is a single Node.js process using Playwright. No daemon, no MCP server. Session state persists via Chrome's userDataDir with AES-256-GCM encrypted storage.
700+
701+
```
702+
Agent calls skill -> node scripts/web-ctl.js <args> -> Playwright API -> JSON result
703+
```
704+
705+
**Session lifecycle:**
706+
707+
1. `session start <name>` - Create session (encrypted profile directory)
708+
2. `session auth <name> --url <login-url>` - Opens headed Chrome for human login (2FA, CAPTCHAs). Polls for success URL/selector, encrypts cookies on completion
709+
3. `run <name> <action>` - Headless actions using persisted cookies
710+
4. `session end <name>` - Cleanup
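The lifecycle above can be sketched as a sequence of one-shot process invocations. This is a minimal illustration, not the plugin's own code: the helper `lifecycleCommands`, the session name `demo`, and the login URL are hypothetical; only the subcommand shapes and the `scripts/web-ctl.js` path come from the text.

```typescript
// Hypothetical helper: builds the argv arrays for one full session lifecycle.
// Only the subcommand shapes are taken from the README; names are illustrative.
type Argv = string[];

const SCRIPT = "scripts/web-ctl.js"; // path as shown in the architecture flow

function lifecycleCommands(
  name: string,
  loginUrl: string,
  actions: string[][],
): Argv[] {
  return [
    // 1. Create the session (encrypted profile directory).
    ["node", SCRIPT, "session", "start", name],
    // 2. Hand off to a human for login in a headed browser.
    ["node", SCRIPT, "session", "auth", name, "--url", loginUrl],
    // 3. Each action is its own short-lived process: run <name> <action...>.
    ...actions.map((action) => ["node", SCRIPT, "run", name, ...action]),
    // 4. Clean up.
    ["node", SCRIPT, "session", "end", name],
  ];
}

const commands = lifecycleCommands("demo", "https://example.com/login", [
  ["goto", "https://example.com"],
  ["snapshot"],
]);

for (const argv of commands) {
  console.log(argv.join(" "));
}
```

Because there is no daemon, nothing is shared between these processes except the persisted session profile, which is why every `run` call repeats the session name.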
711+
712+
**Actions:**
698 713

699-
1. **Session Management** - Persistent browser sessions with AES-256-GCM encrypted storage
700-
2. **Auth Handoff** - Opens a headed browser for the user to complete login; agent monitors for success
701-
3. **Headless Browsing** - Navigate, click, click-wait (SPA-friendly), type, fill, read, screenshot, capture network, evaluate JS
702-
4. **CAPTCHA/Checkpoint** - Escalates to headed browser for human verification
703-
5. **Actionable Errors** - Classified error codes with recovery suggestions instead of raw Playwright messages
704-
6. **Prompt Injection Defense** - All web content wrapped in `[PAGE_CONTENT: ...]` delimiters; agent treats it as untrusted data
714+
| Action | Description | Key flag |
715+
|--------|-------------|----------|
716+
| `goto <url>` | Navigate to URL | |
717+
| `snapshot` | Get accessibility tree (primary page inspection) | |
718+
| `click <sel>` | Click element | `--wait-stable` |
719+
| `click-wait <sel>` | Click and wait for DOM + network stability | `--timeout <ms>` |
720+
| `type <sel> <text>` | Type with human-like delays | |
721+
| `read <sel>` | Read element text content | |
722+
| `fill <sel> <value>` | Clear field and set value | |
723+
| `wait <sel>` | Wait for element to appear | `--timeout <ms>` |
724+
| `evaluate <js>` | Execute JS in page context | `--allow-evaluate` |
725+
| `screenshot` | Full-page screenshot | `--path <file>` |
726+
| `network` | Capture network requests | `--filter <pattern>` |
727+
| `checkpoint` | Open headed browser for user (CAPTCHAs) | `--timeout <sec>` |
728+
729+
`click-wait` waits for network idle and for 500 ms without DOM mutations before returning. This cuts SPA interactions from multiple agent turns to one.
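The stability condition described here can be modelled as a small state tracker. This is a sketch under assumptions, not the actual implementation: the class `StabilityTracker` and its method names are hypothetical, and timestamps are passed in explicitly so the logic stays independent of Playwright.

```typescript
// Sketch only: reports "stable" when no requests are in flight and the DOM
// has been quiet for at least quietMs (500 ms in the README's description).
class StabilityTracker {
  private inflight = 0;
  private lastMutationMs: number;
  private readonly quietMs: number;

  constructor(quietMs: number, startMs: number) {
    this.quietMs = quietMs;
    this.lastMutationMs = startMs;
  }

  requestStarted(): void {
    this.inflight += 1;
  }

  requestFinished(): void {
    // Guard against unbalanced events so the counter never goes negative.
    this.inflight = Math.max(0, this.inflight - 1);
  }

  domMutated(nowMs: number): void {
    this.lastMutationMs = nowMs;
  }

  isStable(nowMs: number): boolean {
    const networkIdle = this.inflight === 0;
    const domQuiet = nowMs - this.lastMutationMs >= this.quietMs;
    return networkIdle && domQuiet;
  }
}

const tracker = new StabilityTracker(500, 0);
tracker.domMutated(100);
console.log(tracker.isStable(400)); // false: only 300 ms since the last mutation
console.log(tracker.isStable(600)); // true: 500 ms quiet and no requests in flight
```

In a real page the events would come from request listeners and a `MutationObserver`; the caller would poll `isStable` until it returns true or the `--timeout` budget is exhausted.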
730+
731+
**Error handling:**
732+
733+
All errors return classified codes with actionable recovery suggestions:
734+
735+
| Code | Recovery suggestion |
736+
|------|-------------------|
737+
| `element_not_found` | Snapshot included in response for selector discovery |
738+
| `timeout` | Increase `--timeout` |
739+
| `browser_closed` | `session start <name>` |
740+
| `network_error` | Check URL; verify cookies with `session status` |
741+
| `no_display` | Use `--vnc` flag |
742+
| `session_expired` | Re-authenticate |
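A caller could turn the table above into a typed lookup. This is a minimal sketch: the codes and suggestions are copied from the table, while the names `RECOVERY`, `isErrorCode`, and `recoveryFor`, and the fallback message for unknown codes, are assumptions for illustration.

```typescript
// Error codes listed in the README table.
type ErrorCode =
  | "element_not_found"
  | "timeout"
  | "browser_closed"
  | "network_error"
  | "no_display"
  | "session_expired";

// Recovery suggestions, paraphrased from the same table.
const RECOVERY: Record<ErrorCode, string> = {
  element_not_found: "Use the snapshot included in the response to find a selector",
  timeout: "Increase --timeout",
  browser_closed: "Run session start <name>",
  network_error: "Check the URL; verify cookies with session status",
  no_display: "Use the --vnc flag",
  session_expired: "Re-authenticate",
};

function isErrorCode(value: string): value is ErrorCode {
  return Object.prototype.hasOwnProperty.call(RECOVERY, value);
}

// Hypothetical helper: falls back to a generic hint for unclassified codes.
function recoveryFor(code: string): string {
  return isErrorCode(code)
    ? RECOVERY[code]
    : `Unclassified error "${code}"; inspect the raw result`;
}

console.log(recoveryFor("timeout"));
console.log(recoveryFor("something_else"));
```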
743+
744+
**Security:** Output sanitization (cookies/tokens redacted), prompt injection defense (`[PAGE_CONTENT: ...]` delimiters), AES-256-GCM encryption at rest, anti-bot measures (`webdriver=false`, random delays), read-only agent (no Write/Edit tools).
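Two of these measures, token redaction and content delimiting, can be illustrated with plain string handling. This is a sketch under stated assumptions: the function names and the bearer-token pattern are hypothetical, and the real sanitizer may cover more secret formats and handle the delimiter differently.

```typescript
// Assumption: redact only "Bearer <token>" values; the real sanitizer is
// described as redacting cookies and tokens more generally.
function redactSecrets(text: string): string {
  return text.replace(/\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g, "Bearer [REDACTED]");
}

// Wrap untrusted page text in the delimiter named in the README so the agent
// can treat everything inside it as data rather than instructions.
function wrapPageContent(text: string): string {
  return `[PAGE_CONTENT: ${text}]`;
}

function sanitizeForAgent(text: string): string {
  return wrapPageContent(redactSecrets(text));
}

console.log(sanitizeForAgent("Authorization: Bearer abc.def-123"));
```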
745+
746+
**Selector syntax:** `role=button[name='Submit']`, `css=div.class`, `text=Click here`, `#id`
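The four selector forms can be told apart by their prefix. The classifier below is only an illustration of that syntax; `classifySelector` and the `"unknown"` fallback are hypothetical and do not reflect how the tool or Playwright resolves selectors internally.

```typescript
type SelectorKind = "role" | "css" | "text" | "id" | "unknown";

// Classify a selector string by the prefixes shown in the README examples.
function classifySelector(selector: string): SelectorKind {
  if (selector.startsWith("role=")) return "role";
  if (selector.startsWith("css=")) return "css";
  if (selector.startsWith("text=")) return "text";
  if (selector.startsWith("#")) return "id";
  return "unknown";
}

const examples = ["role=button[name='Submit']", "css=div.class", "text=Click here", "#id"];
for (const example of examples) {
  console.log(`${example} -> ${classifySelector(example)}`);
}
```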
705 747

706 748
**Usage:**
707 749

708 750
```bash
709-
# Navigate to a URL
710 751
/web-ctl goto https://example.com
711-
712-
# Authenticate to a site (opens browser for you to log in)
713 752
/web-ctl auth twitter --url https://x.com/i/flow/login
714-
715-
# Multi-step browsing
716 753
/web-ctl # describe what you want to do, agent orchestrates it
717 754
```
718 755

@@ -726,7 +763,7 @@ npx playwright install chromium
726 763

727 764
**Agent:** web-session (sonnet model)
728 765

729-
**Skills:** web-auth, web-browse
766+
**Skills:** web-auth (human-in-the-loop auth), web-browse (headless actions)
730 767

731 768
---
732 769
