Commit a01d2ab

skills.browser-cli: export via legacyPackages
1 parent 458043f commit a01d2ab

16 files changed: +4099 -0 lines changed

flake.nix

Lines changed: 1 addition & 0 deletions
@@ -171,6 +171,7 @@
 ./5pkgs/flake-module.nix
 ./keys/flake-module.nix
 ./wrapperModules/flake-module.nix
+./skills/flake-module.nix
 ]
 ++ (
 # Auto-import all flake-module.nix files from tools subdirectories

skills/browser-cli/README.md

Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
# Browser CLI

A command-line interface for controlling Firefox through the WebExtensions API.
Optimized for LLM agents with limited context windows.

## Overview

Browser CLI consists of three components:

1. **Firefox Extension** - Executes commands in the browser and provides visual
   feedback
2. **Native Messaging Bridge** - Facilitates communication between the CLI and
   extension
3. **CLI Client** - Minimal command-line tool that executes JavaScript via stdin

## Installation

### For Nix Users

```bash
nix run github:lassulus/superconfig#skills.browser-cli -- --help
```

### Manual Installation

1. **Install the Firefox Extension**
   - Open Firefox
   - Navigate to `about:debugging`
   - Click "This Firefox"
   - Click "Load Temporary Add-on"
   - Select `manifest.json` from the `extension` directory

2. **Install Native Messaging Host**
   ```bash
   browser-cli --install-host
   ```

## Usage

See [SKILL.md](SKILL.md) for usage examples and JavaScript API reference.

## Architecture

```
┌─────────────┐     Unix Socket     ┌──────────────┐     Native      ┌────────────┐
│     CLI     │ ◄─────────────────► │    Bridge    │ ◄─────────────► │ Extension  │
│   (stdin)   │                     │    Server    │    Messaging    │            │
└─────────────┘                     └──────────────┘                 └────────────┘
```
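
The Native Messaging leg of the diagram uses Firefox's standard wire format: each message is UTF-8 JSON preceded by a 4-byte length in native byte order. The following is a minimal Python sketch of that framing only. It does not reflect how `bridge.py` is actually written; the function names `encode_message` and `read_message` and the `{"command": "snap"}` payload are hypothetical.

```python
import io
import json
import struct
from typing import Optional


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-serializable payload for native messaging."""
    body = json.dumps(payload).encode("utf-8")
    # "=I" packs an unsigned 32-bit length in native byte order.
    return struct.pack("=I", len(body)) + body


def read_message(stream) -> Optional[dict]:
    """Read one framed message from a binary stream, or None at EOF."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    return json.loads(stream.read(length).decode("utf-8"))


# Round-trip a hypothetical payload through an in-memory stream.
frame = encode_message({"command": "snap"})
decoded = read_message(io.BytesIO(frame))
```

In a real host the stream would be the binary stdin and stdout of the process Firefox launches; the in-memory buffer here only keeps the sketch runnable on its own.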

## Development

### Project Structure

```
browser-cli/
├── extension/           # Firefox WebExtension
│   ├── manifest.json
│   ├── background.js    # Extension service worker
│   └── content.js       # Page automation and JS API
├── browser_cli/         # Python CLI package
│   ├── cli.py           # CLI entry point
│   ├── client.py        # Unix socket client
│   ├── bridge.py        # Native messaging bridge
│   └── server.py        # Bridge server
└── pyproject.toml
```

### Building

For Nix users:

```bash
nix build .#browser-cli
```

## Troubleshooting

### Extension Not Connecting

1. Ensure Firefox is running
2. Ensure the Browser CLI extension is installed
3. Ensure the native messaging host is installed: `browser-cli --install-host`
4. Check the Firefox console for errors: `Ctrl+Shift+J`

### Commands Timing Out

- Use `wait()` for dynamic content: `await wait("text", "Loaded")`
- Check that element refs are current: call `snap()` to refresh them

### Stale Refs

Refs are reset on each snapshot. If you get "Element [N] not found", call
`snap()` to get fresh refs.
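
A caller that scripts the CLI could detect this condition from the output text. The sketch below assumes the message appears verbatim as "Element [N] not found" with a decimal `N`; the helper name `stale_ref` is hypothetical. It deliberately does not retry automatically, because a ref number may point at a different element after the next snapshot.

```python
import re
from typing import Optional

# Assumed message shape, taken from the troubleshooting text above.
STALE_REF = re.compile(r"Element \[(\d+)\] not found")


def stale_ref(output: str) -> Optional[int]:
    """Return the stale ref number found in CLI output, or None."""
    match = STALE_REF.search(output)
    return int(match.group(1)) if match else None


# Hypothetical output line used only to exercise the helper.
ref = stale_ref("Error: Element [7] not found")
```

When `stale_ref` returns a number, the safe follow-up is to call `snap()` again and look the element up by role and name rather than reuse the old ref.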

## License

MIT

skills/browser-cli/SKILL.md

Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
---
name: browser-cli
description: Control Firefox browser from the command line. Use for web automation, scraping, testing, or any browser interaction tasks.
---

# Usage

```bash
# List managed tabs
browser-cli --list

# Open a page and get snapshot
browser-cli <<'EOF'
await tab("https://example.com")
snap()
EOF

# Execute in a specific tab
browser-cli abc123 <<'EOF'
snap()
EOF
```
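
When the CLI is driven from a program instead of a shell, the heredoc pattern above maps onto passing the JavaScript on stdin. The following is a rough Python sketch under that assumption; `build_invocation` and `run_script` are hypothetical helper names, and the 30-second timeout and the use of a non-zero exit code to signal failure are guesses rather than documented behavior.

```python
import subprocess
from typing import List, Optional, Tuple


def build_invocation(script: str, tab_id: Optional[str] = None) -> Tuple[List[str], str]:
    """Return the argv and stdin text equivalent to the heredoc examples."""
    argv = ["browser-cli"]
    if tab_id is not None:
        argv.append(tab_id)  # optional tab ID, as in `browser-cli abc123`
    return argv, script.strip() + "\n"


def run_script(script: str, tab_id: Optional[str] = None, timeout: float = 30.0) -> str:
    """Run the script through browser-cli and return its stdout."""
    argv, stdin_text = build_invocation(script, tab_id)
    result = subprocess.run(
        argv,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,  # assumed value, not taken from the project
        check=True,  # assumes failures produce a non-zero exit code
    )
    return result.stdout


# Build (but do not execute) the invocation for a specific tab.
argv, stdin_text = build_invocation('await tab("https://example.com")\nsnap()', tab_id="abc123")
```

Separating `build_invocation` from `run_script` keeps the command construction checkable without a running Firefox.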

# JavaScript API

All functions are available in the execution context. Actions return simple
confirmations; use `snap()` to get page state.

## Element Interaction (use refs from snap())

```bash
browser-cli <<'EOF'
await click(1)                       // Click element [1]
await click(1, {double: true})       // Double click
await type(2, "user@example.com")    // Type into element [2]
await type(2, "new", {clear: true})  // Clear first, then type
await hover(3)                       // Hover over element [3]
await drag(4, 5)                     // Drag from [4] to [5]
await select(6, "option-value")      // Select dropdown option
key("Enter")                         // Press key
key("Tab")
EOF

# Can still use CSS selectors when needed
browser-cli <<'EOF'
await click("#submit-button")
await click("Sign In", "text")
EOF
```

## Page Inspection

```bash
browser-cli <<'EOF'
snap()                  // Get page snapshot with refs
snap({forms: true})     // Only form elements
snap({links: true})     // Only links
snap({buttons: true})   // Only buttons
snap({text: "login"})   // Elements containing "login"
diff()                  // Show changes since last snap
logs()                  // Get console logs
EOF
```

## Waiting

```bash
browser-cli <<'EOF'
await wait(1000)               // Wait 1 second
await wait("idle")             // Wait for DOM to stabilize
await wait("text", "Success")  // Wait for text to appear
await wait("gone", "Loading")  // Wait for text to disappear
EOF
```

## Downloads

```bash
browser-cli TABID <<'EOF'
const url = document.querySelector('a[href*="pdf"]').href
await download(url, "invoice.pdf")  // Downloads to ~/Downloads/invoice.pdf
EOF
```

## Screenshots & Tabs

```bash
browser-cli <<'EOF'
await shot()                      // Screenshot, returns data URL
await shot("/tmp/page.png")       // Screenshot to file
await tab()                       // New tab
await tab("https://example.com")  // New tab with URL
await tabs()                      // List all tabs
EOF
```

# Snapshot Output Format

```
Page: Example Site
URL: https://example.com

[1] heading "Welcome"
[2] input[email] "Email" [required]
[3] input[password] "Password" [required]
[4] checkbox "Remember me"
[5] button "Sign In"
[6] link "Forgot password?"
```

- `[N]` - Reference number for use with click(), type(), etc.
- Role and accessible name shown
- Attributes in brackets: `[disabled]`, `[checked]`, `[required]`, etc.
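
For callers that post-process snapshots, the line format above is regular enough to parse. The following Python sketch is inferred only from the sample shown here, so the regular expression, the `Element` dataclass, and `parse_snapshot` are all hypothetical and may not cover every line the tool emits (for example, names containing escaped quotes).

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# [ref] role "accessible name" [attr] [attr] ...
LINE = re.compile(r'^\[(\d+)\]\s+(\S+)\s+"(.*?)"((?:\s+\[[^\]]+\])*)\s*$')


@dataclass
class Element:
    ref: int
    role: str
    name: str
    attrs: List[str] = field(default_factory=list)


def parse_snapshot(text: str) -> Dict[int, Element]:
    """Map ref numbers to elements; non-element lines are skipped."""
    elements: Dict[int, Element] = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # "Page:", "URL:" and blank lines land here
        ref, role, name, raw_attrs = match.groups()
        attrs = re.findall(r"\[([^\]]+)\]", raw_attrs)
        elements[int(ref)] = Element(int(ref), role, name, attrs)
    return elements


SAMPLE = '''Page: Example Site
URL: https://example.com

[1] heading "Welcome"
[2] input[email] "Email" [required]
[5] button "Sign In"
'''
elements = parse_snapshot(SAMPLE)
```

Looking elements up by role and name after each `snap()` avoids hard-coding ref numbers, which change between snapshots.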

# Examples

```bash
# Search on Google
browser-cli <<'EOF'
await tab("https://google.com")
snap()
EOF
# Output shows [12] combobox "Suche"

browser-cli <<'EOF'
await type(12, "hello world")
diff()
EOF
# Shows: Added (autocomplete options), Changed (input value)

# Form filling
browser-cli <<'EOF'
await tab("https://example.com/login")
snap()
EOF
# Output: [1] input "Email", [2] input "Password", [3] button "Sign In"

browser-cli <<'EOF'
await type(1, "user@test.com")
await type(2, "secret123")
await click(3)
diff()
EOF

# Wait for dynamic content
browser-cli <<'EOF'
await click(5)
await wait("text", "Success")
snap()
EOF
```

See [README.md](README.md) for installation, architecture, and troubleshooting.

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
"""Browser CLI - Control Firefox from the command line.

Provides a minimal CLI that executes JavaScript via stdin, with a rich
JS API for browser automation.
"""

from browser_cli.bridge import NativeMessagingBridge
from browser_cli.cli import main
from browser_cli.client import BrowserClient

__all__ = ["BrowserClient", "NativeMessagingBridge", "main"]
