Commit 2e26370

Merge pull request #8 from gaussian/develop
Update README with config format and CLI reference
2 parents 2537f34 + af4d54b commit 2e26370

1 file changed, +92 -5 lines changed

README.md

Lines changed: 92 additions & 5 deletions
@@ -31,18 +31,41 @@ shots login --base-url https://your-app.example.com --out-dir shots_out

 This writes `shots_out/storage_state.json`.

-## 2) Run required screenshots from a config
+## 2) Run screenshots from a config

 ```bash
 export OPENAI_API_KEY=...
-shots run-config --config shots.yaml --out-dir shots_out --use-llm --use-llm-crop --save-source
+shots run-config --config shots.yaml --use-llm --use-llm-crop --save-source
 ```

+### CLI flags
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--config` | (required) | Path to YAML/JSON config file |
+| `--out-dir` | from config, or `shots_out` | Output directory (overrides config `out_dir`) |
+| `--use-llm` | off | LLM-driven multi-step navigation |
+| `--model` | `gpt-5.2` | OpenAI model for navigation/crop |
+| `--use-llm-crop` | off | LLM picks a marketing-friendly crop rectangle |
+| `--max-crop-retries` | `2` | Crop validation retry attempts |
+| `--save-source` | off | Save uncropped source images alongside output |
+| `--timeout-ms` | `10000` | Page-load / navigation timeout |
+| `--action-timeout-ms` | `5000` | Timeout for clicks/typing (fail fast) |
+| `--headed` | off | Show the browser window (debug) |
+| `--viewport` | `desktop` | Fallback viewport preset (`desktop`, `laptop`, `tablet`, `mobile`) |
+| `--viewport-w`, `--viewport-h`, `--scale` | from preset | Override viewport dimensions |
+| `--full-page` | from config/preset | Capture full scrollable page |
+
 ## Config format (YAML)

+### Simple (flat `shots` list)
+
+Each shot is auto-wrapped into its own group with `output: png`.
+
 ```yaml
 base_url: https://your-app.example.com
 start: /app
+out_dir: shots_out

 defaults:
   viewport_preset: desktop
@@ -53,7 +76,7 @@ shots:
   - id: dashboard-hero
     description: >
       Capture the main dashboard with KPI cards and a chart visible.
-      Navigate via the left nav if needed. Close any modal/tour/cookie overlay.
+      Close any modal, cookie banner, or tour overlay.
     url: /app/dashboard

   - id: integrations
@@ -62,8 +85,72 @@ shots:
     viewport_preset: laptop
 ```

+### Groups (multi-shot PDFs, labels, folders)
+
+Use `groups` instead of `shots` for more control. Each group produces either a single PNG or a multi-page PDF.
+
+```yaml
+base_url: https://your-app.example.com
+start: /app
+out_dir: shots_out
+
+defaults:
+  viewport_preset: desktop
+  full_page: true
+  max_nav_steps: 12
+
+groups:
+  - id: hero-shots
+    output: png
+    shots:
+      - id: dashboard-hero
+        description: >
+          Capture the main dashboard.
+        url: /app/dashboard
+
+  - id: onboarding-deck
+    output: pdf
+    folder: onboarding
+    label: "{id} — {url}"
+    label_date: true
+    shots:
+      - id: step-1-welcome
+        description: Show the welcome screen.
+        url: /app/onboarding
+      - id: step-2-profile
+        description: Show the profile setup page.
+        url: /app/onboarding/profile
+```
+
+### Config reference
+
+| Field | Level | Description |
+|-------|-------|-------------|
+| `base_url` | top | (required) App base URL |
+| `start` | top | Default start path (default: `/`) |
+| `out_dir` | top | Output directory (default: `shots_out`, overridden by `--out-dir`) |
+| `defaults.viewport_preset` | top | `desktop` \| `laptop` \| `tablet` \| `mobile` |
+| `defaults.full_page` | top | Capture full scrollable page |
+| `defaults.max_nav_steps` | top | Max LLM navigation steps per shot (default: `12`) |
+| `shots` | top | Flat list of shots (cannot coexist with `groups`) |
+| `groups` | top | List of shot groups (cannot coexist with `shots`) |
+| `groups[].id` | group | (required) Group identifier |
+| `groups[].output` | group | `png` (1 shot max) or `pdf` (multi-shot) |
+| `groups[].folder` | group | Override output subfolder name (defaults to group id) |
+| `groups[].label` | group | Label template applied to all shots (`{id}`, `{url}`, `{title}`) |
+| `groups[].label_date` | group | Append UTC timestamp below label |
+| `id` | shot | (required) Shot identifier |
+| `description` | shot | (required) What to capture (used as LLM goal) |
+| `url` | shot | Start URL for this shot (absolute or relative to `base_url`) |
+| `viewport_preset` | shot | Override viewport for this shot |
+| `viewport` | shot | Custom `{width, height, scale}` |
+| `full_page` | shot | Override full-page capture for this shot |
+| `label` | shot | Per-shot label override |
+
 ## Notes

-* `--use-llm` enables multi-step "acquire the shot" behavior: the model returns one action at a time until it says `done`.
-* `--use-llm-crop` asks the model to choose a crop rectangle for a marketing-friendly framing.
+* `--use-llm` enables multi-step navigation: the model sees a screenshot + accessibility tree each step and returns one action at a time until it says `done`.
+* `--use-llm-crop` asks the model to choose a crop rectangle for marketing-friendly framing, with automatic validation and retry.
 * All navigation is kept **same-origin** as `base_url`.
+* A detailed log is written to `<out_dir>/shots.log` on every run.
+* If the LLM repeats a previously failed action, it is automatically re-queried with a different approach.
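
The config reference added in this commit states several structural rules: `base_url` is required, `shots` and `groups` cannot coexist, a flat shot is wrapped into its own `png` group, a `png` group holds at most one shot, and `folder` defaults to the group id. A minimal sketch of how those rules could be checked on an already-parsed config is shown below. It is not the tool's actual code: `normalize_config` and `_check_shot` are hypothetical names, and two behaviors are assumptions not confirmed by the README (a wrapper group reuses the shot's id, and a group without `output` is treated as `png`).

```python
from typing import Any


def _check_shot(shot: dict[str, Any]) -> None:
    """Raise if a shot lacks the fields the reference marks as required."""
    for field in ("id", "description"):
        if not shot.get(field):
            raise ValueError(f"shot is missing required field {field!r}")


def normalize_config(cfg: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the config as a list of groups (hypothetical helper)."""
    if not cfg.get("base_url"):
        raise ValueError("base_url is required")
    if "shots" in cfg and "groups" in cfg:
        raise ValueError("'shots' and 'groups' cannot coexist")

    if "shots" in cfg:
        for shot in cfg["shots"]:
            _check_shot(shot)
        # Flat list: each shot becomes its own single-shot PNG group.
        # Assumption: the wrapper group reuses the shot id.
        raw_groups = [
            {"id": s["id"], "output": "png", "shots": [s]} for s in cfg["shots"]
        ]
    else:
        raw_groups = cfg.get("groups", [])

    groups = []
    for raw in raw_groups:
        if not raw.get("id"):
            raise ValueError("group is missing required field 'id'")
        output = raw.get("output", "png")  # assumption: png when omitted
        if output not in ("png", "pdf"):
            raise ValueError(f"group {raw['id']!r}: output must be png or pdf")
        shots = raw.get("shots", [])
        for shot in shots:
            _check_shot(shot)
        if output == "png" and len(shots) > 1:
            raise ValueError(f"group {raw['id']!r}: png allows at most one shot")
        # folder defaults to the group id, per the reference table.
        groups.append({**raw, "output": output, "folder": raw.get("folder", raw["id"])})
    return groups
```

Normalizing both forms into one list of groups means any later capture step only has to handle a single shape, which is one plausible reason for the auto-wrapping the README describes.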
