Commit 5420aaa

docs: Various unsaved edits
1 parent be75c21 commit 5420aaa

4 files changed

+25
-17
lines changed

docs/agent-browsers/spec.md

Lines changed: 2 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -4,9 +4,9 @@
44

55
Defines a shared, cross‑platform convention for storing named browser profiles used by automated agents that require authenticated access to particular websites. A profile represents a persistent browser user data directory plus lightweight metadata that describes login expectations and provenance. This spec’s primary purpose is to make such profiles discoverable by applications while allowing users to transparently know which profile and authentication will be used by the application. The same profile name can be referenced by multiple applications. A default profile is used when none is specified.
66

7-
#### Motivating example
7+
#### Motivation
88

9-
Multiple agentic applications (e.g., a research assistant, an issue triager, and an expense reporter) need to act on behalf of the user across several websites (e.g., `chatgpt.com`, `jira.example.com`, `expense.example.com`). Instead of each app asking the user to log in separately, they discover existing agent browser profiles by matching `loginExpectations` (site `id`/`origins`) and reuse the corresponding user data directories. Typically these apps run headless using a browser automation framework such as Playwright. When an expected login is not actually present, the app restarts the automation engine in a visible state so the user can complete the login, then resumes and finishes the task.
9+
Multiple agentic applications (e.g., a research assistant, an issue triager, and an expense reporter) need to act on behalf of the user across several websites (e.g., `chatgpt.com`, `jira.example.com`, `expense.example.com`). Instead of each app asking the user to log in separately, they discover existing agent browser profiles by matching the site and username metadata that each profile provides. Typically these apps run headless using a browser automation framework such as Playwright. When an expected login is not actually present, the app restarts the automation engine in a visible state so the user can complete the login, then resumes and finishes the task.
1010

1111
If the app discovers multiple candidate profiles for the same website (for example, different `username` values), our guidance is to ask the user which profile to use for the current task. Applications should communicate profile names clearly and expose options to create new profiles or rename existing ones. Users are expected to become familiar with these profile names, which are reused across applications.
1212

@@ -58,7 +58,6 @@ Format: JSON, UTF‑8. Unknown fields must be ignored for forward compatibility.
5858
"createdBy": ["my-app", "v1.2.3"],
5959
"loginExpectations": [
6060
{
61-
"id": "chatgpt-com",
6261
"origins": ["https://chatgpt.com"],
6362
"username": "[email protected]"
6463
}
@@ -73,17 +72,10 @@ Field definitions:
7372
- `createdAt` / `updatedAt` (RFC3339 strings): For auditing.
7473
- `createdBy` (array<string>): Application and version that created this profile, e.g., `["app-name", "v1.2.3"]`.
7574
- `loginExpectations` (array): Zero or more per‑site discovery hints. Each entry:
76-
- `id` (string): Stable identifier for the site (e.g., `chatgpt-com`).
7775
- `origins` (array<string>): Allowed origins for the site (schemes required).
7876
- `username` (string): Account identifier expected to be logged in (email, handle, or user ID).
7977
Applications MAY include additional, application‑specific keys inside `loginExpectations` entries to support their own check mechanisms; such keys are not standardized by this spec.
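The discovery step implied by the field definitions above can be sketched as follows. This is a minimal illustration, not part of the spec: it assumes profile metadata has already been loaded into a dictionary keyed by profile name, and the helper names `origin_of` and `matching_profiles` are hypothetical.

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Return the scheme://host[:port] origin of a URL, lowercased."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}".lower()


def matching_profiles(profiles, site_url, username=None):
    """Return names of profiles whose loginExpectations cover site_url.

    `profiles` maps a profile name to its parsed metadata (an assumed
    in-memory shape). Unknown fields are ignored, as the spec requires.
    """
    wanted = origin_of(site_url)
    names = []
    for name, meta in profiles.items():
        for entry in meta.get("loginExpectations", []):
            origins = {origin_of(o) for o in entry.get("origins", [])}
            if wanted not in origins:
                continue
            if username is not None and entry.get("username") != username:
                continue
            names.append(name)
            break  # one matching entry is enough for this profile
    return names
```

If the result contains more than one name, the application would ask the user which profile to use, as recommended in the Motivation section.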
8078

81-
Semantics:
82-
- Applications MAY add engine‑specific data under `browsers/*` and MUST NOT modify fields they do not own.
83-
- This spec does not define a login‑check format. Applications and libraries are expected to implement authentication checks in an application‑specific way and may publish reusable packages for popular sites.
84-
- Recommended (non‑normative) UX guidance: start headless; if a check indicates login is required, relaunch the same user data directory headful to allow the user to complete login, then continue the task.
85-
- Discoverability intent: when an application needs to act on a site (e.g., `chatgpt.com`), it can search for profiles with matching `loginExpectations.id`/`origins`. If multiple profiles exist with different `username` values, the application may select automatically per policy or prompt the user to choose which account to use for the task.
86-
8779
### Environment Variables
8880

8981
- `AGENT_BROWSER_PROFILES_DIR`: Absolute path override for the base directory.
@@ -93,6 +85,5 @@ Semantics:
9385

9486
- Profile contents may include cookies and tokens protected by OS keychains. Profiles generally do not port across different machines/OSes. Treat them as per‑user, per‑machine.
9587
- Never commit profile directories to source control.
96-
- Prefer role/aria selectors in `selector-present` checks to minimize locale‑specific fragility.
9788

9889

docs/browser-automation/README.md

Lines changed: 0 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -12,5 +12,3 @@ Each document in this folder describes an automation targeting a specific site t
1212
- Use Playwright persistent contexts bound to a selected profile.
1313
- Prefer headless execution when the profile’s login expectations are met; otherwise, switch to headful and guide the user.
1414
- Detect UI drift and fail fast with actionable diagnostics. When possible, surface the browser window to help the user investigate.
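The headless-first guidance above can be sketched as a small control-flow helper. Everything here is an assumption made for illustration: `launch`, `is_logged_in`, `wait_for_login`, and `task` are hypothetical callables supplied by the application, and the session object is only assumed to have a `close()` method. With Playwright, `launch` might wrap a persistent context opened on the selected profile's user data directory.

```python
def run_with_login_fallback(launch, is_logged_in, wait_for_login, task):
    """Run `task` headless, relaunching headful if login is missing.

    launch(headless=...) -> session with a close() method (assumed)
    is_logged_in(session) -> bool, an application-specific check
    wait_for_login(session) -> blocks until the user has logged in
    task(session) -> the automation to perform once authenticated
    """
    session = launch(headless=True)
    if not is_logged_in(session):
        # Reopen the same profile visibly so the user can complete login.
        session.close()
        session = launch(headless=False)
        wait_for_login(session)
    try:
        return task(session)
    finally:
        session.close()
```

Keeping the login check and the launcher as injected callables leaves the site-specific logic in the per-site automation documents.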
15-
16-

docs/cli-spec.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -36,6 +36,8 @@ Configuration mapping examples:
3636
- `editor.default` → `--editor`
3737
- `browserAutomation.enabled` → `--browser-automation`, `AGENTS_WORKFLOW_BROWSER_AUTOMATION_ENABLED`
3838
- `browserAutomation.profile` → `--browser-profile`, `AGENTS_WORKFLOW_BROWSER_PROFILE`
39+
- `browserAutomation.chatgptUsername` → `--chatgpt-username`, `AGENTS_WORKFLOW_BROWSER_AUTOMATION_CHATGPT_USERNAME`
40+
- `codex.workspace` → `--codex-workspace`, `AGENTS_WORKFLOW_CODEX_WORKSPACE`
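Because the environment variable names listed above do not follow a single mechanical rule (for example, `browserAutomation.profile` maps to `AGENTS_WORKFLOW_BROWSER_PROFILE`, not to a `..._BROWSER_AUTOMATION_PROFILE` name), an explicit lookup table is one way to encode them. The sketch below is an assumption-laden illustration: the `MAPPINGS` structure and `value_from_flag_or_env` helper are hypothetical, and flags are assumed to be pre-parsed into a dictionary.

```python
# Config key -> CLI flag and environment variable, copied from the list above.
MAPPINGS = {
    "editor.default": {"flag": "--editor", "env": None},
    "browserAutomation.enabled": {
        "flag": "--browser-automation",
        "env": "AGENTS_WORKFLOW_BROWSER_AUTOMATION_ENABLED",
    },
    "browserAutomation.profile": {
        "flag": "--browser-profile",
        "env": "AGENTS_WORKFLOW_BROWSER_PROFILE",
    },
    "browserAutomation.chatgptUsername": {
        "flag": "--chatgpt-username",
        "env": "AGENTS_WORKFLOW_BROWSER_AUTOMATION_CHATGPT_USERNAME",
    },
    "codex.workspace": {
        "flag": "--codex-workspace",
        "env": "AGENTS_WORKFLOW_CODEX_WORKSPACE",
    },
}


def value_from_flag_or_env(key, flags, env):
    """Return (value, source) for a key, preferring the flag over the env var.

    `flags` maps flag names to already-parsed values (hypothetical shape);
    `env` is a mapping such as os.environ.
    """
    mapping = MAPPINGS[key]
    if mapping["flag"] in flags:
        return flags[mapping["flag"]], "flag"
    env_name = mapping["env"]
    if env_name is not None and env_name in env:
        return env[env_name], "env"
    return None, None
```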
3941

4042
### Subcommands
4143

@@ -58,14 +60,15 @@ Task launch behavior in TUI:
5860

5961
#### 2) Tasks
6062

61-
- `aw task [create] [--prompt <TEXT> | --prompt-file <FILE>] [--repo <PATH|URL>] [--branch <NAME>] [--agent <TYPE>[@VERSION]] [--instances <N>] [--runtime <devcontainer|local|unsandboxed>] [--devcontainer-path <PATH>] [--labels k=v ...] [--delivery <pr|branch|patch>] [--target-branch <NAME>] [--browser-automation <true|false>] [--browser-profile <NAME>] [--yes]`
63+
- `aw task [create] [--prompt <TEXT> | --prompt-file <FILE>] [--repo <PATH|URL>] [--branch <NAME>] [--agent <TYPE>[@VERSION]] [--instances <N>] [--runtime <devcontainer|local|unsandboxed>] [--devcontainer-path <PATH>] [--labels k=v ...] [--delivery <pr|branch|patch>] [--target-branch <NAME>] [--browser-automation <true|false>] [--browser-profile <NAME>] [--chatgpt-username <NAME>] [--codex-workspace <WORKSPACE>] [--yes]`
6264

6365
Behavior:
6466

6567
- In local mode, prepares a per-task workspace using snapshot preference order (ZFS > Btrfs > Overlay > copy) and launches the agent.
6668
- In REST mode, calls `POST /api/v1/tasks` with the provided parameters.
6769
- Creates/updates a local PID-like session record when launching locally (see “Local Discovery”).
6870
- When `--browser-automation true` (default), launches site-specific browser automation (e.g., Codex) using the selected agent browser profile. When `false`, web automation is skipped.
71+
- Codex integration: if `--browser-profile` is not specified, discovers or creates a ChatGPT profile per `docs/browser-automation/codex.md`, optionally filtered by `--chatgpt-username`. Workspace is taken from `--codex-workspace` or config; branch is taken from `--branch`.
6972
- Branch autocompletion uses standard git protocol:
7073
- Local mode: `git for-each-ref` on the repo; cached with debounce.
7174
- REST mode: server uses `git ls-remote`/refs against admin-configured URL to populate its cache; CLI/Web query capability endpoints for suggestions.

docs/configuration.md

Lines changed: 19 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,25 @@
11

2-
Thanks for the clarifications. I’ll revise the specification to include:
2+
## AW Configuration
3+
4+
### Overview
35

46
* `aw config` subcommand with Git-like interface for reading and updating configuration.
57
* Schema validation on both config file loading and CLI-based modification.
68
* Precedence for `~/.config` over `%APPDATA%` on Windows only when both are present.
7-
* Motivation and support for tracking the origin of each configuration value, with use cases such as: debug-level log reporting, enforced setting explanation, and editor pre-fill messages.
9+
* Motivation and support for tracking the origin of each configuration value, with use cases such as: debug-level log reporting, enforced setting explanation, and editor pre-fill messages.
11+
12+
Layered configuration supports system, user, project, and project-user scopes. Values can also be supplied via environment variables and CLI flags. See `docs/cli-spec.md` for flag mappings.
13+
14+
### Keys
15+
16+
- `browserAutomation.enabled` (boolean): Enable or disable site automation.
17+
- `browserAutomation.profile` (string): Preferred agent browser profile name.
18+
- `browserAutomation.chatgptUsername` (string): Optional default ChatGPT username used for profile discovery.
19+
- `codex.workspace` (string): Default Codex workspace to select before pressing "Code".
20+
21+
### Behavior
822

9-
I’ll update the specification accordingly and present a refined version that reflects these requirements.
23+
- CLI flags override environment variables, which in turn override the project-user, project, user, and system scopes, in that order.
24+
- On Windows, `~/.config` takes precedence over `%APPDATA%` only when both are present.
25+
- The CLI can read, write, and explain config values via `aw config`.
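The precedence rule above, combined with origin tracking, can be sketched as a single lookup that returns both the value and the layer it came from. This is an illustrative sketch only: the `PRECEDENCE` list and the `resolve` helper are hypothetical names, and each layer is assumed to be a flat dictionary of already-validated keys.

```python
# Highest-priority source first, following the Behavior list above.
PRECEDENCE = ["flag", "env", "project-user", "project", "user", "system"]


def resolve(key, layers):
    """Return (value, origin) from the highest-priority layer defining key.

    `layers` maps a source name from PRECEDENCE to a dict of config values
    (an assumed shape). Returns (None, None) when no layer defines the key.
    """
    for source in PRECEDENCE:
        values = layers.get(source, {})
        if key in values:
            return values[key], source
    return None, None
```

Returning the origin alongside the value is what would let a command such as `aw config` explain where a setting came from.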
