The browser built for AI agents.
Oryn (Open Runtime for Intentful Navigation) is a browser automation system designed specifically for AI agents. Instead of forcing agents to interpret screenshots, parse HTML, or construct complex function calls, Oryn provides a semantic intent language that matches how agents reason about web interaction.
Traditional agent-browser interfaces fail because they expose the wrong abstraction. Oryn fixes this by providing:
- Semantic observations instead of raw HTML or pixels
- Intent language instead of rigid function schemas
- Consistent behavior across all deployment environments
Oryn provides a single unified binary, oryn, that adapts to any environment:
# Headless Mode (Cloud/CI)
oryn headless
# Embedded Mode (IoT/Edge)
oryn embedded --driver-url http://localhost:8080
# Remote Mode (Debug/Assistance)
oryn remote --port 9001
See the User Guide for full usage instructions.
Instead of parsing HTML or analyzing screenshots, agents interact naturally:
goto github.com/login
observe
@ github.com/login "Sign in to GitHub"
[1] input/email "Username or email" {required}
[2] input/password "Password" {required}
[3] button/submit "Sign in" {primary}
type 1 "myusername"
type 2 "mypassword"
click 3
The agent sees labeled interactive elements and issues simple commands. No CSS selectors, no XPath, no DOM traversal—just intent.
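For agents that want the observation as structured data, the element lines can be parsed with a small amount of code. The following is a minimal sketch in Python, assuming the output layout matches the example above exactly (`[id] role/subtype "label" {flags}`). The `Element` class and `parse_observation` function are hypothetical names, not part of Oryn, and the authoritative grammar is the one in SPEC-INTENT-LANGUAGE.md.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Matches one element line such as: [1] input/email "Username or email" {required}
# Assumption: this mirrors the example output only; the real grammar may allow more.
ELEMENT_RE = re.compile(
    r'^\[(?P<id>\d+)\]\s+'
    r'(?P<role>[\w-]+)(?:/(?P<subtype>[\w-]+))?\s+'
    r'"(?P<label>[^"]*)"'
    r'(?:\s+\{(?P<flags>[^}]*)\})?\s*$'
)


@dataclass
class Element:
    id: int
    role: str
    subtype: Optional[str]
    label: str
    flags: List[str] = field(default_factory=list)


def parse_observation(text: str) -> List[Element]:
    """Extract labeled elements from an observation, skipping other lines."""
    elements = []
    for line in text.splitlines():
        match = ELEMENT_RE.match(line.strip())
        if match is None:
            continue  # e.g. the "@ url title" header line
        raw_flags = match.group("flags") or ""
        # Assumption: multiple flags are separated by commas or whitespace.
        flags = [f for f in re.split(r"[,\s]+", raw_flags) if f]
        elements.append(
            Element(
                id=int(match.group("id")),
                role=match.group("role"),
                subtype=match.group("subtype"),
                label=match.group("label"),
                flags=flags,
            )
        )
    return elements
```

An agent could then look up the element it wants by label or flag and issue a command with the matching numeric ID, such as `click 3`.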
Current approaches force agents into unnatural workflows:
Screenshot/Vision: Expensive inference, unreliable text extraction, no understanding of interactive state
HTML Parsing: Thousands of tokens of markup, complex reasoning about visibility and interactivity
Function Calls: Rigid schemas, verbose definitions, no tolerance for natural variation
Oryn presents the web as agents need to see it:
- Interactive elements are labeled and typed
- State is explicit (required, disabled, checked)
- Patterns are detected (login forms, search boxes)
- Commands are forgiving and semantic
Choose oryn-e (Embedded) when:
- Running on Raspberry Pi or IoT hardware
- Deploying to resource-constrained containers
- RAM is precious (~50MB footprint)
- WebKit compatibility is sufficient
Choose oryn-h (Headless) when:
- Running cloud-based automation
- Maximum browser compatibility needed (~99%)
- Handling complex SPAs
- Network interception required
Choose oryn-r (Remote) when:
- Agent needs user's logged-in sessions
- Anti-bot bypass is critical (real browser fingerprint)
- User wants to watch agent actions
- Interactive assistance workflows
Choose oryn-w (WASM Extension) when:
- No server/backend infrastructure available
- Instant setup required (load extension, go)
- Quick one-off commands on any webpage
- Minimal resource footprint needed (<10MB memory)
- See extension-w/README.md for details
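The selection criteria above can be condensed into a small decision helper. This is only a sketch: the `Needs` fields and the `choose_mode` function are hypothetical, and the priority order (no backend first, then logged-in sessions, then memory constraints) is an assumption made for illustration, not a rule stated by the project.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    has_backend: bool = True                  # server infrastructure is available
    needs_user_session: bool = False          # agent must reuse the user's logged-in browser
    low_memory: bool = False                  # e.g. Raspberry Pi or a constrained container
    needs_network_interception: bool = False  # listed above as a reason to pick oryn-h


def choose_mode(needs: Needs) -> str:
    """Pick a mode name from the criteria lists; ordering is an assumed priority."""
    if not needs.has_backend:
        return "oryn-w"  # WASM extension: no server required
    if needs.needs_user_session:
        return "oryn-r"  # remote: drives the user's real browser
    if needs.low_memory and not needs.needs_network_interception:
        return "oryn-e"  # embedded: ~50MB footprint on WebKit
    return "oryn-h"      # headless: broadest compatibility


if __name__ == "__main__":
    print(choose_mode(Needs(low_memory=True)))
```

Real deployments will weigh more factors than this, such as WebKit compatibility and anti-bot requirements.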
| Document | Description |
|---|---|
| USER_GUIDE.md | Full installation and usage guide |
| GOOGLE_ADK_TUTORIAL.md | Integration with Google ADK Agents |
| SPEC-INTENT-LANGUAGE.md | The agent-facing command protocol |
| SPEC-SCANNER-PROTOCOL.md | Internal browser-scanner interface |
| SPEC-UNIFIED.md | Architecture and mode comparison |
| PRODUCT-INTRO.md | Product overview and vision |
GitHub Actions workflows are defined under .github/workflows/:
- ci-rust.yml: cargo fmt, cargo clippy, cargo test on PRs and main.
- ci-js.yml: scanner checks plus extension-w lint/tests on PRs and main.
- ci-e2e-quick.yml: quick Docker-backed E2E smoke suite on PRs and main.
- nightly-full-e2e.yml: scheduled/manual full E2E matrix (oryn-h, oryn-e-*, oryn-r).
- preview-release.yml: tag-driven preview packaging and release publishing for preview-v* tags.
- docs-pages.yml: MkDocs build and deploy to GitHub Pages.
For local workflow validation with act, repo defaults are in .actrc.
oryn/
├── crates/
│ ├── oryn-core/ # Shared protocol and types
│ ├── oryn-scanner/ # Universal JavaScript scanner
│ ├── oryn-e/ # Embedded mode binary
│ ├── oryn-h/ # Headless mode binary
│ └── oryn-r/ # Remote mode binary
├── extension/ # Browser extension for remote mode
├── docs/
│ ├── SPEC-INTENT-LANGUAGE.md
│ ├── SPEC-SCANNER-PROTOCOL.md
│ └── SPEC-UNIFIED.md
└── README.md
A single JavaScript implementation runs inside all browser contexts—WebKit, Chromium, and browser extensions. This guarantees behavioral consistency. The Rust layer never parses HTML directly; it only processes scanner JSON responses.
Commands are designed for agent ergonomics:
- Case-insensitive, forgiving syntax
- Multiple targeting strategies (ID, text, role, selector)
- Multi-level abstraction (direct → semantic → intent)
- Token-efficient responses
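To illustrate what case-insensitive verbs and multiple targeting strategies might mean on the client side, here is a minimal Python sketch. It is not Oryn's parser: `parse_command` and `resolve_target` are hypothetical helpers, only ID and text targeting are covered, and the text-targeting form (for example `click "Sign in"`) is an assumed syntax. See SPEC-INTENT-LANGUAGE.md for the actual protocol.

```python
import shlex
from typing import Dict, List, Optional, Tuple


def parse_command(line: str) -> Tuple[str, List[str]]:
    """Split a command line into a lowercased verb and its arguments."""
    tokens = shlex.split(line)  # honors double-quoted arguments
    if not tokens:
        raise ValueError("empty command")
    return tokens[0].lower(), tokens[1:]


def resolve_target(target: str, labels: Dict[int, str]) -> Optional[int]:
    """Resolve a target to an element ID.

    `labels` maps element IDs to their visible labels, as shown in an
    observation. A numeric target is treated as an ID; anything else is
    matched against labels, ignoring case.
    """
    if target.isdigit():
        element_id = int(target)
        return element_id if element_id in labels else None
    wanted = target.strip().lower()
    for element_id, label in labels.items():
        if label.lower() == wanted:
            return element_id
    return None


if __name__ == "__main__":
    labels = {1: "Username or email", 2: "Password", 3: "Sign in"}
    verb, args = parse_command('CLICK "sign in"')
    print(verb, resolve_target(args[0], labels))
```

Under these assumptions, `click 3`, `CLICK 3`, and `click "sign in"` all resolve to the same element.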
All binaries implement the same interface. Agents can switch modes without changing their logic. The same scanner code, the same protocol, the same behavior.
oryn-e (Embedded)
- Linux with WPE WebKit / COG
- ~50MB RAM available
oryn-h (Headless)
- Chromium browser installed
- ~300MB+ RAM available
oryn-r (Remote)
- Browser extension installed
- WebSocket connection to server
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
Oryn: Intent, not implementation.