Commit a781a32

portia integration (#8)
1 parent 3b3a1cb commit a781a32

7 files changed, +418 -0 lines changed
README.md

Lines changed: 16 additions & 0 deletions
@@ -61,6 +61,21 @@ Powerful web automation combining Browserbase's Stagehand with Mastra's AI agent
 #### [**Browser-Use Integration**](./examples/integrations/browser-use/README.md)
 Streamlined browser automation for AI applications with a focus on simplicity and reliability.
 
+#### [**Portia AI Integration**](./examples/integrations/portia/README.md)
+Build intelligent web agents with **persistent authentication** using Portia AI's multi-agent framework. Portia enables both multi-agent task planning with human feedback and stateful multi-agent task execution with human control.
+
+**Key Features:**
+- **Persistent Authentication** - Agents can authenticate once and reuse sessions
+- **Human-in-the-Loop** - Structured clarification system for authentication requests
+- **Multi-User Support** - Isolated browser sessions per end user
+- **Production-Ready** - Open-source framework designed for reliable agent deployment
+
+**Perfect for:**
+- LinkedIn automation with user authentication
+- E-commerce agents that need to log into shopping sites
+- Data extraction from authenticated dashboards
+- Any web task requiring persistent user sessions
+
 ### 🏗️ Development & Deployment Platforms
 
 #### [**Vercel AI Integration**](./examples/integrations/vercel/README.md)
@@ -111,6 +126,7 @@ integrations/
 │ ├── mastra/ # Mastra AI agent integration
 │ ├── browser-use/ # Simplified browser automation
 │ ├── braintrust/ # Evaluation and testing tools
+│ ├── portia/ # Portia AI multi-agent framework
 │ ├── agno/ # AI-powered web scraping agents
 │ ├── mongodb/ # MongoDB data extraction & storage
 │ └── agentkit/ # AgentKit implementations
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
BROWSERBASE_API_KEY=
BROWSERBASE_PROJECT_ID=

ANTHROPIC_API_KEY=
PORTIA_API_KEY=
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
/venv
Lines changed: 208 additions & 0 deletions
@@ -0,0 +1,208 @@
# Browserbase x Portia AI Integration

**Build intelligent web agents with persistent authentication using Portia AI's multi-agent framework and Browserbase's headless browser infrastructure.**

Portia AI is an open-source, multi-agent framework for running reliable production-grade agents ([**GitHub repo here↗**](https://github.com/portiaAI/portia-sdk-python)). Its core tenets are multi-agent task planning with human feedback and stateful multi-agent task execution with human control.

## 🚀 What Makes This Integration Special

Portia AI offers an open-source browser agent implementation that uses Browserbase to **enable persistent authentication**. When the browser agent needs to authenticate to complete a task, it uses Portia's structured human-agent abstraction, called a [`clarification`](https://docs.portialabs.ai/understand-clarifications), to present the end user with a [Browserbase live session URL](https://docs.browserbase.com/guides/authentication#use-the-session-live-view-to-login) that they can use to sign in.

Portia models end users with the [`EndUser`](https://docs.portialabs.ai/manage-end-users) abstraction, which keeps each user's Browserbase sessions separate. This lets developers build applications that serve many users, each with their own authenticated sessions.

> **Note:** Clarifications and end users in the Portia framework can also be used to implement OAuth for API-based tools.

## 🎯 Use Cases

Here are some examples of the kinds of queries that can be handled in about 20 lines of code with the Portia / Browserbase integration:

- Send a message to Bob Smith on LinkedIn asking him if he's free on Tuesday for a meeting
- Get my Google Doc shopping list and add all items in it to my shopping trolley on the Walmart website
- Book me onto the 8am hot yoga class
- Star a GitHub repository after authenticating
- Extract data from authenticated dashboards

## 🎥 Demo Video

Watch how you can make a LinkedIn agent with Browserbase and Portia AI:

[![LinkedIn Agent Demo](https://img.youtube.com/vi/hSq8Ww-hagg/0.jpg)](https://www.youtube.com/watch?v=hSq8Ww-hagg)

## 🔧 Prerequisites

- **Browserbase Account**: Get your API key from the [Dashboard's Settings tab](https://www.browserbase.com/settings)
- **LLM API Key**: Anthropic (`ANTHROPIC_API_KEY`), OpenAI (`OPENAI_API_KEY`), Google (`GOOGLE_API_KEY`), or a [local LLM](https://docs.portialabs.ai/manage-config#api-keys)
- **Python 3.11+**
- **Paid Browserbase subscription** (required for authentication features)

## 🚀 Quick Start

### 1. Set Environment Variables

```bash
export BROWSERBASE_API_KEY="your_browserbase_api_key"
export BROWSERBASE_PROJECT_ID="your_project_id"
export ANTHROPIC_API_KEY="your_anthropic_api_key" # or OPENAI_API_KEY, GOOGLE_API_KEY
export PORTIA_API_KEY="your_portia_api_key" # also listed in this integration's .env template
```
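
A small preflight check can catch an unset variable before the agent starts. The sketch below is not part of the Portia SDK or Browserbase; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names, and the variable list assumes the Anthropic provider plus the keys named in this integration's `.env` template.

```python
import os
from typing import Mapping, Sequence

# Assumed list: the keys from this integration's .env template (Anthropic as the LLM provider).
REQUIRED_VARS = (
    "BROWSERBASE_API_KEY",
    "BROWSERBASE_PROJECT_ID",
    "ANTHROPIC_API_KEY",
    "PORTIA_API_KEY",
)


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` after `load_dotenv()` returns an empty list when everything is configured; a non-empty list names the variables that still need a value.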

### 2. Install Portia with Browserbase

```bash
# Quote the requirement so shells such as zsh do not expand the square brackets
pip install "portia-sdk-python[tools-browser-browserbase]"
```

### 3. Basic Agent (No Authentication)

This example works with the free trial version of Browserbase:

```python
from dotenv import load_dotenv

from portia import (
    Config,
    LLMProvider,
    Portia,
    PortiaToolRegistry,
    StorageClass,
)
from portia.cli import CLIExecutionHooks
from portia.open_source_tools.browser_tool import BrowserTool, BrowserInfrastructureOption

load_dotenv(override=True)

task = "Go to https://www.npr.org and get the headline news story"

my_config = Config.from_default(
    storage_class=StorageClass.MEMORY,
    llm_provider=LLMProvider.ANTHROPIC
)

portia = Portia(
    config=my_config,
    tools=PortiaToolRegistry(my_config) + [
        BrowserTool(infrastructure_option=BrowserInfrastructureOption.REMOTE)
    ],
    execution_hooks=CLIExecutionHooks()
)

plan_run = portia.run(task, end_user="end_user1")
```

### 4. Advanced Agent with Authentication

This example demonstrates persistent authentication capabilities:

```python
from dotenv import load_dotenv

from portia import (
    Config,
    LLMProvider,
    Portia,
    PortiaToolRegistry,
    StorageClass,
)
from portia.cli import CLIExecutionHooks
from portia.open_source_tools.browser_tool import BrowserToolForUrl, BrowserInfrastructureOption

load_dotenv(override=True)

# The task that you want the agent to do
task = "Find the github repo for portia-sdk-python and star it if it's not already starred."

my_config = Config.from_default(
    storage_class=StorageClass.MEMORY,
    llm_provider=LLMProvider.ANTHROPIC
)

# Requires a paid browserbase subscription for authentication handling
portia = Portia(
    config=my_config,
    tools=PortiaToolRegistry(my_config) + [
        BrowserToolForUrl(
            url="https://www.github.com",
            infrastructure_option=BrowserInfrastructureOption.REMOTE
        )
    ],
    # CLI execution hooks mean authentication requests will be output to the CLI
    execution_hooks=CLIExecutionHooks()
)

plan_run = portia.run(task, end_user="end_user")
```

## 🔐 How Authentication Works

When a browser tool encounters a page that requires authentication, it will raise a clarification request to the user. The authentication flow works as follows:

![Browser authentication with clarifications](../../../images/integrations/portia/browser_auth.png)

1. **Agent encounters authentication requirement**
2. **Clarification request is raised** with a Browserbase live session URL
3. **User authenticates** through the live session
4. **Cookies are persisted** for future agent runs until they expire
5. **Agent continues** with the authenticated session
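
The five steps above can be pictured with a toy model. This is only an illustration of the control flow, not how Portia or Browserbase implement it: `Clarification`, `ToySessionStore`, and the `example.invalid` URL are all hypothetical, and cookie expiry is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Clarification:
    """Hypothetical stand-in for a request asking the end user to sign in."""
    end_user: str
    live_session_url: str


@dataclass
class ToySessionStore:
    """Hypothetical per-user cookie store; the real persistence is not shown here."""
    cookies: dict[str, dict[str, str]] = field(default_factory=dict)

    def run(self, end_user: str) -> "Clarification | str":
        # Steps 1-2: no saved cookies, so ask the user to log in via a live session URL.
        if end_user not in self.cookies:
            return Clarification(end_user, f"https://example.invalid/live/{end_user}")
        # Step 5: saved cookies exist, so the task proceeds without a prompt.
        return f"task completed for {end_user}"

    def resolve(self, clarification: Clarification, cookies: dict[str, str]) -> None:
        # Steps 3-4: the user signed in; keep their cookies for later runs.
        self.cookies[clarification.end_user] = cookies
```

In this model the first `run("alice")` returns a `Clarification`; after `resolve(...)` stores her cookies, later runs for `"alice"` complete directly, while a new user such as `"bob"` is still prompted.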

## 🛠️ Advanced Configuration

### Custom Execution Hooks

You can customize how clarifications are handled in your application:

```python
from portia.execution_hooks import ExecutionHooks

class CustomExecutionHooks(ExecutionHooks):
    def on_clarification_request(self, clarification):
        # Custom logic for handling authentication requests
        # e.g., send notification, log to database, etc.
        pass

# `my_config` and `tools` are built as in the Quick Start examples
portia = Portia(
    config=my_config,
    tools=tools,
    execution_hooks=CustomExecutionHooks()
)
```

### Multiple End Users

Manage sessions for different users:

```python
# Different users get isolated browser sessions
plan_run_user1 = portia.run(task, end_user="user1")
plan_run_user2 = portia.run(task, end_user="user2")
```
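
With more than a couple of users, a loop that records failures per user keeps one user's error from stopping the rest. The helper below is a sketch, not part of the Portia SDK: `run_for_users` is a hypothetical name, and it only assumes a callable shaped like `portia.run(task, end_user=...)` as used above.

```python
from typing import Any, Callable, Iterable


def run_for_users(run: Callable[..., Any], task: str, end_users: Iterable[str]) -> dict[str, Any]:
    """Call `run(task, end_user=...)` once per user and collect results by user ID.

    An exception for one user is stored as that user's result so the
    remaining users still get their run.
    """
    results: dict[str, Any] = {}
    for end_user in end_users:
        try:
            results[end_user] = run(task, end_user=end_user)
        except Exception as exc:  # keep going; record the failure for this user
            results[end_user] = exc
    return results
```

For example, `run_for_users(portia.run, task, ["user1", "user2"])` would return a dictionary keyed by end user, holding either that user's plan run or the exception raised for them.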

### Storage Options

Choose different storage backends for persistence:

```python
from portia import Config, StorageClass

# In-memory: plan runs are lost when the process exits
config = Config.from_default(storage_class=StorageClass.MEMORY)

# Disk storage: plan runs are written to local files
config = Config.from_default(storage_class=StorageClass.DISK)

# Portia cloud storage (requires PORTIA_API_KEY)
config = Config.from_default(storage_class=StorageClass.CLOUD)
```

## 📚 Additional Resources

- **[Portia AI Documentation](https://docs.portialabs.ai/)**
- **[Portia SDK Python GitHub](https://github.com/portiaAI/portia-sdk-python)**
- **[Browserbase Documentation](https://docs.browserbase.com)**
- **[Understanding Clarifications](https://docs.portialabs.ai/understand-clarifications)**
- **[Managing End Users](https://docs.portialabs.ai/manage-end-users)**

## 🤝 Support

- **Portia AI**: [GitHub Issues](https://github.com/portiaAI/portia-sdk-python/issues)
- **Browserbase**: [[email protected]](mailto:[email protected])

## 📄 License

This integration example is licensed under the MIT License. See the main repository LICENSE file for details.
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
from dotenv import load_dotenv

from portia import (
    Config,
    LLMProvider,
    Portia,
    PortiaToolRegistry,
    StorageClass,
)
from portia.cli import CLIExecutionHooks
from portia.open_source_tools.browser_tool import BrowserToolForUrl, BrowserInfrastructureOption

load_dotenv(override=True)

# The task that you want the agent to do
task = ("Find the github repo for portia-sdk-python and star it if it's not already starred.")

# Requires an anthropic API key, ANTHROPIC_API_KEY or use any other LLM.
my_config = Config.from_default(storage_class=StorageClass.MEMORY,
                                llm_provider=LLMProvider.ANTHROPIC)

# Requires a paid browserbase subscription for authentication handling
portia = Portia(config=my_config,
                tools=PortiaToolRegistry(my_config) + [
                    BrowserToolForUrl(url="https://www.github.com",
                                      infrastructure_option=BrowserInfrastructureOption.REMOTE)],
                # CLI execution hooks mean authentication requests will be output to the CLI. You can customize these in your application.
                execution_hooks=CLIExecutionHooks())

plan_run = portia.run(task, end_user="end_user")
