Commit 1d3a075

add stagehand mcp server
1 parent ed82d14 commit 1d3a075

6 files changed, +1863 -0 lines changed

stagehand/README.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# Stagehand MCP Server

// TODO - Add an eye-catching cover image
<!-- ![cover](../assets/browserbase-mcp.png) -->

A Model Context Protocol (MCP) server that provides AI-powered web automation capabilities using [Stagehand](https://github.com/browserbase/stagehand). This server enables LLMs to interact with web pages, perform actions, extract data, and observe possible actions in a real browser environment.

## Get Started

1. Run `npm install` to install the necessary dependencies, then run `npm run build` to produce `dist/index.js`.

2. Set up your Claude Desktop configuration to use the server.

```json
{
  "mcpServers": {
    "stagehand": {
      "command": "node",
      "args": ["path/to/mcp-server-browserbase/stagehand/dist/index.js"],
      "env": {
        "BROWSERBASE_API_KEY": "<YOUR_BROWSERBASE_API_KEY>",
        "BROWSERBASE_PROJECT_ID": "<YOUR_BROWSERBASE_PROJECT_ID>",
        "OPENAI_API_KEY": "<YOUR_OPENAI_API_KEY>"
      }
    }
  }
}
```

3. Restart the Claude Desktop app. The tools should then be available when you click the 🔨 icon.

4. Start using the tools! Below is a demo video of Claude doing a Google search for OpenAI using the Stagehand MCP server, with Browserbase providing the remote headless browser.

<div>
    <a href="https://www.loom.com/share/9fe52fd9ab24421191223645366ec1c5">
      <p>Stagehand MCP Server demo - Watch Video</p>
    </a>
    <a href="https://www.loom.com/share/9fe52fd9ab24421191223645366ec1c5">
      <img style="max-width:300px;" src="https://cdn.loom.com/sessions/thumbnails/9fe52fd9ab24421191223645366ec1c5-f1a228ffe52d8065-full-play.gif">
    </a>
</div>
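
The Claude Desktop configuration from step 2 is read as strict JSON, and `JSON.parse` rejects details such as a trailing comma, so it can help to check the file before restarting the app. The following is a minimal sketch of such a check, assuming the server entry is named `stagehand` and uses the three environment variables shown in step 2. The helper name `checkStagehandConfig` is hypothetical and not part of the server.

```typescript
// Hypothetical helper: not part of the Stagehand MCP server.
// The three variable names come from the configuration in step 2.
const REQUIRED_ENV = [
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
  "OPENAI_API_KEY",
] as const;

interface ServerEntry {
  command?: string;
  args?: string[];
  env?: Record<string, string>;
}

// Returns a list of problems; an empty list means the entry looks usable.
function checkStagehandConfig(configText: string): string[] {
  let parsed: { mcpServers?: Record<string, ServerEntry> } | null;
  try {
    parsed = JSON.parse(configText);
  } catch (err) {
    // Strict JSON: trailing commas and comments end up here.
    return [`not valid JSON: ${(err as Error).message}`];
  }

  const entry = parsed?.mcpServers?.["stagehand"];
  if (!entry) return ['missing "mcpServers.stagehand" entry'];

  const problems: string[] = [];
  if (!entry.command) problems.push('"command" is not set');
  if (!entry.args || entry.args.length === 0) {
    problems.push('"args" must contain the path to dist/index.js');
  }
  for (const key of REQUIRED_ENV) {
    const value = entry.env?.[key];
    // Unreplaced placeholders such as <YOUR_OPENAI_API_KEY> count as missing.
    if (!value || /^<.*>$/.test(value)) problems.push(`env.${key} is not set`);
  }
  return problems;
}
```

Passing the contents of the configuration file to `checkStagehandConfig` returns an empty array when the entry parses and every key has a real value, and a list of human-readable problems otherwise.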

## Tools

### Stagehand commands

- **stagehand_navigate**
  - Navigate to any URL in the browser
  - Input:
    - `url` (string): The URL to navigate to

- **stagehand_act**
  - Perform an action on the web page
  - Inputs:
    - `action` (string): The action to perform (e.g., "click the login button")
    - `variables` (object, optional): Variables used in the action template

- **stagehand_extract**
  - Extract data from the web page based on an instruction and schema
  - Inputs:
    - `instruction` (string): Instruction for extraction (e.g., "extract the price of the item")
    - `schema` (object): JSON schema for the extracted data

- **stagehand_observe**
  - Observe actions that can be performed on the web page
  - Input:
    - `instruction` (string, optional): Instruction for observation
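
To make the input shapes above concrete, the sketch below builds the JSON-RPC `tools/call` messages that an MCP client would send for each of the four tools. The argument types mirror the tool list; the helper `buildToolCall`, the `stagehandCallId` counter, the example URL, and the example `price` schema are illustrative assumptions, not code from this server.

```typescript
// Argument shapes taken from the tool list above.
interface ToolArguments {
  stagehand_navigate: { url: string };
  stagehand_act: { action: string; variables?: Record<string, unknown> };
  stagehand_extract: { instruction: string; schema: Record<string, unknown> };
  stagehand_observe: { instruction?: string };
}

// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest<K extends keyof ToolArguments> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: K; arguments: ToolArguments[K] };
}

let stagehandCallId = 1;

// Hypothetical helper: builds the message only; it does not send anything.
function buildToolCall<K extends keyof ToolArguments>(
  name: K,
  args: ToolArguments[K],
): ToolCallRequest<K> {
  return {
    jsonrpc: "2.0",
    id: stagehandCallId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const exampleCalls = [
  buildToolCall("stagehand_navigate", { url: "https://example.com" }),
  buildToolCall("stagehand_act", { action: "click the login button" }),
  buildToolCall("stagehand_extract", {
    instruction: "extract the price of the item",
    // Illustrative JSON schema for the extracted data.
    schema: {
      type: "object",
      properties: { price: { type: "number" } },
      required: ["price"],
    },
  }),
  buildToolCall("stagehand_observe", {}),
];

for (const call of exampleCalls) {
  console.log(JSON.stringify(call));
}
```

Typing the arguments per tool name means a call such as `buildToolCall("stagehand_navigate", { action: "..." })` fails at compile time rather than at the server.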

### Resources

The server provides access to two types of resources:

1. **Console Logs** (`console://logs`)

   - Browser console output in text format
   - Includes all console messages from the browser

2. **Screenshots** (`screenshot://<name>`)
   - PNG images of captured screenshots
   - Accessible via the screenshot name specified during capture
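
The two URI schemes above can be told apart on the client side with a small amount of string handling. The sketch below classifies a resource URI and builds the JSON-RPC `resources/read` request an MCP client would send for it. The helper names `parseResourceUri` and `buildResourceRead` and the screenshot name `homepage` are hypothetical, chosen for illustration.

```typescript
type StagehandResource =
  | { kind: "console-logs" }
  | { kind: "screenshot"; name: string };

// Hypothetical helper: classifies a URI by the two schemes listed above.
// Returns null for anything that matches neither.
function parseResourceUri(uri: string): StagehandResource | null {
  if (uri === "console://logs") return { kind: "console-logs" };

  const prefix = "screenshot://";
  if (uri.startsWith(prefix) && uri.length > prefix.length) {
    return { kind: "screenshot", name: uri.slice(prefix.length) };
  }
  return null;
}

// Builds an MCP `resources/read` request (JSON-RPC 2.0) for a given URI.
function buildResourceRead(id: number, uri: string) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "resources/read" as const,
    params: { uri },
  };
}

// Example: a screenshot captured under the (assumed) name "homepage".
const screenshotUri = "screenshot://homepage";
console.log(parseResourceUri(screenshotUri));
console.log(JSON.stringify(buildResourceRead(1, screenshotUri)));
```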

## Key Features

- AI-powered web automation
- Perform actions on web pages
- Extract structured data from web pages
- Observe possible actions on web pages
- Simple and extensible API
- Model-agnostic support for various LLM providers

## License

Licensed under the MIT License.

Copyright 2024 Browserbase, Inc.
