Commit 01e798f

Merge pull request #1 from filip-michalsky/fm/mcp-stagehand
add stagehand mcp server
2 parents ed82d14 + d445732 commit 01e798f

6 files changed: +1862 -0 lines changed

stagehand/README.md

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
# Stagehand MCP Server

![cover](../assets/stagehand-mcp.png)

A Model Context Protocol (MCP) server that provides AI-powered web automation capabilities using [Stagehand](https://github.com/browserbase/stagehand). This server enables LLMs to interact with web pages, perform actions, extract data, and observe possible actions in a real browser environment.

## Get Started

1. Run `npm install` to install the necessary dependencies, then run `npm run build` to produce `dist/index.js`.

2. Set up your Claude Desktop configuration to use the server.

```json
{
  "mcpServers": {
    "stagehand": {
      "command": "node",
      "args": ["path/to/mcp-server-browserbase/stagehand/dist/index.js"],
      "env": {
        "BROWSERBASE_API_KEY": "<YOUR_BROWSERBASE_API_KEY>",
        "BROWSERBASE_PROJECT_ID": "<YOUR_BROWSERBASE_PROJECT_ID>",
        "OPENAI_API_KEY": "<YOUR_OPENAI_API_KEY>"
      }
    }
  }
}
```

3. Restart the Claude Desktop app. The tools should then be available when you click the 🔨 icon.

4. Start using the tools! Below is a demo video of Claude running a Google search for OpenAI using the Stagehand MCP server, with Browserbase providing the remote headless browser.

<div>
    <a href="https://www.loom.com/share/9fe52fd9ab24421191223645366ec1c5">
      <p>Stagehand MCP Server demo - Watch Video</p>
    </a>
    <a href="https://www.loom.com/share/9fe52fd9ab24421191223645366ec1c5">
      <img style="max-width:300px;" src="https://cdn.loom.com/sessions/thumbnails/9fe52fd9ab24421191223645366ec1c5-f1a228ffe52d8065-full-play.gif">
    </a>
</div>
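
The Claude Desktop entry from step 2 can also be generated rather than edited by hand, which avoids JSON syntax slips such as trailing commas. The following is a minimal sketch, not part of this repository: `buildStagehandConfig` and the interface names are hypothetical, and the key values are dummy strings used only for illustration.

```typescript
// Hypothetical helper (not part of this repo): builds the "mcpServers"
// entry from step 2 as a typed object so it can be serialized safely.
interface StagehandEnv {
  BROWSERBASE_API_KEY: string;
  BROWSERBASE_PROJECT_ID: string;
  OPENAI_API_KEY: string;
}

interface McpServerEntry {
  command: string;
  args: string[];
  env: StagehandEnv;
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

const REQUIRED_KEYS: (keyof StagehandEnv)[] = [
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
  "OPENAI_API_KEY",
];

function buildStagehandConfig(serverPath: string, env: StagehandEnv): ClaudeDesktopConfig {
  // Reject empty values and unfilled "<YOUR_...>" placeholders early.
  for (const key of REQUIRED_KEYS) {
    const value = env[key].trim();
    if (value === "" || /^<.*>$/.test(value)) {
      throw new Error(`Missing value for ${key}`);
    }
  }
  return {
    mcpServers: {
      stagehand: { command: "node", args: [serverPath], env },
    },
  };
}

// Dummy values for illustration only; substitute real credentials.
const config = buildStagehandConfig(
  "path/to/mcp-server-browserbase/stagehand/dist/index.js",
  {
    BROWSERBASE_API_KEY: "dummy-browserbase-key",
    BROWSERBASE_PROJECT_ID: "dummy-project-id",
    OPENAI_API_KEY: "dummy-openai-key",
  },
);

// JSON.stringify never emits trailing commas, so the output is valid JSON.
const configJson = JSON.stringify(config, null, 2);
```

The resulting `configJson` string has the same shape as the configuration in step 2 and can be pasted into the Claude Desktop configuration file.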

## Tools

### Stagehand commands

- **stagehand_navigate**
  - Navigate to any URL in the browser
  - Input:
    - `url` (string): The URL to navigate to

- **stagehand_act**
  - Perform an action on the web page
  - Inputs:
    - `action` (string): The action to perform (e.g., "click the login button")
    - `variables` (object, optional): Variables used in the action template

- **stagehand_extract**
  - Extract data from the web page based on an instruction and schema
  - Inputs:
    - `instruction` (string): Instruction for extraction (e.g., "extract the price of the item")
    - `schema` (object): JSON schema for the extracted data

- **stagehand_observe**
  - Observe actions that can be performed on the web page
  - Input:
    - `instruction` (string, optional): Instruction for observation
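
To make the input shapes above concrete, here is a sketch of how an MCP client might frame calls to these tools as JSON-RPC `tools/call` requests. The argument types are transcribed from the list above; the `toolCall` helper, the type names, and the example URL are illustrative assumptions rather than code from this server.

```typescript
// Argument shapes transcribed from the tool list above.
interface NavigateArgs { url: string }
interface ActArgs { action: string; variables?: Record<string, unknown> }
interface ExtractArgs { instruction: string; schema: Record<string, unknown> }
interface ObserveArgs { instruction?: string }

// Maps each tool name to its argument type so calls are checked at compile time.
interface ToolArgs {
  stagehand_navigate: NavigateArgs;
  stagehand_act: ActArgs;
  stagehand_extract: ExtractArgs;
  stagehand_observe: ObserveArgs;
}

// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest<N extends keyof ToolArgs> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: N; arguments: ToolArgs[N] };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function toolCall<N extends keyof ToolArgs>(name: N, args: ToolArgs[N]): ToolCallRequest<N> {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// One example request per tool, reusing the examples from the list above.
const requests = [
  toolCall("stagehand_navigate", { url: "https://example.com" }),
  toolCall("stagehand_act", { action: "click the login button" }),
  toolCall("stagehand_extract", {
    instruction: "extract the price of the item",
    // A small JSON schema describing the expected result.
    schema: {
      type: "object",
      properties: { price: { type: "string" } },
      required: ["price"],
    },
  }),
  // `instruction` is optional for stagehand_observe.
  toolCall("stagehand_observe", {}),
];
```

In practice Claude Desktop builds these requests itself; the sketch only shows which arguments each tool expects.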

### Resources

The server provides access to two types of resources:

1. **Console Logs** (`console://logs`)

   - Browser console output in text format
   - Includes all console messages from the browser

2. **Screenshots** (`screenshot://<name>`)
   - PNG images of captured screenshots
   - Accessible via the screenshot name specified during capture
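
The two URI forms above can be told apart with a small parser. The sketch below is illustrative only: `parseResourceUri`, `readResourceRequest`, and the `StagehandResource` type are hypothetical names, and the request shape follows the MCP `resources/read` method rather than any code in this repository.

```typescript
// The two resource kinds described above, as a discriminated union.
type StagehandResource =
  | { kind: "console-logs" }
  | { kind: "screenshot"; name: string };

const CONSOLE_LOGS_URI = "console://logs";
const SCREENSHOT_PREFIX = "screenshot://";

// Hypothetical helper: classifies a resource URI, or returns null if it
// matches neither documented form.
function parseResourceUri(uri: string): StagehandResource | null {
  if (uri === CONSOLE_LOGS_URI) {
    return { kind: "console-logs" };
  }
  const hasPrefix = uri.indexOf(SCREENSHOT_PREFIX) === 0;
  if (hasPrefix && uri.length > SCREENSHOT_PREFIX.length) {
    // Everything after the scheme is treated as the screenshot name.
    return { kind: "screenshot", name: uri.slice(SCREENSHOT_PREFIX.length) };
  }
  return null;
}

// Hypothetical helper: frames an MCP "resources/read" request for a URI.
function readResourceRequest(id: number, uri: string) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "resources/read" as const,
    params: { uri },
  };
}
```

For example, `parseResourceUri("screenshot://homepage")` yields a screenshot resource named `homepage`, where `homepage` is an arbitrary example name.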

## Key Features

- AI-powered web automation
- Perform actions on web pages
- Extract structured data from web pages
- Observe possible actions on web pages
- Simple and extensible API
- Model-agnostic support for various LLM providers

## License

Licensed under the MIT License.

Copyright 2024 Browserbase, Inc.
