Commit 2e2a571

added langchain and crew ai stagehand integrations (#5)
1 parent 936fd29 commit 2e2a571

12 files changed, with 2957 additions and 21 deletions

examples/integrations/crewai/README.md

Lines changed: 23 additions & 1 deletion
@@ -15,4 +15,26 @@ AI Agents rely on tools to gather rich contexts and perform actions to achieve s
 Browserbase provides a `BrowserbaseLoadTool` that Agents can use to retrieve context from complex webpages, enabling them to:
 
 - Extract text from webpages using JavaScript or anti-bot mechanisms
-- Capture Images from webpages
+- Capture Images from webpages
+
+## Stagehand Integration
+
+Automate browser tasks using natural language instructions with CrewAI
+
+This tool integrates the Stagehand Python SDK with CrewAI, allowing agents to interact with websites and automate browser tasks using natural language instructions.
+
+### Description
+
+The StagehandTool wraps the Stagehand Python SDK to provide CrewAI agents with the ability to control a real web browser and interact with websites using three core primitives:
+
+- **Act**: Perform actions like clicking, typing, or navigating
+- **Extract**: Extract structured data from web pages
+- **Observe**: Identify and analyze elements on the page
+
+### Requirements
+
+Before using this tool, you will need:
+
+- A Browserbase account with API key and project ID
+- An API key for an LLM (OpenAI or Anthropic Claude)
+- The Stagehand Python SDK installed
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+# CrewAI Integration
+
+Automate browser tasks using natural language instructions with CrewAI
+
+This tool integrates the Stagehand Python SDK with CrewAI, allowing agents to interact with websites and automate browser tasks using natural language instructions.
+
+## Description
+
+The StagehandTool wraps the Stagehand Python SDK to provide CrewAI agents with the ability to control a real web browser and interact with websites using three core primitives:
+
+- **Act**: Perform actions like clicking, typing, or navigating
+- **Extract**: Extract structured data from web pages
+- **Observe**: Identify and analyze elements on the page
+
+## Requirements
+
+Before using this tool, you will need:
+
+- A Browserbase account with API key and project ID
+- An API key for an LLM (OpenAI or Anthropic Claude)
+- The Stagehand Python SDK installed
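
The requirements listed in this README can be checked before the tool is constructed. The following is a minimal sketch, assuming the credentials are supplied through the environment variables that the example script in this commit reads (`BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`, and either `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`). The helper name `missing_credentials` is hypothetical and is not part of CrewAI or the Stagehand SDK.

```python
import os
from typing import List, Mapping, Optional


def missing_credentials(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names of required settings that are absent or empty.

    Hypothetical helper for illustration; not part of any SDK.
    """
    env = os.environ if env is None else env
    missing = [
        name
        for name in ("BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID")
        if not env.get(name)
    ]
    # Either LLM provider key satisfies the model requirement.
    if not (env.get("OPENAI_API_KEY") or env.get("ANTHROPIC_API_KEY")):
        missing.append("OPENAI_API_KEY or ANTHROPIC_API_KEY")
    return missing


if __name__ == "__main__":
    problems = missing_credentials()
    if problems:
        print("Missing configuration: " + ", ".join(problems))
    else:
        print("All required credentials are set.")
```

Accepting the environment as a parameter keeps the check easy to exercise with a plain dictionary, without touching the real process environment.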
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+from crewai import Agent, Task, Crew
+from crewai_tools import StagehandTool
+from stagehand.schemas import AvailableModel
+import os
+
+# Get API keys from environment
+browserbase_api_key = os.environ.get("BROWSERBASE_API_KEY")
+browserbase_project_id = os.environ.get("BROWSERBASE_PROJECT_ID")
+model_api_key = os.environ.get("OPENAI_API_KEY")  # or ANTHROPIC_API_KEY
+
+# Initialize the tool
+stagehand_tool = StagehandTool(
+    api_key=browserbase_api_key,
+    project_id=browserbase_project_id,
+    model_api_key=model_api_key,
+    model_name=AvailableModel.GPT_4O,
+)
+
+# Create an agent
+researcher = Agent(
+    role="Web Researcher",
+    goal="Gather product information from an e-commerce website",
+    backstory="I specialize in extracting and analyzing web data.",
+    verbose=True,
+    tools=[stagehand_tool],
+)
+
+# Form submission task
+form_submission_task = Task(
+    description="""
+    Submit a contact form on example.com:
+    1. Go to example.com/contact
+    2. Fill out the contact form with:
+       - Name: John Doe
+
+       - Subject: Information Request
+       - Message: I would like to learn more about your services
+    3. Submit the form
+    4. Confirm the submission was successful
+    """,
+    agent=researcher,
+)
+
+# Run the crew
+crew = Crew(
+    agents=[researcher],
+    tasks=[form_submission_task],
+    verbose=True,
+)
+
+result = crew.kickoff()
+print(result)
+
+# Clean up resources
+stagehand_tool.close()
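
The script calls `stagehand_tool.close()` only after `crew.kickoff()` returns, so an exception raised during the run would skip the cleanup step. One option is `contextlib.closing` from the Python standard library, which calls `.close()` on exit even when the body raises. The sketch below uses a hypothetical stand-in class, `FakeTool`, rather than the real `StagehandTool`, so that it runs without credentials. It assumes only that the real tool exposes `close()`, as the script shows; `run_with_cleanup` is likewise a name invented for this example.

```python
from contextlib import closing


class FakeTool:
    """Hypothetical stand-in for a tool object that exposes close()."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def run_with_cleanup(tool, job):
    """Run job(tool) and call tool.close() afterwards, even if job raises."""
    with closing(tool):
        return job(tool)


if __name__ == "__main__":
    tool = FakeTool()
    try:
        # Simulate a crew run that fails partway through.
        run_with_cleanup(tool, lambda t: 1 / 0)
    except ZeroDivisionError:
        pass
    print("closed after failure:", tool.closed)
```

With the real objects, the job passed in would be something like `lambda _: crew.kickoff()`, and the trailing `stagehand_tool.close()` call would no longer be needed.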
Lines changed: 8 additions & 20 deletions
@@ -1,25 +1,13 @@
-# Langchain Integration
+# Langchain Integration Examples
 
-Add Browserbase to your Langchain application (Python).
+This directory contains examples of integrating Langchain with our web automation tools:
 
-## Introduction
+1. **Browserbase Integration**: A lightweight solution for web scraping and data extraction using our managed browser infrastructure.
 
-Langchain is a Python framework to build applications on top of large-language models (OpenAI, Llama, Gemini).
+2. **Stagehand Integration**: Full web automation capabilities using our open-source AI-powered browser automation SDK.
 
-Building on top of LLMs comes with many challenges:
+Choose the example that best fits your needs:
+- Use Browserbase for simple web scraping and data collection
+- Use Stagehand for complex automation workflows with AI-driven interactions
 
-- Gathering and preparing the data (context) and providing memory to models
-- Orchestrating tasks to match LLM API requirements (ex, rate limiting, chunking)
-- Parse the different LLM result format
-
-Langchain comes with a set of high-level concepts and tools to cope with those challenges:
-
-- Retrieval modules such as Document Loaders or Text splitter help with gathering and preparing the data provided to the models
-- Model I/O is a set of tools that help to normalize the APIs across multiple models (ex: Prompt Templates)
-- Agents and Tools help to build reasoning (ex: how to answer based on provided context, what actions to take)
-- Chains help in orchestrating all the above
-
-Browserbase provides a Document Loader to enable your Langchain application to browse the web to:
-
-- Extract text or raw HTML, including from web pages using JavaScript or dynamically rendered text
-- Load images via screenshots
+See the respective directories for detailed implementation guides.
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# Langchain Integration
+
+Add Browserbase to your Langchain application (Python).
+
+## Introduction
+
+Langchain is a Python framework to build applications on top of large-language models (OpenAI, Llama, Gemini).
+
+Building on top of LLMs comes with many challenges:
+
+- Gathering and preparing the data (context) and providing memory to models
+- Orchestrating tasks to match LLM API requirements (ex, rate limiting, chunking)
+- Parse the different LLM result format
+
+Langchain comes with a set of high-level concepts and tools to cope with those challenges:
+
+- Retrieval modules such as Document Loaders or Text splitter help with gathering and preparing the data provided to the models
+- Model I/O is a set of tools that help to normalize the APIs across multiple models (ex: Prompt Templates)
+- Agents and Tools help to build reasoning (ex: how to answer based on provided context, what actions to take)
+- Chains help in orchestrating all the above
+
+Browserbase provides a Document Loader to enable your Langchain application to browse the web to:
+
+- Extract text or raw HTML, including from web pages using JavaScript or dynamically rendered text
+- Load images via screenshots
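
The introduction in this README mentions chunking and text splitters without showing what they do. As a rough illustration only, the sketch below splits text into fixed-size, overlapping chunks in plain Python. It is not Langchain's implementation, and the function name `chunk_text` and its parameters are assumptions made for this example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that content cut at a
    boundary still appears intact in one of the two neighbouring chunks.
    Illustrative only; not Langchain's text splitter.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

Text loaded from a webpage could be passed through a function like this before being sent to a model with a limited context window.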
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+# Langchain JS
+
+## Integrate Stagehand with Langchain JS
+
+Stagehand can be integrated into Langchain JS by wrapping Stagehand's browser automation functionality with the StagehandToolkit.
+
+This toolkit provides specialized tools such as navigate, act, extract, and observe, all powered by Stagehand's underlying capabilities.
+
+For more details on this integration and how to work with Langchain, see the official Langchain documentation.
+
+## Use the tools
+
+- **stagehand_navigate**: Navigate to a specific URL.
+- **stagehand_act**: Perform browser automation tasks like clicking buttons and typing in fields.
+- **stagehand_extract**: Extract structured data from pages using Zod schemas.
+- **stagehand_observe**: Investigate the DOM for possible actions or relevant elements.
+
+## Remote Browsers (Browserbase)
+
+Instead of `env: "LOCAL"`, specify `env: "BROWSERBASE"` and pass in your Browserbase credentials through environment variables:
+- `BROWSERBASE_API_KEY`
+- `BROWSERBASE_PROJECT_ID`
+
+## Using LangGraph Agents
+
+The StagehandToolkit can also be plugged into LangGraph's existing agent system. This lets you orchestrate more complex flows by combining Stagehand's tools with other Langchain tools.
+
+With the StagehandToolkit, you can quickly integrate natural-language-driven browser automation into workflows supported by Langchain. This enables use cases such as:
+
+- Searching, extracting, and summarizing data from websites
+- Automating login flows
+- Navigating or clicking through forms based on instructions from a larger chain of agents
+
+Consult Stagehand's and Langchain's official references for troubleshooting and advanced integrations or reach out to us on [Slack](https://stagehand.dev/slack).
