|
| 1 | +# Stagehand Project |
| 2 | + |
| 3 | +This project uses Stagehand, which extends Playwright with `act`, `extract`, and `observe` methods on the Page class. |
| 4 | + |
| 5 | +`Stagehand` is a class that provides config, a `StagehandPage` object via `stagehand.page`, and a `StagehandContext` object via `stagehand.context`. |
| 6 | + |
| 7 | +`Page` is a class that extends the Playwright `Page` class and adds `act`, `extract`, and `observe` methods. |
| 8 | +`Context` is a class that extends the Playwright `BrowserContext` class. |
| 9 | + |
| 10 | +Use the following rules to write code for this project. |
| 11 | + |
| 12 | +- To take an action on the page like "click the sign in button", use Stagehand `act` like this: |
| 13 | + |
| 14 | +```typescript |
| 15 | +await page.act("Click the sign in button"); |
| 16 | +``` |
| 17 | + |
| 18 | +- To plan an instruction before taking an action, use Stagehand `observe` to get the action to execute. |
| 19 | + |
| 20 | +```typescript |
| 21 | +const [action] = await page.observe("Click the sign in button"); |
| 22 | +``` |
| 23 | + |
| 24 | +- The result of `observe` is an array of `ObserveResult` objects that can directly be used as params for `act` like this: |
| 25 | + |
| 26 | + ```typescript |
| 27 | + const [action] = await page.observe("Click the sign in button"); |
| 28 | + await page.act(action); |
| 29 | + ``` |
| 30 | + |
| 31 | +- When writing code that needs to extract data from the page, use Stagehand `extract`. Explicitly pass the following params by default: |
| 32 | + |
| 33 | +```typescript |
| 34 | +const { someValue } = await page.extract({ |
| 35 | + instruction: "the instruction to execute", |
| 36 | + schema: z.object({ |
| 37 | + someValue: z.string(), |
| 38 | + }), // The schema to extract |
| 39 | +}); |
| 40 | +``` |
| 41 | + |
| 42 | +## Initialize |
| 43 | + |
| 44 | +```typescript |
| 45 | +import { Stagehand } from "@browserbasehq/stagehand"; |
| 46 | +import StagehandConfig from "./stagehand.config"; |
| 47 | + |
| 48 | +const stagehand = new Stagehand(StagehandConfig); |
| 49 | +await stagehand.init(); |
| 50 | + |
| 51 | +const page = stagehand.page; // Playwright Page with act, extract, and observe methods |
| 52 | +const context = stagehand.context; // Playwright BrowserContext |
| 53 | +``` |
| 54 | + |
| 55 | +## Act |
| 56 | + |
| 57 | +You can cache the results of `observe` and use them as params for `act` like this: |
| 58 | + |
| 59 | +```typescript |
| 60 | +const instruction = "Click the sign in button"; |
| 61 | +const cachedAction = await getCache(instruction); |
| 62 | + |
| 63 | +if (cachedAction) { |
| 64 | + await page.act(cachedAction); |
| 65 | +} else { |
| 66 | + try { |
| 67 | + const [action] = await page.observe(instruction); |
| 68 | + await setCache(instruction, action); // Cache the single action that `act` accepts, not the whole array |
| 69 | + await page.act(action); |
| 70 | + } catch (error) { |
| 71 | + await page.act(instruction); // If observing or acting on the observed result fails, execute the instruction directly |
| 72 | + } |
| 73 | +} |
| 74 | +``` |
| 75 | + |
| 76 | +Be sure to cache the results of `observe` and use them as params for `act` to avoid unexpected DOM changes. Using `act` without caching will result in more unpredictable behavior. |
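
The snippet above calls `getCache` and `setCache` without defining them. A minimal in-memory sketch follows. It assumes a cached entry only needs a `selector` and `description`, plus optional `method` and `arguments`. The `CachedAction` type and the `Map`-backed store are illustrative assumptions, not part of Stagehand; a real project might persist entries to disk or a key-value store instead.

```typescript
// Hypothetical shape of a cached action. It is assumed to mirror the fields
// of an observe result that `act` needs; it is not taken from Stagehand's types.
type CachedAction = {
  selector: string;
  description: string;
  method?: string;
  arguments?: string[];
};

// In-memory store keyed by the instruction text. Entries are lost on restart.
const actionCache = new Map<string, CachedAction>();

// Returns the cached action for an instruction, or undefined on a cache miss.
async function getCache(instruction: string): Promise<CachedAction | undefined> {
  return actionCache.get(instruction);
}

// Stores one action per instruction. Accepts a single action or an array
// (as returned by `observe`) and keeps only the first element of an array.
async function setCache(
  instruction: string,
  value: CachedAction | CachedAction[],
): Promise<void> {
  const action = Array.isArray(value) ? value[0] : value;
  if (action) {
    actionCache.set(instruction, action);
  }
}
```

Keying on the raw instruction string keeps the sketch short. If the same instruction is used on different pages, including the page URL in the key is probably safer.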
| 77 | + |
| 78 | +Each `act` instruction should be as atomic and specific as possible, e.g. "Click the sign in button" or "Type 'hello' into the search input". |
| 79 | +AVOID instructions that span more than one step, e.g. "Order me pizza" or "Type in the search bar and hit enter". |
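
To illustrate the atomicity rule, a multi-step instruction such as "Type in the search bar and hit enter" can be split into separate `act` calls. The sketch below assumes only that the page exposes `act(instruction)`. The `ActCapable` interface, the `runAtomicSteps` helper, and the wording of the two steps are hypothetical, introduced for illustration.

```typescript
// Minimal structural type for anything with an `act` method (hypothetical;
// a Stagehand page is assumed to satisfy it).
interface ActCapable {
  act(instruction: string): Promise<unknown>;
}

// Runs each atomic instruction in order, waiting for one to finish
// before starting the next. Stops at the first step that throws.
async function runAtomicSteps(page: ActCapable, steps: string[]): Promise<void> {
  for (const step of steps) {
    await page.act(step);
  }
}

// Instead of: await page.act("Type in the search bar and hit enter");
const searchSteps = [
  "Type 'hello' into the search input",
  "Press Enter in the search input",
];
```

With an initialized page this would be called as `await runAtomicSteps(page, searchSteps);`. Keeping each step separate also means each one can be observed and cached on its own.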
| 80 | + |
| 81 | +## Extract |
| 82 | + |
| 83 | +If you are writing code that needs to extract data from the page, use Stagehand `extract`. |
| 84 | + |
| 85 | +```typescript |
| 86 | +const signInButtonText = await page.extract("extract the sign in button text"); |
| 87 | +``` |
| 88 | + |
| 89 | +You can also pass in params such as a Zod output schema and a `useTextExtract` flag (the flag appears in the last example of this section): |
| 90 | + |
| 91 | +```typescript |
| 92 | +const data = await page.extract({ |
| 93 | + instruction: "extract the sign in button text", |
| 94 | + schema: z.object({ |
| 95 | + text: z.string(), |
| 96 | + }), |
| 97 | +}); |
| 98 | +``` |
| 99 | + |
| 100 | +`schema` is a Zod schema that describes the data you want to extract. To extract an array, make sure to pass in a single object that contains the array, as follows: |
| 101 | + |
| 102 | +```typescript |
| 103 | +const data = await page.extract({ |
| 104 | + instruction: "extract the text inside all buttons", |
| 105 | + schema: z.object({ |
| 106 | + text: z.array(z.string()), |
| 107 | + }), |
| 108 | + useTextExtract: true, // Set true for larger-scale extractions (multiple paragraphs), or set false for small extractions (name, birthday, etc) |
| 109 | +}); |
| 110 | +``` |
| 111 | + |
| 112 | +## Agent |
| 113 | + |
| 114 | +Use the `agent` method to autonomously execute larger tasks like "Get the stock price of NVDA". |
| 115 | + |
| 116 | +```typescript |
| 117 | +// Navigate to a website |
| 118 | +await stagehand.page.goto("https://www.google.com"); |
| 119 | + |
| 120 | +const agent = stagehand.agent({ |
| 121 | + // You can use either OpenAI or Anthropic |
| 122 | + provider: "openai", |
| 123 | + // The model to use (claude-3-7-sonnet-20250219 or claude-3-5-sonnet-20240620 for Anthropic) |
| 124 | + model: "computer-use-preview", |
| 125 | + |
| 126 | + // Customize the system prompt |
| 127 | + instructions: `You are a helpful assistant that can use a web browser. |
| 128 | + Do not ask follow up questions, the user will trust your judgement.`, |
| 129 | + |
| 130 | + // Customize the API key |
| 131 | + options: { |
| 132 | + apiKey: process.env.OPENAI_API_KEY, |
| 133 | + }, |
| 134 | +}); |
| 135 | + |
| 136 | +// Execute the agent |
| 137 | +await agent.execute( |
| 138 | + "Apply for a library card at the San Francisco Public Library" |
| 139 | +); |
| 140 | +``` |