Commits (24)
76cc39b: allow free modelclientoptions (sameelarif, Aug 29, 2025)
28f4b6e: Merge branch 'main' into sameel/stg-692-azurebedrock-api-integration-… (sameelarif, Sep 3, 2025)
1a415bb: Merge branch 'main' into sameel/stg-692-azurebedrock-api-integration-… (sameelarif, Sep 9, 2025)
04fb315: change zod ver to working build (sameelarif, Sep 10, 2025)
8eccd56: send client options on every request (sameelarif, Sep 15, 2025)
c6a752d: test bedrock file (filip-michalsky, Sep 23, 2025)
2931804: add azure test file (filip-michalsky, Sep 23, 2025)
2f3b8b9: fix bedrock test (sameelarif, Sep 24, 2025)
be8b7a4: Merge branch 'main' into sameel/stg-692-azurebedrock-api-integration-… (sameelarif, Sep 25, 2025)
27c722c: Update pnpm-lock.yaml (sameelarif, Sep 25, 2025)
69c3d93: better modelclientoption api handling (sameelarif, Sep 26, 2025)
467dade: dont override region (sameelarif, Sep 26, 2025)
0735ca3: fix bedrock example (sameelarif, Sep 26, 2025)
76b44ae: lint (sameelarif, Sep 30, 2025)
18937ee: read aws creds from client options obj (sameelarif, Sep 30, 2025)
0af4acf: update evals cli docs (#1096) (miguelg719, Sep 26, 2025)
c762944: adding support for new claude 4.5 sonnet agent model (#1099) (Kylejeong2, Sep 29, 2025)
4bd7412: properly convert custom / mcp tools to anthropic cua format (#1103) (tkattkat, Oct 1, 2025)
ce07cfa: Add current date and page url to agent context (#1102) (miguelg719, Oct 1, 2025)
06ae0e6: Additional agent logging (#1104) (miguelg719, Oct 1, 2025)
9fe40fd: fix system prompt (miguelg719, Oct 2, 2025)
938b51c: remove dup log (miguelg719, Oct 2, 2025)
607b4c3: pass modelClientOptions for stagehand agent (miguelg719, Oct 2, 2025)
adec13c: Merge branch 'main' into sameel/stg-692-azurebedrock-api-integration-… (sameelarif, Oct 3, 2025)
3 changes: 0 additions & 3 deletions CHANGELOG.md
@@ -233,15 +233,13 @@
We're thrilled to announce the release of Stagehand 2.0, bringing significant improvements to make browser automation more powerful, faster, and easier to use than ever before.

### 🚀 New Features

- **Introducing `stagehand.agent`**: A powerful new way to integrate SOTA Computer use models or Browserbase's [Open Operator](https://operator.browserbase.com) into Stagehand with one line of code! Perfect for multi-step workflows and complex interactions. [Learn more](https://docs.stagehand.dev/concepts/agent)
- **Lightning-fast `act` and `extract`**: Major performance improvements to make your automations run significantly faster.
- **Enhanced Logging**: Better visibility into what's happening during automation with improved logging and debugging capabilities.
- **Comprehensive Documentation**: A completely revamped documentation site with better examples, guides, and best practices.
- **Improved Error Handling**: More descriptive errors and better error recovery to help you debug issues faster.

### 🛠️ Developer Experience

- **Better TypeScript Support**: Enhanced type definitions and better IDE integration
- **Better Error Messages**: Clearer, more actionable error messages to help you debug faster
- **Improved Caching**: More reliable action caching for better performance
@@ -502,7 +500,6 @@
- [#316](https://github.com/browserbase/stagehand/pull/316) [`902e633`](https://github.com/browserbase/stagehand/commit/902e633e126a58b80b757ea0ecada01a7675a473) Thanks [@kamath](https://github.com/kamath)! - rename browserbaseResumeSessionID -> browserbaseSessionID

- [#296](https://github.com/browserbase/stagehand/pull/296) [`f11da27`](https://github.com/browserbase/stagehand/commit/f11da27a20409c240ceeea2003d520f676def61a) Thanks [@kamath](https://github.com/kamath)! - - Deprecate fields in `init` in favor of constructor options

- Deprecate `initFromPage` in favor of `browserbaseResumeSessionID` in constructor
- Rename `browserBaseSessionCreateParams` -> `browserbaseSessionCreateParams`

6 changes: 5 additions & 1 deletion lib/StagehandPage.ts
@@ -739,6 +739,7 @@ ${scriptContent} \
const result = await this.api.act({
...observeResult,
frameId: this.rootFrameId,
modelClientOptions: this.stagehand["modelClientOptions"],
Collaborator: is this required?

Author (Member): We will need this once we add self-healing.

});
this.stagehand.addToHistory("act", observeResult, result);
return result;
@@ -836,7 +837,10 @@ ${scriptContent} \
if (!instructionOrOptions) {
let result: ExtractResult<T>;
if (this.api) {
result = await this.api.extract<T>({ frameId: this.rootFrameId });
result = await this.api.extract<T>({
frameId: this.rootFrameId,
modelClientOptions: this.stagehand["modelClientOptions"],
Collaborator: is this required?

Author (Member): Yes, because otherwise it doesn't get sent to the API. We need this param on all API calls now.

});
} else {
result = await this.extractHandler.extract();
}
3 changes: 2 additions & 1 deletion lib/a11y/utils.ts
@@ -183,7 +183,8 @@ export async function buildBackendIdMaps(
if (n.contentDocument && locate(n.contentDocument)) return true;
return false;
} else {
if (n.backendNodeId === backendNodeId) return (iframeNode = n), true;
if (n.backendNodeId === backendNodeId)
return ((iframeNode = n), true);
return (
(n.children?.some(locate) ?? false) ||
(n.contentDocument ? locate(n.contentDocument) : false)
30 changes: 20 additions & 10 deletions lib/api.ts
@@ -48,7 +48,6 @@ export class StagehandAPI {

async init({
modelName,
modelApiKey,
domSettleTimeoutMs,
verbose,
debugDom,
@@ -59,11 +58,6 @@
browserbaseSessionCreateParams,
browserbaseSessionID,
}: StartSessionParams): Promise<StartSessionResult> {
if (!modelApiKey) {
throw new StagehandAPIError("modelApiKey is required");
}
this.modelApiKey = modelApiKey;

const region = browserbaseSessionCreateParams?.region;
if (region && region !== "us-west-2") {
return { sessionId: browserbaseSessionID ?? null, available: false };
@@ -186,10 +180,19 @@
const queryString = urlParams.toString();
const url = `/sessions/${this.sessionId}/${method}${queryString ? `?${queryString}` : ""}`;

const response = await this.request(url, {
method: "POST",
body: JSON.stringify(args),
});
// Extract modelClientOptions from args if present
const modelClientOptions = (
args as { modelClientOptions?: Record<string, unknown> }
)?.modelClientOptions;

const response = await this.request(
url,
{
method: "POST",
body: JSON.stringify(args),
},
modelClientOptions,
);

if (!response.ok) {
const errorBody = await response.text();
@@ -248,6 +251,7 @@
private async request(
path: string,
options: RequestInit = {},
modelClientOptions?: Record<string, unknown>,
): Promise<Response> {
const defaultHeaders: Record<string, string> = {
"x-bb-api-key": this.apiKey,
@@ -261,6 +265,12 @@
"x-sdk-version": STAGEHAND_VERSION,
};

// Add modelClientOptions as a header if provided
if (modelClientOptions) {
defaultHeaders["x-model-client-options"] =
JSON.stringify(modelClientOptions);
Collaborator: why send it as a header? modelClientOptions already gets sent in the payload. A stringified JSON in a header is probably not the move.

}

if (options.method === "POST" && options.body) {
defaultHeaders["Content-Type"] = "application/json";
}
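The header-forwarding change above can be sketched in isolation. The standalone `buildRequestHeaders` helper below is illustrative only (it is not part of the codebase); it mirrors how `execute` lifts `modelClientOptions` out of the request args and `request` serializes them into the `x-model-client-options` header:

```typescript
// Illustrative sketch: model client options ride along with every API call
// and, when present, are also serialized into a dedicated request header.
type RequestArgs = {
  modelClientOptions?: Record<string, unknown>;
} & Record<string, unknown>;

function buildRequestHeaders(
  apiKey: string,
  args: RequestArgs,
): Record<string, string> {
  const headers: Record<string, string> = {
    "x-bb-api-key": apiKey,
    "Content-Type": "application/json",
  };
  // Forward the options as stringified JSON when present (note the reviewer's
  // concern above: the JSON payload already carries the same options).
  if (args.modelClientOptions) {
    headers["x-model-client-options"] = JSON.stringify(args.modelClientOptions);
  }
  return headers;
}

// Example: Bedrock-style options accompanying an act() request.
const headers = buildRequestHeaders("bb-key", {
  action: "click the login button",
  modelClientOptions: { region: "us-east-1" },
});
```

Sending structured data as a stringified JSON header is a design trade-off; as the reviewer notes, the payload alone would suffice if the server reads options from the body.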
46 changes: 30 additions & 16 deletions lib/index.ts
@@ -45,7 +45,11 @@ import { LLMProvider } from "./llm/LLMProvider";
import { StagehandLogger } from "./logger";
import { connectToMCPServer } from "./mcp/connection";
import { resolveTools } from "./mcp/utils";
import { isRunningInBun, loadApiKeyFromEnv } from "./utils";
import {
isRunningInBun,
loadApiKeyFromEnv,
loadBedrockClientOptions,
} from "./utils";

dotenv.config({ path: ".env" });

@@ -587,28 +591,38 @@ export class Stagehand {
if (!modelClientOptions?.apiKey) {
// If no API key is provided, try to load it from the environment
if (LLMProvider.getModelProvider(this.modelName) === "aisdk") {
modelApiKey = loadApiKeyFromEnv(
this.modelName.split("/")[0],
this.logger,
);
const provider = this.modelName.split("/")[0];

// Special handling for Amazon Bedrock's complex authentication
if (provider === "bedrock") {
const bedrockOptions = loadBedrockClientOptions(this.logger);
this.modelClientOptions = {
...modelClientOptions,
...bedrockOptions,
};
} else {
// Standard single API key handling for other AISDK providers
modelApiKey = loadApiKeyFromEnv(provider, this.logger);
this.modelClientOptions = {
...modelClientOptions,
apiKey: modelApiKey,
};
}
} else {
// Temporary add for legacy providers
modelApiKey =
LLMProvider.getModelProvider(this.modelName) === "openai"
? process.env.OPENAI_API_KEY ||
this.llmClient?.clientOptions?.apiKey
? process.env.OPENAI_API_KEY
: LLMProvider.getModelProvider(this.modelName) === "anthropic"
? process.env.ANTHROPIC_API_KEY ||
this.llmClient?.clientOptions?.apiKey
? process.env.ANTHROPIC_API_KEY
: LLMProvider.getModelProvider(this.modelName) === "google"
? process.env.GOOGLE_API_KEY ||
this.llmClient?.clientOptions?.apiKey
? process.env.GOOGLE_API_KEY
: undefined;
this.modelClientOptions = {
...modelClientOptions,
apiKey: modelApiKey,
};
}
this.modelClientOptions = {
...modelClientOptions,
apiKey: modelApiKey,
};
} else {
this.modelClientOptions = modelClientOptions;
}
@@ -756,7 +770,7 @@
logger: this.logger,
});

const modelApiKey = this.modelClientOptions?.apiKey;
const modelApiKey = this.modelClientOptions?.apiKey as string;
const { sessionId, available } = await this.apiClient.init({
modelName: this.modelName,
modelApiKey: modelApiKey,
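The diff calls `loadBedrockClientOptions` but does not show its body. Since Bedrock authentication needs more than a single API key, a plausible sketch (the function body and option names here are assumptions, not the actual implementation) would gather the standard AWS environment variables:

```typescript
// Hypothetical sketch of loadBedrockClientOptions: collect AWS credentials
// from the environment into a client-options object. Only variables that are
// actually set are copied, so user-supplied options are not clobbered.
function loadBedrockClientOptionsSketch(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const options: Record<string, string> = {};
  if (env.AWS_ACCESS_KEY_ID) options.accessKeyId = env.AWS_ACCESS_KEY_ID;
  if (env.AWS_SECRET_ACCESS_KEY) {
    options.secretAccessKey = env.AWS_SECRET_ACCESS_KEY;
  }
  if (env.AWS_SESSION_TOKEN) options.sessionToken = env.AWS_SESSION_TOKEN;
  // Per the "dont override region" commit, region is only set when provided.
  if (env.AWS_REGION) options.region = env.AWS_REGION;
  return options;
}

const opts = loadBedrockClientOptionsSketch({
  AWS_ACCESS_KEY_ID: "AKIAEXAMPLE",
  AWS_SECRET_ACCESS_KEY: "example-secret",
  AWS_REGION: "us-east-1",
});
```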
4 changes: 2 additions & 2 deletions lib/llm/AnthropicClient.ts
@@ -1,3 +1,4 @@
import { CreateChatCompletionResponseError } from "@/types/stagehandErrors";
import Anthropic, { ClientOptions } from "@anthropic-ai/sdk";
import {
ImageBlockParam,
@@ -14,14 +15,13 @@ import {
LLMClient,
LLMResponse,
} from "./LLMClient";
import { CreateChatCompletionResponseError } from "@/types/stagehandErrors";

export class AnthropicClient extends LLMClient {
public type = "anthropic" as const;
private client: Anthropic;
private cache: LLMCache | undefined;
private enableCaching: boolean;
public clientOptions: ClientOptions;
public clientOptions?: ClientOptions;

constructor({
enableCaching = false,
6 changes: 3 additions & 3 deletions lib/llm/CerebrasClient.ts
@@ -1,5 +1,6 @@
import OpenAI from "openai";
import { CreateChatCompletionResponseError } from "@/types/stagehandErrors";
import type { ClientOptions } from "openai";
Contributor: style: Import ClientOptions from openai but use OpenAI.ClientOptions in constructor parameter - consider using consistent type reference.

Suggested change:
import type { ClientOptions } from "openai";
clientOptions?: ClientOptions;

import OpenAI from "openai";
import { zodToJsonSchema } from "zod-to-json-schema";
import { LogLine } from "../../types/log";
import { AvailableModel } from "../../types/model";
@@ -10,14 +11,13 @@ import {
LLMClient,
LLMResponse,
} from "./LLMClient";
import { CreateChatCompletionResponseError } from "@/types/stagehandErrors";

export class CerebrasClient extends LLMClient {
public type = "cerebras" as const;
private client: OpenAI;
private cache: LLMCache | undefined;
private enableCaching: boolean;
public clientOptions: ClientOptions;
public clientOptions?: ClientOptions;
public hasVision = false;

constructor({
2 changes: 1 addition & 1 deletion lib/llm/GoogleClient.ts
@@ -83,7 +83,7 @@ export class GoogleClient extends LLMClient {
clientOptions.apiKey = loadApiKeyFromEnv("google_legacy", logger);
}
this.clientOptions = clientOptions;
this.client = new GoogleGenAI({ apiKey: clientOptions.apiKey });
this.client = new GoogleGenAI({ apiKey: clientOptions.apiKey as string });
this.cache = cache;
this.enableCaching = enableCaching;
this.modelName = modelName;
3 changes: 1 addition & 2 deletions lib/llm/LLMClient.ts
@@ -12,7 +12,7 @@ import {
} from "ai";
import { ZodType } from "zod/v3";
import { LogLine } from "../../types/log";
import { AvailableModel, ClientOptions } from "../../types/model";
import { AvailableModel } from "../../types/model";

export interface ChatMessage {
role: "system" | "user" | "assistant";
@@ -100,7 +100,6 @@
public type: "openai" | "anthropic" | "cerebras" | "groq" | (string & {});
public modelName: AvailableModel | (string & {});
public hasVision: boolean;
public clientOptions: ClientOptions;
public userProvidedInstructions?: string;

constructor(modelName: AvailableModel, userProvidedInstructions?: string) {
45 changes: 21 additions & 24 deletions lib/llm/LLMProvider.ts
@@ -1,8 +1,22 @@
import { AISDKCustomProvider, AISDKProvider } from "@/types/llm";
import {
UnsupportedAISDKModelProviderError,
UnsupportedModelError,
UnsupportedModelProviderError,
} from "@/types/stagehandErrors";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { azure, createAzure } from "@ai-sdk/azure";
import { bedrock, createAmazonBedrock } from "@ai-sdk/amazon-bedrock";
import { cerebras, createCerebras } from "@ai-sdk/cerebras";
import { createDeepSeek, deepseek } from "@ai-sdk/deepseek";
import { createGoogleGenerativeAI, google } from "@ai-sdk/google";
import { createGroq, groq } from "@ai-sdk/groq";
import { createMistral, mistral } from "@ai-sdk/mistral";
import { createOpenAI, openai } from "@ai-sdk/openai";
import { createPerplexity, perplexity } from "@ai-sdk/perplexity";
import { createTogetherAI, togetherai } from "@ai-sdk/togetherai";
import { createXai, xai } from "@ai-sdk/xai";
import { ollama } from "ollama-ai-provider";
import { LogLine } from "../../types/log";
import {
AvailableModel,
@@ -17,19 +31,6 @@ import { GoogleClient } from "./GoogleClient";
import { GroqClient } from "./GroqClient";
import { LLMClient } from "./LLMClient";
import { OpenAIClient } from "./OpenAIClient";
import { openai, createOpenAI } from "@ai-sdk/openai";
import { anthropic, createAnthropic } from "@ai-sdk/anthropic";
import { google, createGoogleGenerativeAI } from "@ai-sdk/google";
import { xai, createXai } from "@ai-sdk/xai";
import { azure, createAzure } from "@ai-sdk/azure";
import { groq, createGroq } from "@ai-sdk/groq";
import { cerebras, createCerebras } from "@ai-sdk/cerebras";
import { togetherai, createTogetherAI } from "@ai-sdk/togetherai";
import { mistral, createMistral } from "@ai-sdk/mistral";
import { deepseek, createDeepSeek } from "@ai-sdk/deepseek";
import { perplexity, createPerplexity } from "@ai-sdk/perplexity";
import { ollama } from "ollama-ai-provider";
import { AISDKProvider, AISDKCustomProvider } from "@/types/llm";

const AISDKProviders: Record<string, AISDKProvider> = {
openai,
@@ -38,6 +39,7 @@ const AISDKProviders: Record<string, AISDKProvider> = {
xai,
azure,
groq,
bedrock,
cerebras,
togetherai,
mistral,
@@ -52,6 +54,7 @@ const AISDKProvidersWithAPIKey: Record<string, AISDKCustomProvider> = {
xai: createXai,
azure: createAzure,
groq: createGroq,
bedrock: createAmazonBedrock,
cerebras: createCerebras,
togetherai: createTogetherAI,
mistral: createMistral,
@@ -97,23 +100,18 @@ const modelToProviderMap: { [key in AvailableModel]: ModelProvider } = {
export function getAISDKLanguageModel(
subProvider: string,
subModelName: string,
apiKey?: string,
baseURL?: string,
modelClientOptions?: ClientOptions,
) {
if (apiKey) {
if (modelClientOptions && Object.keys(modelClientOptions).length > 0) {
const creator = AISDKProvidersWithAPIKey[subProvider];
if (!creator) {
throw new UnsupportedAISDKModelProviderError(
subProvider,
Object.keys(AISDKProvidersWithAPIKey),
);
}
// Create the provider instance with the API key and baseURL if provided
const providerConfig: { apiKey: string; baseURL?: string } = { apiKey };
if (baseURL) {
providerConfig.baseURL = baseURL;
}
const provider = creator(providerConfig);
// Create the provider instance with the custom configuration options
const provider = creator(modelClientOptions as Record<string, unknown>);
// Get the specific model from the provider
return provider(subModelName);
} else {
@@ -170,8 +168,7 @@ export class LLMProvider {
const languageModel = getAISDKLanguageModel(
subProvider,
subModelName,
clientOptions?.apiKey,
clientOptions?.baseURL,
clientOptions,
);

return new AISdkClient({
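The key behavioral change in `getAISDKLanguageModel` above is the branch condition: a provider factory is now used whenever any client options are supplied, not only when an `apiKey` is. A minimal self-contained sketch of that selection logic (the toy `factories`/`defaults` maps stand in for `createAmazonBedrock`, `bedrock`, etc., and the string return values are for illustration only):

```typescript
// Toy stand-ins for the AI SDK provider factories and default instances.
type ProviderFactory = (
  opts: Record<string, unknown>,
) => (model: string) => string;

const factories: Record<string, ProviderFactory> = {
  bedrock: (opts) => (model) => `bedrock:${model}:${JSON.stringify(opts)}`,
};
const defaults: Record<string, (model: string) => string> = {
  bedrock: (model) => `bedrock:${model}:default`,
};

function pickLanguageModel(
  subProvider: string,
  subModelName: string,
  modelClientOptions?: Record<string, unknown>,
): string {
  // Any non-empty options object (region, credentials, baseURL, ...) routes
  // through the configurable factory, matching the diff's new condition.
  if (modelClientOptions && Object.keys(modelClientOptions).length > 0) {
    const creator = factories[subProvider];
    if (!creator) throw new Error(`Unsupported AISDK provider: ${subProvider}`);
    return creator(modelClientOptions)(subModelName);
  }
  // No options: fall back to the provider's default, env-configured instance.
  return defaults[subProvider](subModelName);
}
```

This is why Bedrock works without an `apiKey`: its AWS credentials arrive as other keys on the options object, and the old `if (apiKey)` check would have skipped the factory entirely.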
10 changes: 5 additions & 5 deletions lib/llm/OpenAIClient.ts
@@ -1,3 +1,8 @@
import {
CreateChatCompletionResponseError,
StagehandError,
ZodSchemaValidationError,
} from "@/types/stagehandErrors";
import OpenAI, { ClientOptions } from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import {
@@ -21,11 +26,6 @@ import {
LLMClient,
LLMResponse,
} from "./LLMClient";
import {
CreateChatCompletionResponseError,
StagehandError,
ZodSchemaValidationError,
} from "@/types/stagehandErrors";

export class OpenAIClient extends LLMClient {
public type = "openai" as const;