diff --git a/examples/integrations/cartesia/.cartesia/config.toml b/examples/integrations/cartesia/.cartesia/config.toml
Lines changed: 1 addition & 0 deletions
diff --git a/examples/integrations/cartesia/.env.example b/examples/integrations/cartesia/.env.example
Lines changed: 11 additions & 0 deletions
diff --git a/examples/integrations/cartesia/.gitignore b/examples/integrations/cartesia/.gitignore
Lines changed: 39 additions & 0 deletions
diff --git a/examples/integrations/cartesia/README.md b/examples/integrations/cartesia/README.md
Lines changed: 111 additions & 0 deletions
diff --git a/examples/integrations/cartesia/__init__.py b/examples/integrations/cartesia/__init__.py
diff --git a/examples/integrations/cartesia/cartesia.toml b/examples/integrations/cartesia/cartesia.toml
Lines changed: 8 additions & 0 deletions
diff --git a/examples/integrations/cartesia/config.py b/examples/integrations/cartesia/config.py
Lines changed: 26 additions & 0 deletions
diff --git a/examples/integrations/cartesia/config.toml b/examples/integrations/cartesia/config.toml
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+agent-id = 'agent_NKsQKSxugbsoA3ByZrJVQY'
@@ -0,0 +1,11 @@
+# Gemini API Key for language model
+GEMINI_API_KEY=your_gemini_api_key_here
+
+# Optional: Browserbase API credentials for cloud browser automation
+# If not set, will use local browser
+BROWSERBASE_API_KEY=your_browserbase_api_key_here
+BROWSERBASE_PROJECT_ID=your_browserbase_project_id_here
+
+# Optional: Model configuration
+MODEL_NAME=google/gemini-2.0-flash-exp
+MODEL_API_KEY=your_model_api_key_here
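A minimal sketch of how the variables in `.env.example` might be read at startup, assuming the template's `your_..._here` placeholders should count as unset. `Settings`, `load_settings`, and `_clean` are hypothetical names that do not appear in this change. Model selection is left out on purpose: `config.py` in this diff reads `MODEL_ID`, while the template above defines `MODEL_NAME`, so which variable is authoritative is unclear.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the values defined in .env.example."""

    gemini_api_key: str
    browserbase_api_key: Optional[str] = None
    browserbase_project_id: Optional[str] = None

    @property
    def use_cloud_browser(self) -> bool:
        # Both Browserbase values are needed; otherwise fall back to a local browser.
        return bool(self.browserbase_api_key and self.browserbase_project_id)


def _clean(value: Optional[str]) -> Optional[str]:
    """Treat empty strings and the template's placeholder values as unset."""
    if not value or (value.startswith("your_") and value.endswith("_here")):
        return None
    return value


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    gemini_key = _clean(env.get("GEMINI_API_KEY"))
    if gemini_key is None:
        raise RuntimeError("GEMINI_API_KEY is missing or still set to the placeholder")
    return Settings(
        gemini_api_key=gemini_key,
        browserbase_api_key=_clean(env.get("BROWSERBASE_API_KEY")),
        browserbase_project_id=_clean(env.get("BROWSERBASE_PROJECT_ID")),
    )
```

Passing the mapping explicitly keeps the function easy to test; calling `load_settings()` with no argument reads the real environment.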
@@ -0,0 +1,39 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*.pyd
+.Python
+
+# Virtual environments
+.env
+.venv/
+venv/
+env/
+
+virtualenv/
+
+# Conda environments
+conda-env/
+envs/
+.conda/
+conda-meta/
+
+# uv lockfile and pinned Python version
+uv.lock
+.python-version
+
+# Python package managers
+poetry.lock
+Pipfile.lock
+pip-log.txt
+
+# pyenv
+.pyenv/
+
+# Distribution / packaging
+*.egg-info/
+dist/
+build/
+
+# Editor / OS files
+.DS_Store
@@ -0,0 +1,111 @@
+# Voice Agent with Real-time Web Form Filling
+
+This project demonstrates a voice agent that conducts phone questionnaires and fills out a web form in real time, as answers arrive, using Stagehand browser automation.
+
+## Features
+
+- **Voice Conversations**: Natural voice interactions using Cartesia Line
+- **Real-time Form Filling**: Automatically fills web forms as answers are collected
+- **Browser Automation**: Uses Stagehand AI to interact with any web form
+- **Intelligent Mapping**: AI-powered mapping of voice answers to form fields
+- **Async Processing**: Non-blocking form filling maintains conversation flow
+- **Auto-submission**: Submits forms automatically when complete
+
+## Architecture
+
+```
+Voice Call (Cartesia) → Form Filling Node → Records Answer
+                              ↓
+                     Stagehand Browser API
+                              ↓
+                     Fills Web Form Field
+                              ↓
+                     Continues Conversation
+                              ↓
+                     Submits Form on Completion
+```
+
+## Setup
+
+1. Install dependencies:
+```bash
+pip install -r requirements.txt
+```
+
+2. Set up environment variables:
+```bash
+cp .env.example .env
+# Add your GEMINI_API_KEY
+```
+
+3. Run the agent:
+```bash
+python main.py
+```
+
+## Components
+
+### StagehandFormFiller
+- Manages browser automation
+- Opens and controls web forms
+- Maps conversation data to form fields
+- Handles form submission
+
+### FormFillingNode
+- Voice-optimized reasoning node
+- Integrates Stagehand browser automation
+- Manages async form filling during conversation
+- Provides status updates
+
+### FormFieldMapping
+- Maps YAML questions to web form fields
+- Transforms voice answers to form-compatible formats
+- Handles different field types (text, select, checkbox, etc.)
+
+## Configuration
+
+The system can be configured through:
+
+- `form.yaml`: Define questionnaire structure
+- `FORM_URL`: Target web form to fill
+- `headless`: Run browser in background (True) or visible (False)
+- `enable_browser`: Toggle browser automation on/off
+
+## Example Flow
+
+1. User calls the voice agent
+2. Agent asks: "What type of voice agent are you building?"
+3. User responds: "A customer service agent"
+4. System:
+   - Records the answer
+   - Opens browser to form (if not already open)
+   - Fills "Customer Service" in the role selection field
+   - Takes screenshot for debugging
+5. Agent asks next question
+6. Process continues until all questions answered
+7. Form is automatically submitted
+
+## Advanced Features
+
+- **Background Processing**: Form filling happens asynchronously
+- **Error Recovery**: Continues conversation even if form filling fails
+- **Progress Tracking**: Monitor form completion status
+- **Screenshot Debugging**: Captures screenshots after each field
+- **Flexible Mapping**: AI interprets answers for different field types
+
+## Testing
+
+Test with different scenarios:
+- Complete questionnaire flow
+- Interruptions and corrections
+- Various answer formats
+- Multi-page forms
+- Form validation errors
+
+## Production Considerations
+
+- Set `headless=True` for production
+- Configure proper error logging
+- Add retry logic for form submission
+- Implement form validation checks
+- Consider rate limiting for API calls
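The README's "Async Processing", "Error Recovery", and "Add retry logic" points can be sketched with plain `asyncio`. This is an illustration under stated assumptions, not the code in this change: `BackgroundFiller`, `fill_with_retry`, and the `fill(field, value)` coroutine are hypothetical stand-ins for whatever `StagehandFormFiller` actually exposes.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Dict, Set

logger = logging.getLogger("form_filling")

# Hypothetical signature: an async callable that fills one form field.
FillFn = Callable[[str, str], Awaitable[None]]


async def fill_with_retry(
    fill: FillFn, field: str, value: str, attempts: int = 3, base_delay: float = 0.5
) -> bool:
    """Try to fill a field, backing off exponentially; never raises."""
    for attempt in range(1, attempts + 1):
        try:
            await fill(field, value)
            return True
        except Exception as exc:  # keep the call alive whatever the browser does
            logger.warning("fill %r failed (%d/%d): %s", field, attempt, attempts, exc)
            if attempt < attempts:
                await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    return False


class BackgroundFiller:
    """Schedules form fills as tasks so the conversation is not blocked."""

    def __init__(self, fill: FillFn, attempts: int = 3, base_delay: float = 0.5):
        self._fill = fill
        self._attempts = attempts
        self._base_delay = base_delay
        self._tasks: Set[asyncio.Task] = set()
        self.results: Dict[str, bool] = {}  # field name -> filled successfully?

    def record_answer(self, field: str, value: str) -> None:
        # Must be called from inside a running event loop.
        task = asyncio.create_task(self._run(field, value))
        self._tasks.add(task)  # hold a reference so the task is not garbage-collected
        task.add_done_callback(self._tasks.discard)

    async def _run(self, field: str, value: str) -> None:
        self.results[field] = await fill_with_retry(
            self._fill, field, value, self._attempts, self._base_delay
        )

    async def drain(self) -> None:
        """Wait for all pending fills, e.g. just before submitting the form."""
        if self._tasks:
            await asyncio.gather(*list(self._tasks))
```

In this sketch `record_answer` returns immediately, so the agent can ask the next question while the fill runs; awaiting `drain()` before submission and inspecting `results` gives a simple form of the progress tracking the README mentions.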
@@ -0,0 +1,8 @@
+[app]
+name = "form-filling"
+
+[build]
+cmd = "echo 'No build cmd specified'"
+
+[run]
+cmd = "echo 'No run cmd specified'"
@@ -0,0 +1,26 @@
+import os
+
+DEFAULT_MODEL_ID = os.getenv("MODEL_ID", "gemini-2.5-flash")
+
+DEFAULT_TEMPERATURE = 0.7
+SYSTEM_PROMPT = """
+### You and your role
+You are a friendly assistant conducting a questionnaire.
+Be professional but conversational. Confirm answers when appropriate.
+If a user's answer is unclear, ask for clarification.
+For sensitive information, be especially tactful and professional.
+
+IMPORTANT: When you receive a clear answer from the user, use the record_answer tool to record their response.
+
+### Your tone
+When having a conversation, you should:
+- Always be polite and respectful, even when users are challenging
+- Be concise and brief but never curt. Keep your responses to 1-2 sentences and fewer than 35 words
+- When asking a question, be sure to ask in a short and concise manner
+- Only ask one question at a time
+
+If the user is rude, or curses, respond with exceptional politeness and genuine curiosity.
+You should always be polite.
+
+Remember, you're on the phone, so do not use emojis or abbreviations. Spell out units and dates.
+"""
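The limits stated in `SYSTEM_PROMPT` (1-2 sentences, under 35 words, one question at a time) can be checked mechanically when testing agent replies. The following is a rough heuristic sketch, not part of this change; `prompt_violations` and the two constants are hypothetical, and the sentence splitter is naive (a decimal such as "3.5" would count as a sentence break).

```python
import re
from typing import List

MAX_WORDS = 35      # the prompt asks for fewer than 35 words
MAX_SENTENCES = 2   # the prompt asks for 1-2 sentences


def prompt_violations(reply: str) -> List[str]:
    """Return a list of the prompt's length rules that `reply` breaks."""
    problems: List[str] = []

    word_count = len(reply.split())
    if word_count >= MAX_WORDS:
        problems.append(f"too many words: {word_count}")

    # Naive split on sentence-ending punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", reply) if s.strip()]
    if len(sentences) > MAX_SENTENCES:
        problems.append(f"too many sentences: {len(sentences)}")

    if reply.count("?") > 1:
        problems.append("more than one question")

    return problems
```

An empty list means the reply passed all three checks; the function says nothing about tone, which still needs human or model-based review.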
@@ -0,0 +1 @@
+agent-id = 'agent_NKsQKSxugbsoA3ByZrJVQY'