| 1 | +# Browserbase + Trigger.dev Integration |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Trigger.dev is a background job framework for creating, running, and monitoring background tasks, with built-in retry logic, scheduling, and observability. This integration pairs it with Browserbase's cloud browsers and showcases several use cases: |
| 6 | + |
| 7 | +- **PDF Processing**: Convert PDFs to images and upload to cloud storage |
| 8 | +- **Web Scraping**: Extract data from websites using Puppeteer and Browserbase |
| 9 | +- **Document Generation**: Create PDFs from React components |
| 10 | +- **Email Automation**: Scheduled tasks with email notifications |
| 11 | +- **Task Hierarchies**: Complex workflows with parent-child task relationships |
| 12 | + |
| 13 | +## Task Examples |
| 14 | + |
| 15 | +### 1. PDF to Image Conversion (`pdf-to-image.tsx`) |
| 16 | +Converts PDF documents to PNG images using MuPDF and uploads them to Cloudflare R2 storage. |
| 17 | + |
| 18 | +**Features:** |
| 19 | +- Downloads PDF from URL |
| 20 | +- Converts each page to PNG using `mutool` |
| 21 | +- Uploads images to R2 bucket |
| 22 | +- Returns array of image URLs |
| 23 | +- Automatic cleanup of temporary files |
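
The conversion step listed above can be sketched roughly as follows. This is a minimal illustration, not the contents of `pdf-to-image.tsx`: the helper names `mutoolConvertArgs` and `pageImageKeys`, the `page-%d.png` naming, and the key prefix are all assumptions made for the example. Only the `mutool convert -o` invocation comes from MuPDF itself.

```typescript
// Build the argument list for `mutool convert`, which writes one PNG per page.
// mutool replaces `%d` in the output pattern with the page number (starting at 1).
function mutoolConvertArgs(pdfPath: string, outDir: string): string[] {
  return ["convert", "-o", `${outDir}/page-%d.png`, pdfPath];
}

// Derive the object keys the uploaded page images would get.
// The `<prefix>/page-N.png` layout is a hypothetical naming scheme.
function pageImageKeys(prefix: string, pageCount: number): string[] {
  return Array.from({ length: pageCount }, (_, i) => `${prefix}/page-${i + 1}.png`);
}

// Usage sketch (not executed here):
//   import { execFileSync } from "node:child_process";
//   execFileSync("mutool", mutoolConvertArgs("/tmp/in.pdf", "/tmp/out"));
```

Passing the arguments as an array to `execFileSync` avoids shell interpolation of the PDF path, which matters when the path is derived from a user-supplied URL.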
| 24 | + |
| 25 | +### 2. Puppeteer Web Scraping (`puppeteer-*.tsx`) |
| 26 | + |
| 27 | +#### Basic Page Title Extraction |
| 28 | +Simple example that launches Puppeteer, navigates to Google, and logs the page title. |
| 29 | + |
| 30 | +#### Scraping with Browserbase Proxy |
| 31 | +Uses Browserbase's cloud browser infrastructure to scrape data through a proxy: |
| 32 | +- Connects to Browserbase WebSocket endpoint |
| 33 | +- Scrapes GitHub star count from trigger.dev website |
| 34 | +- Handles errors gracefully |
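
Connecting to Browserbase might look like the sketch below. It assumes the connect URL takes the API key as a query parameter, as in Browserbase's connection examples at the time of writing; `browserbaseEndpoint` is a hypothetical helper, and the current Browserbase docs should be checked before relying on the URL shape.

```typescript
// Build the WebSocket endpoint that puppeteer-core connects to.
// Assumption: the API key is passed as the `apiKey` query parameter.
function browserbaseEndpoint(apiKey: string | undefined): string {
  if (!apiKey) {
    // Fail fast so the task errors with a clear message instead of a socket error.
    throw new Error("BROWSERBASE_API_KEY is not set");
  }
  return `wss://connect.browserbase.com?apiKey=${encodeURIComponent(apiKey)}`;
}

// Usage sketch (not executed here):
//   import puppeteer from "puppeteer-core";
//   const browser = await puppeteer.connect({
//     browserWSEndpoint: browserbaseEndpoint(process.env.BROWSERBASE_API_KEY),
//   });
//   try {
//     /* scrape */
//   } finally {
//     await browser.close(); // release the remote session even on failure
//   }
```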
| 35 | + |
| 36 | +#### Webpage to PDF Generation |
| 37 | +Converts web pages to PDF documents: |
| 38 | +- Navigates to target URL |
| 39 | +- Generates PDF from webpage |
| 40 | +- Uploads PDF to cloud storage |
| 41 | + |
| 42 | +### 3. React PDF Generation (`react-pdf.tsx`) |
| 43 | +Creates PDF documents using React components and @react-pdf/renderer: |
| 44 | +- Accepts text payload |
| 45 | +- Renders PDF using React components |
| 46 | +- Uploads generated PDF to cloud storage |
| 47 | +- Returns PDF URL |
| 48 | + |
| 49 | +### 4. Hacker News Summarization (`summarize-hn.tsx`) |
| 50 | +Scheduled task that runs weekdays at 9 AM (London time): |
| 51 | + |
| 52 | +**Workflow:** |
| 53 | +1. **Scrapes Hacker News** - Gets top 3 articles |
| 54 | +2. **Batch Processing** - Triggers child tasks for each article |
| 55 | +3. **Content Extraction** - Scrapes full article content |
| 56 | +4. **AI Summarization** - Uses OpenAI GPT-4 to create summaries |
| 57 | +5. **Email Delivery** - Sends formatted email with summaries |
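
The schedule in the workflow above corresponds to a five-field cron pattern. The sketch below shows one plausible pattern and timezone and splits the pattern into named fields; the `{ pattern, timezone }` object shape and the `cronFields` helper are assumptions for illustration, so the exact option names should be checked against the Trigger.dev SDK version in use.

```typescript
// 09:00 on Monday through Friday. The timezone is carried separately,
// because a cron pattern itself has no timezone field.
const weekdayMorning = { pattern: "0 9 * * 1-5", timezone: "Europe/London" };

// Split a five-field cron pattern into named parts (hypothetical helper).
function cronFields(pattern: string) {
  const parts = pattern.trim().split(/\s+/);
  if (parts.length !== 5) {
    throw new Error(`expected 5 cron fields, got ${parts.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = parts;
  return { minute, hour, dayOfMonth, month, dayOfWeek };
}
```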
| 58 | + |
| 59 | +**Features:** |
| 60 | +- Scheduled execution with cron syntax |
| 61 | +- Batch task processing with `batchTriggerAndWait` |
| 62 | +- Retry logic with exponential backoff |
| 63 | +- Request interception to optimize scraping |
| 64 | +- Email templates using React Email |
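
Request interception, listed above, usually means aborting requests for assets that are not needed to read the article text. A minimal sketch follows; the set of blocked types is an assumption for illustration, not necessarily what `summarize-hn.tsx` blocks.

```typescript
// Resource types that are not needed to extract article text.
// The values match the strings returned by Puppeteer's `request.resourceType()`.
const BLOCKED_RESOURCE_TYPES = new Set(["image", "stylesheet", "font", "media"]);

// Decide whether an intercepted request should be aborted.
function shouldAbort(resourceType: string): boolean {
  return BLOCKED_RESOURCE_TYPES.has(resourceType);
}

// Usage sketch with Puppeteer (not executed here):
//   await page.setRequestInterception(true);
//   page.on("request", (req) => {
//     if (shouldAbort(req.resourceType())) void req.abort();
//     else void req.continue();
//   });
```

Keeping the decision in a pure function makes it easy to unit-test without launching a browser.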
| 65 | + |
| 66 | +### 5. Task Hierarchy (`taskHierarchy.ts`) |
| 67 | +Demonstrates complex task workflows with parent-child relationships: |
| 68 | +- **Root Task** → **Child Task** → **Grandchild Task** → **Great-grandchild Task** |
| 69 | +- Shows both synchronous (`triggerAndWait`) and asynchronous (`trigger`) patterns |
| 70 | +- Batch processing capabilities |
| 71 | +- Run hierarchy logging and visualization |
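
The difference between the two triggering patterns can be illustrated with plain promises. This is only an analogy and does not use the Trigger.dev SDK: `childTask`, `parentWaits`, and `parentFiresAndForgets` are hypothetical stand-ins, and in the SDK `trigger` returns a run handle rather than the child's output.

```typescript
type Log = string[];

// Hypothetical stand-in for a child task: finishes on a later microtask.
async function childTask(name: string, log: Log): Promise<string> {
  await Promise.resolve();
  log.push(`${name} finished`);
  return `${name}-result`;
}

// Analogue of `triggerAndWait`: the parent pauses until the child returns.
async function parentWaits(log: Log): Promise<string> {
  const result = await childTask("child", log);
  log.push("parent finished");
  return result;
}

// Analogue of `trigger`: the parent starts the child and carries on immediately.
function parentFiresAndForgets(log: Log): Promise<string> {
  const pending = childTask("child", log);
  log.push("parent finished");
  return pending;
}
```

With the waiting variant the log reads child first, then parent; with the fire-and-forget variant the parent's entry is written before the child completes.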
| 72 | + |
| 73 | +## Configuration |
| 74 | + |
| 75 | +### Trigger.dev Config (`trigger.config.ts`) |
```typescript
import { defineConfig } from "@trigger.dev/sdk/v3";
import { aptGet } from "@trigger.dev/build/extensions/core";
import { puppeteer } from "@trigger.dev/build/extensions/puppeteer";

export default defineConfig({
  project: "proj_ljbidlufugyxuhjxzkyy", // your Trigger Project ID
  logLevel: "log",
  retries: {
    enabledInDev: true, // also retry failed runs during local development
    default: {
      maxAttempts: 3,
      minTimeoutInMs: 1000,
      maxTimeoutInMs: 10000,
      factor: 2, // delay doubles after each failed attempt
      randomize: true, // add jitter to the delay
    },
  },
  build: {
    extensions: [
      // System packages installed into the deployed image
      aptGet({ packages: ["mupdf-tools", "curl"] }),
      // Installs Chrome for Puppeteer
      puppeteer(),
    ],
  },
});
```
| 99 | + |
| 100 | +**Key Features:** |
| 101 | +- **System Dependencies**: Installs MuPDF tools and curl via `aptGet` extension |
| 102 | +- **Puppeteer Extension**: Automatically sets up Puppeteer with Chrome |
| 103 | +- **Retry Configuration**: Global retry settings with exponential backoff |
| 104 | +- **Development Mode**: Retries enabled in development environment |
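
To make the retry settings concrete, the sketch below computes the delay before each retry. It assumes the common exponential formula `minTimeoutInMs * factor^(attempt - 1)`, capped at `maxTimeoutInMs`, with jitter modeled as a multiplier between 1 and 2. The SDK's exact jitter behavior may differ, and `retryDelayMs` is a hypothetical helper, not an SDK function.

```typescript
interface RetryOptions {
  maxAttempts: number;
  minTimeoutInMs: number;
  maxTimeoutInMs: number;
  factor: number;
  randomize: boolean;
}

// Delay before the retry that follows failed attempt number `attempt` (1-based).
// Returns undefined when no attempts are left.
function retryDelayMs(
  opts: RetryOptions,
  attempt: number,
  random: () => number = Math.random,
): number | undefined {
  if (attempt >= opts.maxAttempts) return undefined;
  // Assumed jitter model: scale the delay by a factor in [1, 2).
  const jitter = opts.randomize ? 1 + random() : 1;
  const raw = jitter * opts.minTimeoutInMs * Math.pow(opts.factor, attempt - 1);
  return Math.round(Math.min(opts.maxTimeoutInMs, raw));
}
```

With the values from the config and jitter disabled, the first retry would wait 1000 ms and the second 2000 ms; the third failure exhausts `maxAttempts`.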
| 105 | + |
| 106 | +## Dependencies |
| 107 | + |
| 108 | +Key packages used in this integration: |
| 109 | + |
| 110 | +```json |
| 111 | +{ |
| 112 | + "@trigger.dev/sdk": "3.0.13", |
| 113 | + "@trigger.dev/build": "3.0.13", |
| 114 | + "@aws-sdk/client-s3": "^3.651.0", |
| 115 | + "@react-pdf/renderer": "^3.4.4", |
| 116 | + "@react-email/components": "^0.1.0", |
| 117 | + "puppeteer": "^23.4.0", |
| 118 | + "puppeteer-core": "^23.5.3", |
| 119 | + "openai": "^4.67.3", |
| 120 | + "resend": "^4.0.0" |
| 121 | +} |
| 122 | +``` |
| 123 | + |
| 124 | +## Environment Variables |
| 125 | + |
| 126 | +Create a `.env.local` file in your project root with the following variables: |
| 127 | + |
| 128 | +```bash |
| 129 | +# Trigger.dev Configuration |
| 130 | +TRIGGER_SECRET_KEY=your-trigger-secret-key |
| 131 | + |
| 132 | +# Browserbase Configuration (for web scraping) |
| 133 | +BROWSERBASE_API_KEY=your-browserbase-api-key |
| 134 | +BROWSERBASE_PROJECT_ID=your-browserbase-project-id |
| 135 | + |
| 136 | +# Cloudflare R2 Storage Configuration (for file uploads) |
| 137 | +S3_ENDPOINT=https://your-account-id.r2.cloudflarestorage.com |
| 138 | +R2_ACCESS_KEY_ID=your-r2-access-key-id |
| 139 | +R2_SECRET_ACCESS_KEY=your-r2-secret-access-key |
| 140 | +S3_BUCKET=your-r2-bucket-name |
| 141 | + |
| 142 | +# OpenAI Configuration (for AI summarization) |
| 143 | +OPENAI_API_KEY=your-openai-api-key |
| 144 | + |
| 145 | +# Resend Configuration (for email delivery) |
| 146 | +RESEND_API_KEY=your-resend-api-key |
| 147 | +``` |
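
A task can fail late and confusingly when one of these variables is missing, so it may help to check them up front. The sketch below is a hypothetical helper (`missingEnv` is not part of any SDK) that reports which of the variables listed above are unset or blank.

```typescript
// Variable names taken from the `.env.local` listing above.
const REQUIRED_ENV: readonly string[] = [
  "TRIGGER_SECRET_KEY",
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
  "S3_ENDPOINT",
  "R2_ACCESS_KEY_ID",
  "R2_SECRET_ACCESS_KEY",
  "S3_BUCKET",
  "OPENAI_API_KEY",
  "RESEND_API_KEY",
];

// Return the names of required variables that are unset or only whitespace.
function missingEnv(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_ENV,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Usage sketch (not executed here):
//   const missing = missingEnv(process.env);
//   if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```

Tasks that only need a subset (for example, the PDF task needs only the storage variables) can pass their own `required` list.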
| 148 | + |
| 149 | +## Getting Started |
| 150 | + |
| 151 | +1. **Install dependencies:** |
| 152 | +```bash |
| 153 | +npm install |
| 154 | +``` |
| 155 | + |
| 156 | +2. **Set up environment variables:** |
| 157 | +Copy the environment variables above into a `.env.local` file and fill in your actual values. |
| 158 | + |
| 159 | +3. **Start development server:** |
| 160 | +```bash |
| 161 | +npm run dev |
| 162 | +``` |
| 163 | + |
| 164 | +4. **Deploy to Trigger.dev:** |
| 165 | +```bash |
| 166 | +npx trigger.dev@latest deploy |
| 167 | +``` |
| 168 | + |
| 169 | +## Machine Presets |
| 170 | + |
| 171 | +Tasks can specify machine requirements: |
```typescript
import { task } from "@trigger.dev/sdk/v3";

export const puppeteerBasicTask = task({
  id: "puppeteer-log-title",
  // Request a larger machine; headless Chrome is memory-hungry
  machine: {
    preset: "large-1x",
  },
  // ...
});
```
| 181 | + |
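| 182 | +The `machine.preset` field selects the CPU and memory allocated to a run. Presets range from `micro` up to `large-2x`; browser automation and PDF rendering tend to need more memory than the default preset provides, which is why the example above requests `large-1x`. |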
| 183 | + |
| 184 | +## Error Handling & Retries |
| 185 | + |
| 186 | +Error handling combines the global retry settings in `trigger.config.ts` with per-task cleanup: |
| 187 | +- **Automatic retries** with exponential backoff |
| 188 | +- **Resource cleanup** (browser instances, temporary files) |
| 189 | +- **Detailed logging** for debugging |
| 190 | +- **Graceful failure** handling |
| 191 | + |
| 192 | +## Integration Services |
| 193 | + |
| 194 | +This example integrates with several external services: |
| 195 | +- **Browserbase**: Cloud browser infrastructure for web scraping |
| 196 | +- **Cloudflare R2**: Object storage for files |
| 197 | +- **OpenAI**: AI-powered content summarization |
| 198 | +- **Resend**: Email delivery service |
| 199 | +- **React Email**: Email template rendering |
| 200 | + |
| 201 | +## Use Cases |
| 202 | + |
| 203 | +These examples are a starting point for: |
| 204 | +- **Document processing workflows** |
| 205 | +- **Web scraping and data extraction** |
| 206 | +- **Automated content generation** |
| 207 | +- **Scheduled reporting and notifications** |
| 208 | +- **Complex multi-step background processes** |