Automated job search using Browserbase Stagehand and Streak CRM integration.
Job Scout searches multiple job boards (Hiring Cafe, WorkAtAStartup, and YC Jobs) using Browserbase's Stagehand SDK, extracts job listings that match configurable criteria, and creates Boxes in your Streak.com CRM pipeline with the job details.
- Multi-source job search across Hiring Cafe, WorkAtAStartup, and YC Jobs
- AI-powered browser automation using OpenAI GPT-4o-mini with Browserbase Stagehand
- Streak CRM integration to track opportunities with field mapping
- Configurable search criteria (keywords, location, salary, remote-only)
- Session replay for debugging with Browserbase Inspector
- Deduplication to avoid duplicate entries
- Testing tools for API connectivity and box creation
- Intelligent data extraction with natural language instructions
- Node.js 18+
- Browserbase API key
- Streak CRM API key and pipeline key
- OpenAI API key (for AI-powered browser automation)
- Clone the repository:
  git clone <your-repo-url>
  cd jobscout
- Install dependencies:
  npm install
- Set up environment variables:
  cp .env.example .env
  # Edit .env with your API keys and field mappings
  # Required: BROWSERBASE_API_KEY, STREAK_API_KEY, OPENAI_API_KEY
- Test your setup:
  # Verify Streak connectivity
  npm run dev -- --verify-streak
  # List available pipelines
  npm run dev -- --list-pipelines
  # Test box creation (dry run)
  npm run dev -- --create-test-box --dry-run
- Build the project:
  npm run build

# Basic search (when scrapers are implemented)
npm start -- --keywords="devrel,ai"
# Remote-only search with location filter
npm start -- --keywords="typescript,react" --remote-only --location="US"
# Dry run (no Streak boxes created)
npm start -- --keywords="python" --dry-run

- BROWSERBASE_API_KEY: Your Browserbase API key for browser automation
- STREAK_API_KEY: Your Streak CRM API key
- OPENAI_API_KEY: Your OpenAI API key for AI-powered browser automation

- STREAK_PIPELINE_KEY: Your Streak pipeline key
- STREAK_DEFAULT_STAGE_KEY: Default stage for new boxes
- STREAK_FIELD_JOB_TITLE: Custom field key for job title
- STREAK_FIELD_SOURCE: Custom field key for source
- STREAK_FIELD_LOCATION: Custom field key for location
- STREAK_FIELD_WEBSITE: Custom field key for website

- DEFAULT_KEYWORDS: Comma-separated default keywords
- DEFAULT_LOCATION: Default location preference
- DEFAULT_SALARY_MIN: Minimum salary requirement

- -k, --keywords <keywords>: Comma-separated keywords to search for
- -r, --remote-only: Only search for remote jobs
- -l, --location <location>: Preferred location
- -s, --salary-min <amount>: Minimum salary requirement
- -d, --dry-run: Run without creating Streak boxes
- --list-pipelines: List available Streak pipelines
- --verify-streak: Test Streak API connectivity
- --create-test-box: Create a test box in Streak

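The environment variables and CLI options above can be resolved into a single search-criteria object. The following is a minimal sketch in plain TypeScript, not the project's actual config module (which, per the project structure, validates with zod). The names `SearchCriteria`, `CliFlags`, and `buildCriteria` are hypothetical, and the rule that CLI flags take precedence over the `DEFAULT_*` variables is an assumption.

```typescript
// Hypothetical shape for the resolved search criteria.
interface SearchCriteria {
  keywords: string[];
  location?: string;
  salaryMin?: number;
  remoteOnly: boolean;
  dryRun: boolean;
}

// Hypothetical shape for flags already parsed by the CLI layer.
interface CliFlags {
  keywords?: string;
  location?: string;
  salaryMin?: string;
  remoteOnly?: boolean;
  dryRun?: boolean;
}

// Keys listed as required in the setup steps.
const REQUIRED_KEYS = ["BROWSERBASE_API_KEY", "STREAK_API_KEY", "OPENAI_API_KEY"];

// Split "devrel, ai" into ["devrel", "ai"], dropping empty entries.
function splitList(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

function buildCriteria(
  env: Record<string, string | undefined>,
  flags: CliFlags
): SearchCriteria {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // Assumption: a CLI flag, when present, overrides the DEFAULT_* variable.
  const rawSalary = flags.salaryMin ?? env["DEFAULT_SALARY_MIN"];
  const salaryMin = rawSalary === undefined ? undefined : Number(rawSalary);
  if (salaryMin !== undefined && Number.isNaN(salaryMin)) {
    throw new Error(`Invalid minimum salary: ${rawSalary}`);
  }

  const cliKeywords = splitList(flags.keywords);
  return {
    keywords: cliKeywords.length > 0 ? cliKeywords : splitList(env["DEFAULT_KEYWORDS"]),
    location: flags.location ?? env["DEFAULT_LOCATION"],
    salaryMin,
    remoteOnly: flags.remoteOnly ?? false,
    dryRun: flags.dryRun ?? false,
  };
}
```

Failing fast on missing keys keeps a misconfigured run from starting a browser session before the error surfaces.
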
Job Scout leverages OpenAI GPT-4o-mini through Browserbase's Stagehand SDK to provide intelligent, natural language-driven web automation:
- Natural Language Instructions: The AI agent receives human-like instructions like "Click the 'Apply Directly' button" or "Extract the company name from this job card"
- Intelligent Navigation: Automatically handles complex web interactions, form filling, and multi-tab management
- Adaptive Data Extraction: Uses AI to understand and extract structured data from dynamic web content
- Error Recovery: Intelligently handles UI changes and unexpected page layouts
- Smart Tab Management: Automatically switches to newly opened tabs to extract job application URLs
- Context-Aware Extraction: Understands job posting layouts and extracts relevant information
- Natural Language Processing: Processes job titles and descriptions to identify relevant opportunities
- Intelligent Filtering: Uses AI to match job postings against search criteria
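For comparison with the AI-based matching described above, the same search criteria can also be checked deterministically. This is a sketch only and not the project's implementation: `JobPosting`, `MatchCriteria`, and `matchesCriteria` are hypothetical names, keyword and location checks are crude case-insensitive substring tests, and treating a posting with no listed salary as a match is an assumption.

```typescript
// Hypothetical, simplified job posting shape.
interface JobPosting {
  title: string;
  company: string;
  location: string;
  remote: boolean;
  salaryMin?: number;
}

// Hypothetical criteria mirroring the CLI options (keywords, location, salary, remote-only).
interface MatchCriteria {
  keywords: string[];
  location?: string;
  salaryMin?: number;
  remoteOnly: boolean;
}

function matchesCriteria(job: JobPosting, criteria: MatchCriteria): boolean {
  const title = job.title.toLowerCase();

  // Any keyword appearing in the title counts as a hit; no keywords means no keyword filter.
  const keywordOk =
    criteria.keywords.length === 0 ||
    criteria.keywords.some((keyword) => title.includes(keyword.toLowerCase()));

  const remoteOk = !criteria.remoteOnly || job.remote;

  const locationOk =
    criteria.location === undefined ||
    job.location.toLowerCase().includes(criteria.location.toLowerCase());

  // Assumption: postings without a listed salary are kept rather than discarded.
  const salaryOk =
    criteria.salaryMin === undefined ||
    job.salaryMin === undefined ||
    job.salaryMin >= criteria.salaryMin;

  return keywordOk && remoteOk && locationOk && salaryOk;
}
```

A check like this could run before or after the AI extraction step as a cheap pre-filter; substring matching on locations is easy to fool (for example, "us" matches "Austin"), so a real filter would need stricter rules.
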
// Example: AI-powered job data extraction
// Assumes `stagehand` is an already-initialized Stagehand instance.
import { z } from "zod";

const jobData = await stagehand.extract({
  // The schema describes the structured result the model should return.
  schema: z.object({
    title: z.string().describe("The job title"),
    company: z.string().describe("The company name"),
    url: z.string().describe("The job application URL"),
  }),
  instruction: "Extract the job title, company name, and application URL from this job posting",
});

Job Scout creates Streak boxes with intelligent field mapping:
- Box Name: Company name (e.g., "Mozilla", "Microsoft")
- Job Title: Full job title in custom field
- Source: Mapped to pipeline tags (HiringCafe → 9011, WorkAtAStartup → 9014, etc.)
- Location: Mapped to dropdown options (Remote → 9007, Austin → 9001, etc.)
- Website: Direct link to job posting
The system automatically maps job locations and sources to the correct dropdown/tag keys in your Streak pipeline.
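The mapping described above can be sketched as a pure function that turns a job posting into a box draft. This is an illustration under assumptions, not the project's Streak client: `MappedJob`, `FieldKeys`, `BoxDraft`, and `toBoxDraft` are hypothetical names, the tag and dropdown keys are the example values from the list above (a real pipeline has its own), and the field keys would come from the `STREAK_FIELD_*` variables. No Streak API call is made here.

```typescript
// Example keys from the list above; real pipelines use their own values.
const SOURCE_KEYS = new Map<string, string>([
  ["HiringCafe", "9011"],
  ["WorkAtAStartup", "9014"],
]);
const LOCATION_KEYS = new Map<string, string>([
  ["Remote", "9007"],
  ["Austin", "9001"],
]);

// Hypothetical posting shape used only for this sketch.
interface MappedJob {
  company: string;
  title: string;
  source: string;
  location: string;
  url: string;
}

// Custom field keys, e.g. read from STREAK_FIELD_JOB_TITLE and friends.
interface FieldKeys {
  jobTitle: string;
  source: string;
  location: string;
  website: string;
}

// Hypothetical intermediate value: box name plus field key -> value pairs.
interface BoxDraft {
  name: string;
  fields: Record<string, string>;
}

function toBoxDraft(job: MappedJob, fieldKeys: FieldKeys): BoxDraft {
  const fields: Record<string, string> = {
    [fieldKeys.jobTitle]: job.title,
    [fieldKeys.website]: job.url,
  };

  // Only set tag/dropdown fields when the value has a known key.
  const sourceKey = SOURCE_KEYS.get(job.source);
  if (sourceKey !== undefined) {
    fields[fieldKeys.source] = sourceKey;
  }
  const locationKey = LOCATION_KEYS.get(job.location);
  if (locationKey !== undefined) {
    fields[fieldKeys.location] = locationKey;
  }

  // The box is named after the company, as described above.
  return { name: job.company, fields };
}
```

Leaving unmapped sources and locations unset, rather than guessing a key, is a design choice in this sketch; it keeps an unknown location from being written as the wrong dropdown option.
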
# Development mode with hot reload
npm run dev
# Build for production
npm run build
# Clean build artifacts
npm run clean

src/
├── index.ts              # Main CLI entry point
├── types/                # TypeScript type definitions
├── config/               # Configuration management with zod validation
├── services/             # External service integrations
│   ├── browserbase/      # Browserbase SDK client with session management
│   ├── stagehand/        # AI-powered browser automation client
│   └── streak/           # Streak CRM API client with field mapping
├── scrapers/             # Job board scrapers
│   └── hiringcafe/       # Hiring Cafe scraper with AI automation
└── utils/                # Utility functions (logging, hashing, ID generation)

- Project foundation with TypeScript and CLI
- AI-powered browser automation with OpenAI GPT-4o-mini and Browserbase Stagehand
- Streak API v2 integration with field mapping
- Job posting data model and types
- Configuration validation with zod
- Testing tools for API connectivity
- Hiring Cafe scraper with intelligent tab management and URL extraction
- End-to-end workflow from job search to Streak box creation
- Real browser automation testing (plan limit reached)
- Additional job source implementation
- Implement WorkAtAStartup and YC Jobs scrapers
- Add deduplication and local caching
- Enhanced logging with screenshots and HTML snippets
- Multi-source orchestration and parallel processing
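
The deduplication item above could take roughly the following form. This is a sketch under assumptions rather than the planned implementation: `JobRef`, `dedupeKey`, and `dedupeJobs` are hypothetical names, and identifying a posting by its normalized company, title, and URL is one possible choice of key.

```typescript
// Minimal fields needed to identify a posting in this sketch.
interface JobRef {
  company: string;
  title: string;
  url: string;
}

// Build a stable key: trim, lowercase, and collapse whitespace in the text fields;
// trim the URL and drop trailing slashes.
function dedupeKey(job: JobRef): string {
  const normalize = (text: string): string =>
    text.trim().toLowerCase().replace(/\s+/g, " ");
  const url = job.url.trim().replace(/\/+$/, "");
  return [normalize(job.company), normalize(job.title), url].join("|");
}

// Return only postings whose key has not been seen before.
// Passing a shared `seen` set lets callers carry state across sources or runs.
function dedupeJobs<T extends JobRef>(jobs: T[], seen: Set<string> = new Set()): T[] {
  const fresh: T[] = [];
  for (const job of jobs) {
    const key = dedupeKey(job);
    if (seen.has(key)) {
      continue;
    }
    seen.add(key);
    fresh.push(job);
  }
  return fresh;
}
```

Persisting the `seen` set between runs (for example, in a local file) would cover the local caching half of that item; how the project intends to store it is not stated here.
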
- OpenAI GPT-4o-mini: AI model for natural language browser automation
- Browserbase Stagehand: AI-powered web automation SDK
- Natural Language Processing: Intelligent data extraction and filtering
- Node.js 18+: Runtime environment
- TypeScript: Type-safe development with strict mode
- Zod: Runtime type validation and schema definition
- Commander.js: CLI framework with comprehensive options
- Browserbase API: Headless browser automation and session management
- Streak CRM API v2: Customer relationship management integration
- OpenAI API: AI model access for intelligent automation
- Structured Job Data: Normalized job posting schema
- Field Mapping: Intelligent CRM field mapping with dropdown/tag support
- Session Management: Browser session lifecycle and replay capabilities
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
MIT