|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Before completing any task |
| 6 | + |
| 7 | +Always run these commands before committing or saying a task is done: |
| 8 | + |
| 9 | +```bash |
| 10 | +bun run format |
| 11 | +bun run lint |
| 12 | +bunx tsc --noEmit |
| 13 | +bun run build |
| 14 | +bun test |
| 15 | +``` |
| 16 | + |
| 17 | +No exceptions. |
| 18 | + |
| 19 | +## Project Overview |
| 20 | + |
| 21 | +**scrapegraph-js** is the official JavaScript/TypeScript SDK for the ScrapeGraph AI API. It provides a TypeScript client for intelligent web scraping powered by AI. |
| 22 | + |
| 23 | +## Repository Structure |
| 24 | + |
| 25 | +``` |
| 26 | +scrapegraph-js/ |
| 27 | +├── src/ # TypeScript SDK source |
| 28 | +├── tests/ # Test suite |
| 29 | +├── examples/ # Usage examples |
| 30 | +├── scripts/ # Development utilities |
| 31 | +└── .github/workflows/ # CI/CD |
| 32 | +``` |
| 33 | + |
| 34 | +## Tech Stack |
| 35 | + |
| 36 | +- **Language**: TypeScript (Node.js 22+) |
| 37 | +- **Runtime**: Bun |
| 38 | +- **Core Dependencies**: zod (validation) |
| 39 | +- **Testing**: Bun test |
| 40 | +- **Code Quality**: Biome (lint + format) |
| 41 | +- **Build**: tsup |
| 42 | + |
| 43 | +## Commands |
| 44 | + |
| 45 | +```bash |
| 46 | +# Install |
| 47 | +bun install |
| 48 | + |
| 49 | +# Dev (watch mode) |
| 50 | +bun run dev |
| 51 | + |
| 52 | +# Test |
| 53 | +bun test # unit tests |
| 54 | +bun run test:integration # integration tests |
| 55 | + |
| 56 | +# Format |
| 57 | +bun run format |
| 58 | + |
| 59 | +# Lint |
| 60 | +bun run lint |
| 61 | + |
| 62 | +# Type check |
| 63 | +bunx tsc --noEmit |
| 64 | + |
| 65 | +# Build |
| 66 | +bun run build |
| 67 | + |
| 68 | +# Playground (loads .env) |
| 69 | +bun run playground |
| 70 | +``` |
| 71 | + |
| 72 | +## Architecture |
| 73 | + |
| 74 | +**Core Components:** |
| 75 | + |
| 76 | +1. **Client** (`src/scrapegraphai.ts`): |
| 77 | + - `ScrapeGraphAI()` - Factory function returning namespaced client |
| 78 | + - Handles all API communication |
| 79 | + |
| 80 | +2. **Types** (`src/types.ts`): |
| 81 | + - Request/response types for all endpoints |
| 82 | + - Zod schema inference |
| 83 | + |
| 84 | +3. **Schemas** (`src/schemas.ts`): |
| 85 | + - Zod validation schemas |
| 86 | + |
| 87 | +4. **Config** (`src/env.ts`): |
| 88 | + - Environment variable handling |
| 89 | + |
| 90 | +## API Methods |
| 91 | + |
| 92 | +| Method | Purpose | |
| 93 | +|--------|---------| |
| 94 | +| `sgai.scrape()` | AI data extraction from URL | |
| 95 | +| `sgai.extract()` | Extract from raw HTML/text | |
| 96 | +| `sgai.search()` | Web search + extraction | |
| 97 | +| `sgai.crawl.start()` | Start crawl job | |
| 98 | +| `sgai.crawl.get()` | Get crawl status | |
| 99 | +| `sgai.monitor.create()` | Create monitoring job | |
| 100 | +| `sgai.monitor.get()` | Get monitor status | |
| 101 | +| `sgai.monitor.update()` | Update monitor config | |
| 102 | +| `sgai.monitor.delete()` | Delete monitor | |
| 103 | +| `sgai.credits()` | Check API credits | |
| 104 | +| `sgai.healthy()` | Health check | |
| 105 | +| `sgai.history.list()` | List request history | |
| 106 | +| `sgai.history.get()` | Get specific request | |
| 107 | + |
| 108 | +## Adding a New Endpoint |
| 109 | + |
| 110 | +1. Add types in `src/types.ts` |
| 111 | +2. Add Zod schema in `src/schemas.ts` |
| 112 | +3. Add function in `src/scrapegraphai.ts` |
| 113 | +4. Wire into `ScrapeGraphAI()` client object |
| 114 | +5. Export types in `src/index.ts` |
| 115 | +6. Add tests in `tests/` |
| 116 | +7. Add example in `examples/` |
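The first four steps above can be sketched for a made-up `ping` endpoint. Everything named here (`PingRequest`, `ApiResult`, `validatePingRequest`, `Transport`, `ping`, `createClient`, the `/ping` path, and the `"error"` status) is hypothetical and only illustrates the shape of the workflow; the real SDK validates with Zod in `src/schemas.ts`, while this sketch uses a hand-written check so it runs without dependencies or network access.

```typescript
// Hypothetical sketch: none of these names come from the real SDK.

// Step 1: request/response types (would live in src/types.ts).
interface PingRequest {
  message: string;
}

// Assumption: results carry a status flag and optional data, mirroring the
// `res.status === "success"` check in the Usage section. "error" is assumed.
interface ApiResult<T> {
  status: "success" | "error";
  data?: T;
  error?: string;
}

// Step 2: validation (a stand-in for a Zod schema in src/schemas.ts).
function validatePingRequest(input: unknown): PingRequest {
  if (typeof input !== "object" || input === null) {
    throw new Error("request must be an object");
  }
  const message = (input as { message?: unknown }).message;
  if (typeof message !== "string" || message.length === 0) {
    throw new Error("message must be a non-empty string");
  }
  return { message };
}

// The transport is injected so the sketch can run against a fake backend.
type Transport = (path: string, body: unknown) => Promise<unknown>;

// Step 3: the endpoint function (would live in src/scrapegraphai.ts).
async function ping(
  transport: Transport,
  input: unknown,
): Promise<ApiResult<{ echo: string }>> {
  try {
    const request = validatePingRequest(input);
    const data = (await transport("/ping", request)) as { echo: string };
    return { status: "success", data };
  } catch (err) {
    // Validation and transport failures are both reported as an error result.
    return {
      status: "error",
      error: err instanceof Error ? err.message : String(err),
    };
  }
}

// Step 4: wire the function into the client object returned by the factory.
function createClient(transport: Transport) {
  return {
    ping: (input: unknown) => ping(transport, input),
  };
}
```

Injecting the transport is a design choice of this sketch, not something the repository is known to do; it keeps the endpoint function testable without calling the live API.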
| 117 | + |
| 118 | +## Environment Variables |
| 119 | + |
| 120 | +- `SGAI_API_KEY` - API key for authentication |
| 121 | +- `SGAI_DEBUG` - Enable debug logging (optional) |
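As a rough illustration of how these two variables might be read, here is a minimal sketch. It is not the contents of `src/env.ts`: the names `SdkConfig`, `EnvSource`, and `loadConfig` are hypothetical, and the rule that `1` or `true` enables debug logging is an assumption, since the accepted values for `SGAI_DEBUG` are not stated here.

```typescript
// Hypothetical sketch of env-based configuration; not the actual src/env.ts.
interface SdkConfig {
  apiKey: string;
  debug: boolean;
}

// Any string map works as a source, which keeps the function easy to test.
type EnvSource = Record<string, string | undefined>;

function loadConfig(env: EnvSource): SdkConfig {
  // SGAI_API_KEY is required; surrounding whitespace is ignored.
  const apiKey = env["SGAI_API_KEY"]?.trim();
  if (!apiKey) {
    throw new Error("SGAI_API_KEY is not set");
  }

  // Assumption: "1" or "true" (case-insensitive) turns debug logging on.
  const rawDebug = (env["SGAI_DEBUG"] ?? "").trim().toLowerCase();
  const debug = rawDebug === "1" || rawDebug === "true";

  return { apiKey, debug };
}
```

In a Node.js or Bun process this would typically be called as `loadConfig(process.env)`; passing the source explicitly lets tests supply a plain object instead.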
| 122 | + |
| 123 | +## Usage |
| 124 | + |
| 125 | +```typescript |
| 126 | +import { ScrapeGraphAI } from "scrapegraph-js"; |
| 127 | + |
| 128 | +const sgai = ScrapeGraphAI(); // reads SGAI_API_KEY from env |
| 129 | + |
| 130 | +const res = await sgai.scrape({ |
| 131 | + url: "https://example.com", |
| 132 | + prompt: "Extract the main heading", |
| 133 | +}); |
| 134 | + |
| 135 | +if (res.status === "success") { |
| 136 | + console.log(res.data?.result); |
| 137 | +} |
| 138 | +``` |
| 139 | + |
| 140 | +## Links |
| 141 | + |
| 142 | +- [API Docs](https://docs.scrapegraphai.com) |
| 143 | +- [npm](https://www.npmjs.com/package/scrapegraph-js) |