11# ScrapeGraphAI SDK Documentation
22
3- Welcome to the ScrapeGraphAI SDK documentation hub. This directory contains comprehensive documentation for understanding, developing, and maintaining the official Python and JavaScript SDKs for the ScrapeGraph AI API.
3+ Welcome to the ScrapeGraphAI SDK documentation hub. This directory contains comprehensive documentation for understanding, developing, and maintaining the official Python SDK for the ScrapeGraph AI API.
44
55## 📚 Available Documentation
66
77### System Documentation (` system/ ` )
88
99#### [ Project Architecture] ( ./system/project_architecture.md )
1010Complete SDK architecture documentation including:
11- - ** Monorepo Structure** - How Python and JavaScript SDKs are organized
11+ - ** Repository Structure** - How the Python SDK is organized
1212- ** Python SDK Architecture** - Client structure, async/sync support, models
13- - ** JavaScript SDK Architecture** - Function-based API, async design
14- - ** API Endpoints Coverage** - All supported endpoints across SDKs
13+ - ** API Endpoints Coverage** - All supported endpoints
1514- ** Authentication** - API key management and security
1615- ** Testing Strategy** - Unit tests, integration tests, CI/CD
1716- ** Release Process** - Semantic versioning and publishing
@@ -33,11 +32,9 @@ Complete SDK architecture documentation including:
33 32 1. **Read First:**
3433 - [ Main README] ( ../README.md ) - Project overview and features
3534 - [ Python SDK README] ( ../scrapegraph-py/README.md ) - Python SDK guide
36- - [ JavaScript SDK README] ( ../scrapegraph-js/README.md ) - JavaScript SDK guide
3735
38- 2 . ** Choose Your SDK :**
36+ 2 . ** Set Up Development Environment :**
3937
40- ** Python SDK:**
4138 ``` bash
4239 cd scrapegraph-py
4340
@@ -52,35 +49,16 @@ Complete SDK architecture documentation including:
5249 pre-commit install
5350 ```
5451
55- ** JavaScript SDK:**
56- ``` bash
57- cd scrapegraph-js
58-
59- # Install dependencies
60- npm install
61-
62- # Run tests
63- npm test
64- ```
65-
66 52 3. **Run Tests:**
6753
68- ** Python:**
6954 ``` bash
7055 cd scrapegraph-py
7156 pytest tests/ -v
7257 ```
7358
74- ** JavaScript:**
75- ``` bash
76- cd scrapegraph-js
77- npm test
78- ```
79-
80 59 4. **Explore the Codebase:**
8160 - ** Python** : ` scrapegraph_py/client.py ` - Sync client, ` scrapegraph_py/async_client.py ` - Async client
82- - ** JavaScript** : ` src/ ` directory - Individual endpoint modules
83- - ** Examples** : ` scrapegraph-py/examples/ ` and ` scrapegraph-js/examples/ `
61+ - ** Examples** : ` scrapegraph-py/examples/ `
8462
8563---
8664
@@ -90,26 +68,21 @@ Complete SDK architecture documentation including:
9068
9169** ...how to add a new endpoint:**
9270- Read: Python SDK - ` scrapegraph_py/client.py ` , ` scrapegraph_py/async_client.py `
93- - Read: JavaScript SDK - Create new file in ` src/ `
9471- Examples: Look at existing endpoint implementations
9572
9673** ...how authentication works:**
9774- Read: Python SDK - ` scrapegraph_py/client.py ` (initialization)
98- - Read: JavaScript SDK - Each function accepts ` apiKey ` parameter
99- - Both SDKs support ` SGAI_API_KEY ` environment variable
75+ - The SDK supports the `SGAI_API_KEY` environment variable
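
A minimal sketch of the key lookup described above, assuming an explicitly passed key takes precedence over the `SGAI_API_KEY` environment variable. The helper `resolve_api_key` is hypothetical and is not part of `scrapegraph_py`; the SDK's actual behavior is defined in the client initialization in `scrapegraph_py/client.py`.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the API key to use (hypothetical helper, not part of the SDK)."""
    # Allow callers and tests to supply a fake environment instead of os.environ.
    env = os.environ if env is None else env
    # Assumption: an explicitly passed key wins over the environment variable.
    key = explicit_key or env.get("SGAI_API_KEY")
    if not key:
        raise ValueError("No API key: pass one explicitly or set SGAI_API_KEY")
    return key
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque authentication error.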
10076
10177** ...how error handling works:**
10278- Read: Python SDK - ` scrapegraph_py/exceptions.py `
103- - Read: JavaScript SDK - Try/catch blocks in each endpoint
10479
10580** ...how testing works:**
10681- Read: Python SDK - ` tests/ ` directory, ` pytest.ini `
107- - Read: JavaScript SDK - ` test/ ` directory
10882- Run: Follow test commands in README
10983
11084** ...how releases work:**
11185- Read: Python SDK - ` .releaserc.yml ` (semantic-release config)
112- - Read: JavaScript SDK - ` .releaserc ` (semantic-release config)
11386- GitHub Actions: ` .github/workflows/ ` for automated releases
11487
11588---
@@ -118,7 +91,6 @@ Complete SDK architecture documentation including:
11891
11992### Running Tests
12093
121- ** Python SDK:**
12294``` bash
12395cd scrapegraph-py
12496
@@ -132,20 +104,8 @@ pytest tests/test_smartscraper.py -v
132104pytest --cov=scrapegraph_py --cov-report=html tests/
133105```
134106
135- ** JavaScript SDK:**
136- ``` bash
137- cd scrapegraph-js
138-
139- # Run all tests
140- npm test
141-
142- # Run specific test
143- node test/test_smartscraper.js
144- ```
145-
146107### Code Quality
147108
148- ** Python SDK:**
149109``` bash
150110cd scrapegraph-py
151111
@@ -166,20 +126,8 @@ make format
166126make lint
167127```
168128
169- ** JavaScript SDK:**
170- ``` bash
171- cd scrapegraph-js
172-
173- # Format code
174- npm run format
175-
176- # Lint code
177- npm run lint
178- ```
179-
180129### Building & Publishing
181130
182- ** Python SDK:**
183131``` bash
184132cd scrapegraph-py
185133
@@ -190,35 +138,23 @@ python -m build
190138twine upload dist/*
191139```
192140
193- ** JavaScript SDK:**
194- ``` bash
195- cd scrapegraph-js
196-
197- # Build package (if needed)
198- npm run build
199-
200- # Publish to npm (automated via GitHub Actions)
201- npm publish
202- ```
203-
204141---
205142
206143## 📊 SDK Endpoint Reference
207144
208- Both SDKs support the following endpoints:
209-
210- | Endpoint | Python SDK | JavaScript SDK | Purpose |
211- | ----------| -----------| ----------------| ---------|
212- | SmartScraper | ✅ | ✅ | AI-powered data extraction |
213- | SearchScraper | ✅ | ✅ | Multi-website search extraction |
214- | Markdownify | ✅ | ✅ | HTML to Markdown conversion |
215- | Sitemap | ❌ | ✅ | Sitemap URL extraction |
216- | SmartCrawler | ✅ | ✅ | Sitemap generation & crawling |
217- | AgenticScraper | ✅ | ✅ | Browser automation |
218- | Scrape | ✅ | ✅ | Basic HTML extraction |
219- | Scheduled Jobs | ✅ | ✅ | Cron-based job scheduling |
220- | Credits | ✅ | ✅ | Credit balance management |
221- | Feedback | ✅ | ✅ | Rating and feedback |
145+ The SDK supports the following endpoints:
146+
147+ | Endpoint | Python SDK | Purpose |
148+ | ----------| -----------| ---------|
149+ | SmartScraper | ✅ | AI-powered data extraction |
150+ | SearchScraper | ✅ | Multi-website search extraction |
151+ | Markdownify | ✅ | HTML to Markdown conversion |
152+ | SmartCrawler | ✅ | Sitemap generation & crawling |
153+ | AgenticScraper | ✅ | Browser automation |
154+ | Scrape | ✅ | Basic HTML extraction |
155+ | Scheduled Jobs | ✅ | Cron-based job scheduling |
156+ | Credits | ✅ | Credit balance management |
157+ | Feedback | ✅ | Rating and feedback |
222158
223159---
224160
@@ -251,31 +187,6 @@ Both SDKs support the following endpoints:
251187- ` Makefile ` - Common development tasks
252188- ` .releaserc.yml ` - Semantic-release configuration
253189
254- ### JavaScript SDK
255-
256- ** Entry Points:**
257- - ` index.js ` - Main package entry
258- - ` src/ ` - Individual endpoint modules
259- - ` smartScraper.js `
260- - ` searchScraper.js `
261- - ` crawl.js `
262- - ` markdownify.js `
263- - ` sitemap.js `
264- - ` agenticScraper.js `
265- - ` scrape.js `
266- - ` scheduledJobs.js `
267- - ` credits.js `
268- - ` feedback.js `
269- - ` schema.js `
270-
271- ** Utilities:**
272- - ` src/utils/ ` - Helper functions
273-
274- ** Configuration:**
275- - ` package.json ` - Package metadata and scripts
276- - ` eslint.config.js ` - ESLint configuration
277- - ` .prettierrc.json ` - Prettier configuration
278-
279190---
280191
281192## 🧪 Testing
@@ -292,16 +203,6 @@ scrapegraph-py/tests/
292203└── conftest.py # Pytest fixtures
293204```
294205
295- ### JavaScript SDK Test Structure
296-
297- ```
298- scrapegraph-js/test/
299- ├── test_smartscraper.js
300- ├── test_searchscraper.js
301- ├── test_crawl.js
302- └── test_*.js
303- ```
304-
305206### Writing Tests
306207
307208** Python Example:**
@@ -318,24 +219,6 @@ def test_smartscraper_basic():
318219 assert response.request_id is not None
319220```
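
For tests that should run without network access or a real API key, one option is to replace the client with a mock. The sketch below assumes the client exposes a `smartscraper(website_url=..., user_prompt=...)` method and that the response carries a `request_id` attribute, as the example above suggests; `extract_title` is a hypothetical function under test, not part of the SDK.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def extract_title(client, url):
    """Hypothetical code under test: asks the client to extract a page title."""
    return client.smartscraper(website_url=url, user_prompt="Extract title")


def test_extract_title_offline():
    # Stand-in for the real client so the test needs no network or API key.
    fake_client = MagicMock()
    fake_client.smartscraper.return_value = SimpleNamespace(
        request_id="req-123", result={"title": "Example Domain"}
    )

    response = extract_title(fake_client, "https://example.com")

    # Same assertion style as the example above, plus a check on the call itself.
    assert response.request_id is not None
    fake_client.smartscraper.assert_called_once_with(
        website_url="https://example.com", user_prompt="Extract title"
    )
```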
320221
321- ** JavaScript Example:**
322- ``` javascript
323- import { smartScraper } from ' scrapegraph-js' ;
324-
325- (async () => {
326- try {
327- const response = await smartScraper (
328- ' test-key' ,
329- ' https://example.com' ,
330- ' Extract title'
331- );
332- console .log (' Success:' , response .result );
333- } catch (error) {
334- console .error (' Error:' , error);
335- }
336- })();
337- ```
338-
339222---
340223
341224## 🚨 Troubleshooting
@@ -352,14 +235,6 @@ import { smartScraper } from 'scrapegraph-js';
352235 uv sync
353236 ```
354237
355- ** Issue: Module not found in JavaScript SDK**
356- - ** Cause:** Dependencies not installed
357- - ** Solution:**
358- ``` bash
359- cd scrapegraph-js
360- npm install
361- ```
362-
363238** Issue: API key errors**
364239- ** Cause:** Invalid or missing API key
365240- ** Solution:**
@@ -382,18 +257,14 @@ import { smartScraper } from 'scrapegraph-js';
382257### Official Docs
383258- [ ScrapeGraph AI API Documentation] ( https://docs.scrapegraphai.com )
384259- [ Python SDK Documentation] ( https://docs.scrapegraphai.com/sdks/python )
385- - [ JavaScript SDK Documentation] ( https://docs.scrapegraphai.com/sdks/javascript )
386260
387261### Package Repositories
388262- [ PyPI - scrapegraph-py] ( https://pypi.org/project/scrapegraph-py/ )
389- - [ npm - scrapegraph-js] ( https://www.npmjs.com/package/scrapegraph-js )
390263
391264### Development Tools
392265- [ pytest Documentation] ( https://docs.pytest.org/ )
393266- [ Pydantic Documentation] ( https://docs.pydantic.dev/ )
394267- [ uv Documentation] ( https://docs.astral.sh/uv/ )
395- - [ ESLint Documentation] ( https://eslint.org/docs/latest/ )
396- - [ Prettier Documentation] ( https://prettier.io/docs/en/ )
397268
398269---
399270
@@ -426,12 +297,6 @@ import { smartScraper } from 'scrapegraph-js';
426297- ** Type hints** - Use Pydantic models and type annotations
427298- ** Docstrings** - Document public functions and classes
428299
429- ** JavaScript SDK:**
430- - ** Prettier** - Code formatting
431- - ** ESLint** - Linting
432- - ** JSDoc** - Function documentation
433- - ** Async/await** - Use promises for all async operations
434-
435300### Commit Message Format
436301
437302Follow [ Conventional Commits] ( https://www.conventionalcommits.org/ ) :
@@ -479,7 +344,7 @@ This enables automated semantic versioning and changelog generation.
479344
480345## 📅 Release Process
481346
482- Both SDKs use ** semantic-release** for automated versioning and publishing:
347+ The SDK uses ** semantic-release** for automated versioning and publishing:
483348
484349### Release Workflow
485350
@@ -488,10 +353,10 @@ Both SDKs use **semantic-release** for automated versioning and publishing:
488 353 3. **Merge to main** - Pull request approved and merged
489 354 4. **Automated release** - GitHub Actions:
490355 - Determines version bump (major/minor/patch)
491- - Updates version in ` package.json ` / ` pyproject.toml `
356+ - Updates version in ` pyproject.toml `
492357 - Generates CHANGELOG.md
493358 - Creates GitHub release
494- - Publishes to npm / PyPI
359+ - Publishes to PyPI
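
The "determines version bump" step above can be sketched as follows. This is an illustration only, assuming the common Conventional Commits mapping (`fix` to patch, `feat` to minor, `!` or a `BREAKING CHANGE:` footer to major); the rules actually applied are whatever `.releaserc.yml` configures, and the function names here are hypothetical.

```python
import re

# Assumption: common Conventional Commits mapping; .releaserc.yml may differ.
HEADER = re.compile(r"^(?P<type>\w+)(\([^)]*\))?(?P<bang>!)?: ")
ORDER = {None: 0, "patch": 1, "minor": 2, "major": 3}


def bump_for_commit(message):
    """Return 'major', 'minor', 'patch', or None for one commit message."""
    match = HEADER.match(message)
    if match is None:
        return None  # Not a conventional commit: no release impact.
    if match.group("bang") or "BREAKING CHANGE:" in message:
        return "major"
    return {"feat": "minor", "fix": "patch"}.get(match.group("type"))


def bump_for_release(messages):
    """Return the largest bump required by any commit since the last release."""
    bumps = [bump_for_commit(m) for m in messages]
    return max(bumps, key=ORDER.__getitem__, default=None)
```

A release takes the highest bump among its commits, so a single breaking change outweighs any number of fixes.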
495360
496361### Version Bumping Rules
497362
@@ -505,7 +370,6 @@ Both SDKs use **semantic-release** for automated versioning and publishing:
505370
506371- [ Main README] ( ../README.md ) - Project overview
507372- [ Python SDK README] ( ../scrapegraph-py/README.md ) - Python guide
508- - [ JavaScript SDK README] ( ../scrapegraph-js/README.md ) - JavaScript guide
509373- [ Cookbook] ( ../cookbook/ ) - Usage examples
510374- [ API Documentation] ( https://docs.scrapegraphai.com ) - Full API docs
511375
@@ -515,7 +379,7 @@ Both SDKs use **semantic-release** for automated versioning and publishing:
515379
516380For questions or issues:
517 381 1. Check this documentation first
518- 2 . Review SDK-specific READMEs
382+ 2. Review the Python SDK README
519 383 3. Search existing GitHub issues
520 384 4. Create a new issue with:
521385 - SDK version