|
| 1 | +# ScrapeGraph Python SDK |
| 2 | + |
| 3 | +The official Python SDK for the ScrapeGraph AI API, a web scraping and data extraction service. |
| 4 | + |
| 5 | +## Installation |
| 6 | + |
| 7 | +Install the package using pip: |
| 8 | +```bash |
| 9 | +pip install scrapegraph-py |
| 10 | +``` |
| 11 | + |
| 12 | +## Authentication |
| 13 | + |
| 14 | +To use the ScrapeGraph API, you'll need an API key. You can manage this in two ways: |
| 15 | + |
| 16 | +1. Environment variable: |
| 17 | +```bash |
| 18 | +export SCRAPEGRAPH_API_KEY="your-api-key-here" |
| 19 | +``` |
| 20 | + |
| 21 | +2. `.env` file: |
| 22 | +```plaintext |
| 23 | +SCRAPEGRAPH_API_KEY="your-api-key-here" |
| 24 | +``` |
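
A minimal sketch of reading the key from either location in Python, using only the standard library. `load_api_key` is a hypothetical helper, not part of the SDK: it checks the environment variable first and falls back to a simple `KEY=value` parse of a `.env` file. Dedicated packages such as `python-dotenv` handle more `.env` syntax than this parser does.

```python
import os
from pathlib import Path


def load_api_key(env_path=".env", name="SCRAPEGRAPH_API_KEY"):
    """Return the API key from the environment, else from a .env file.

    Hypothetical helper for illustration; not part of scrapegraph_py.
    """
    value = os.environ.get(name)
    if value:
        return value

    path = Path(env_path)
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, raw = line.partition("=")
            if key.strip() == name:
                # Drop optional surrounding quotes
                return raw.strip().strip("\"'")

    raise RuntimeError(f"{name} is not set in the environment or in {env_path}")
```

Calling `api_key = load_api_key()` once at the top of a script would supply the `api_key` variable that the usage examples pass to each function.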
| 25 | + |
| 26 | +## Features |
| 27 | + |
| 28 | +The SDK provides four main features: |
| 29 | + |
| 30 | +1. Web Scraping (basic and structured) |
| 31 | +2. Credits checking |
| 32 | +3. Feedback submission |
| 33 | +4. API status checking |
| 34 | + |
| 35 | +## Usage |
| 36 | + |
| 37 | + |
| 38 | +### Structured Data Extraction |
| 39 | + |
| 40 | +To extract structured data, define a Pydantic schema and pass it to `scrape`: |
| 41 | + |
| 42 | +```python |
| 43 | +from pydantic import BaseModel, Field |
| 44 | +from scrapegraph_py import scrape |
| 45 | + |
| 46 | +class CompanyInfoSchema(BaseModel): |
| 47 | + company_name: str = Field(description="The name of the company") |
| 48 | + description: str = Field(description="A description of the company") |
| 49 | + main_products: list[str] = Field(description="The main products of the company") |
| 50 | + |
| 51 | +# Scrape with schema; api_key holds the value of SCRAPEGRAPH_API_KEY |
| 52 | +result = scrape( |
| 53 | + api_key=api_key, |
| 54 | + url="https://scrapegraphai.com/", |
| 55 | + prompt="What does the company do?", |
| 56 | + schema=CompanyInfoSchema |
| 57 | +) |
| 58 | +print(result) |
| 59 | +``` |
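
The README does not state what `scrape` returns. As a sketch only, assuming the result arrives as a JSON string whose top-level object uses the schema's field names (both are assumptions), it could be decoded and checked with the standard library. `parse_company_info` is a hypothetical helper, not part of the SDK.

```python
import json

# Field names taken from CompanyInfoSchema above
EXPECTED_FIELDS = {"company_name", "description", "main_products"}


def parse_company_info(raw):
    """Decode a JSON string and confirm the schema's fields are present.

    Hypothetical helper; assumes the result is a JSON object in a string.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = EXPECTED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data
```

If the result turns out to be a dictionary or a model instance already, this step is unnecessary.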
| 60 | + |
| 61 | +### Check Credits |
| 62 | + |
| 63 | +Monitor your API usage: |
| 64 | + |
| 65 | +```python |
| 66 | +from scrapegraph_py import credits |
| 67 | + |
| 68 | +response = credits(api_key) |
| 69 | +print(response) |
| 70 | +``` |
| 71 | + |
| 72 | +### Provide Feedback and Check Status |
| 73 | + |
| 74 | +You can provide feedback on scraping results and check the API status: |
| 75 | + |
| 76 | +```python |
| 77 | +from scrapegraph_py import feedback, status |
| 78 | + |
| 79 | +# Check API status |
| 80 | +status_response = status(api_key) |
| 81 | +print(f"API Status: {status_response}") |
| 82 | + |
| 83 | +# Submit feedback |
| 84 | +feedback_response = feedback( |
| 85 | + api_key=api_key, |
| 86 | + request_id="your-request-id", # UUID from your scraping request |
| 87 | + rating=5, # Rating from 1-5 |
| 88 | + message="Great results!" |
| 89 | +) |
| 90 | +print(f"Feedback Response: {feedback_response}") |
| 91 | +``` |
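
The comments above describe `request_id` as a UUID and `rating` as a value from 1 to 5. A minimal sketch of checking both locally before calling `feedback` follows; `validate_feedback_args` is a hypothetical helper, not part of the SDK, and the API may apply its own validation.

```python
import uuid


def validate_feedback_args(request_id, rating):
    """Check feedback inputs locally before sending them.

    Hypothetical helper for illustration; not part of scrapegraph_py.
    """
    try:
        # uuid.UUID raises ValueError for strings that are not valid UUIDs
        uuid.UUID(str(request_id))
    except ValueError:
        raise ValueError(f"request_id is not a valid UUID: {request_id!r}") from None

    # bool is a subclass of int, so reject it explicitly
    if isinstance(rating, bool) or not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError(f"rating must be an integer from 1 to 5, got {rating!r}")
```

The placeholder string `"your-request-id"` would fail this check, which is a quick way to catch an example value left in by mistake.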
| 92 | + |
| 93 | +## Development |
| 94 | + |
| 95 | +### Requirements |
| 96 | + |
| 97 | +- Python 3.9+ |
| 98 | +- [Rye](https://rye-up.com/) for dependency management (optional) |
| 99 | + |
| 100 | +### Project Structure |
| 101 | + |
| 102 | +``` |
| 103 | +scrapegraph_py/ |
| 104 | +├── __init__.py |
| 105 | +├── credits.py # Credits checking functionality |
| 106 | +├── scrape.py # Core scraping functionality |
| 107 | +└── feedback.py # Feedback submission functionality |
| 108 | +
|
| 109 | +examples/ |
| 110 | +├── credits_example.py |
| 111 | +├── feedback_example.py |
| 112 | +├── scrape_example.py |
| 113 | +└── scrape_schema_example.py |
| 114 | +
|
| 115 | +tests/ |
| 116 | +├── test_credits.py |
| 117 | +├── test_feedback.py |
| 118 | +└── test_scrape.py |
| 119 | +``` |
| 120 | + |
| 121 | +### Setting up the Development Environment |
| 122 | + |
| 123 | +1. Clone the repository: |
| 124 | +```bash |
| 125 | +git clone https://github.com/yourusername/scrapegraph-py.git |
| 126 | +cd scrapegraph-py |
| 127 | +``` |
| 128 | + |
| 129 | +2. Install dependencies: |
| 130 | +```bash |
| 131 | +# If using Rye |
| 132 | +rye sync |
| 133 | + |
| 134 | +# If using pip |
| 135 | +pip install -r requirements-dev.lock |
| 136 | +``` |
| 137 | + |
| 138 | +3. Create a `.env` file in the root directory: |
| 139 | +```plaintext |
| 140 | +SCRAPEGRAPH_API_KEY="your-api-key-here" |
| 141 | +``` |
| 142 | + |
| 143 | +## License |
| 144 | + |
| 145 | +This project is licensed under the MIT License. |
| 146 | + |
| 147 | +## Support |
| 148 | + |
| 149 | +For support: |
| 150 | +- Visit [ScrapeGraph AI](https://scrapegraphai.com/) |
| 151 | +- Contact our support team |
| 152 | +- Check the examples in the `examples/` directory |
1 | 153 |
|
2 | | -Official SDK for ScrapeGraphAI API. |
|