|
2 | 2 |
|
3 | 3 | ## Introduction |
4 | 4 |
|
5 | | -Web Cat is a serverless Python-based API hosted on Azure Functions, designed to scrape and process website content responsibly. Leveraging the readability library and BeautifulSoup, `Web Cat` extracts the main body of text and related images from web pages, making it easy to integrate website content ChatGPT through the use of Custom GPTs. |
| 5 | +Web Cat is a collection of Python-based APIs designed to enhance AI models with web search and content extraction capabilities. The project includes: |
6 | 6 |
|
7 | | -Using the `@Web Cat` GPT enhances ideation by seamlessly integrating web content into conversations, eliminating the need for manual copy-pasting or suffering through out dated data issues. |
| 7 | +1. A serverless Python-based API hosted on Azure Functions |
| 8 | +2. A Model Context Protocol (MCP) server that provides web search capabilities for AI models |
| 9 | + |
| 10 | +Both implementations are designed to responsibly scrape and process website content, making it easy to integrate web content into AI applications like ChatGPT through Custom GPTs. |
| 11 | + |
| 12 | +## Components |
| 13 | + |
| 14 | +### Azure Functions API |
| 15 | + |
| 16 | +The Azure Functions API leverages the readability library and BeautifulSoup to extract the main body of text and related images from web pages. |
| 17 | + |
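As a rough illustration of the extraction step described for the Azure Functions API, the sketch below pulls visible text and image URLs out of an HTML string. It deliberately swaps in the standard library's `html.parser` so it runs without extra dependencies; it is not the project's code and does not reproduce the readability library's main-content scoring or BeautifulSoup's API. The names `TextAndImageExtractor` and `extract_text_and_images` are hypothetical.

```python
from html.parser import HTMLParser


class TextAndImageExtractor(HTMLParser):
    """Collects visible text and <img> sources, skipping script/style content.

    Hypothetical helper: a stdlib stand-in for the readability + BeautifulSoup
    pipeline named in the README, not the project's implementation.
    """

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.text_parts = []   # visible text fragments, in document order
        self.images = []       # values of <img src="...">
        self._skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_text_and_images(html):
    """Return a dict with the page's visible text and its image URLs."""
    parser = TextAndImageExtractor()
    parser.feed(html)
    parser.close()
    return {"text": " ".join(parser.text_parts), "images": parser.images}


if __name__ == "__main__":
    sample = (
        "<html><head><style>p{}</style></head><body>"
        "<p>Hello <b>world</b></p><img src='a.png'>"
        "<script>var x = 1;</script></body></html>"
    )
    print(extract_text_and_images(sample))
```

A real deployment would presumably first fetch the page and isolate the main article body before this kind of text and image collection; that step is omitted here.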
| 18 | +### MCP Server |
| 19 | + |
| 20 | +The Model Context Protocol (MCP) server is a FastAPI-based implementation that provides web search capabilities with enhanced content extraction. It follows the MCP specification for standardized AI model interactions. |
8 | 21 |
|
9 | 22 | ## Features |
10 | | - - **Content Extraction**: Utilizes the readability library for clean text extraction. |
11 | | - - **Text Processing**: Further processes extracted content for improved usability. |
12 | | - - **Search Functionality**: Integrates with Serper.dev to provide web search capabilities. |
| 23 | + - **Content Extraction**: Utilizes the readability library for clean text extraction |
| 24 | + - **Text Processing**: Further processes extracted content for improved usability |
| 25 | + - **Search Functionality**: Integrates with Serper.dev to provide web search capabilities |
| 26 | + - **MCP Compliance**: Follows standardized Model Context Protocol specifications |
| 27 | + - **Rate Limiting**: Protects the API from abuse with configurable rate limits |
| 28 | + - **API Versioning**: Ensures backward compatibility as the API evolves |
13 | 29 |
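The rate-limiting feature listed above is not specified in detail here, so the following is only a minimal sketch of one common approach: a per-client sliding-window limiter. The class name, the `max_requests` and `window_seconds` parameters, and the injectable `clock` are all assumptions made for illustration, not the project's configuration options.

```python
import time
from collections import deque


class SlidingWindowRateLimiter:
    """Allows at most `max_requests` per client within `window_seconds`.

    Hypothetical sketch; the project's actual limiter may work differently.
    """

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock   # injectable so the limiter can be tested
        self._hits = {}       # client_id -> deque of request timestamps

    def allow(self, client_id):
        """Record a request and return True if it is within the limit."""
        now = self._clock()
        hits = self._hits.setdefault(client_id, deque())
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowRateLimiter(max_requests=2, window_seconds=10)
    print([limiter.allow("client-a") for _ in range(3)])
```

An API handler would typically call `allow()` with a client identifier such as an API key or IP address and respond with HTTP 429 when it returns `False`. This in-memory version only works within a single process; a serverless or multi-instance deployment would need shared storage for the counters.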
|
14 | 30 | ## Getting Started |
15 | 31 |
|
16 | | -### Prerequisites |
| 32 | +For the Azure Functions API: |
| 33 | +- See the `customgpt` directory for specific documentation |
17 | 34 |
|
18 | | -- Azure Functions Core Tools |
19 | | -- Python 3.11 |
20 | | -- An Azure account and subscription |
| 35 | +For the MCP Server: |
| 36 | +- See the `docker` directory for build and deployment instructions |
21 | 37 |
|
22 | | -## Local Development |
| 38 | +## Limitations and Considerations |
| 39 | +- **Text-Based Content**: The APIs are optimized for text and image content and may not accurately represent other multimedia or dynamic web content. |
| 40 | +- **API Keys**: A Serper API key is required for search functionality |
23 | 41 |
|
24 | | -Prepare your local environment by running: |
| 42 | +## Contributing |
25 | 43 |
|
26 | | -```bash |
27 | | -cd src |
28 | | -pip install -r requirements.txt |
29 | | -func start |
30 | | -``` |
| 44 | +Contributions are welcome! Please feel free to submit a Pull Request. |
31 | 45 |
|
32 | | -## Limitations and Considerations |
33 | | -- **Text-Based Content**: The API is optimized for text and image content and may not accurately represent other multimedia or dynamic web content. |
| 46 | +## License |
| 47 | + |
| 48 | +This project is licensed under the terms of the license included in the repository. |
34 | 49 |
|
35 | 50 | ## Usage |
36 | 51 |
|
|