---
title: Apify
---

This guide demonstrates how to integrate and use [Apify](https://apify.com/actors) Actors within the Agno framework to enhance your AI agents with powerful data collection capabilities.

## What is Apify?

[Apify](https://apify.com/) is a platform that provides:

- Data collection services for AI agents
- A marketplace of ready-to-use Actors (specialized tools) for various data tasks
- Infrastructure to run and monetize your own AI agents

|
7 | 15 | ## Prerequisites |
8 | 16 |
|
9 | | -The following example requires the `apify-client` library and an API token which can be obtained from [Apify](https://apify.com/). |
| 17 | +1. Sign up for an Apify account |
| 18 | +2. Obtain your Apify API token (can be obtained from [Apify](https://docs.apify.com/platform/integrations/api)) |
| 19 | +3. Install the required packages: |
| 20 | + |
| 21 | + |
| 22 | +```bash |
| 23 | +pip install agno apify-client |
| 24 | +``` |
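Instead of passing the key in code, you can export the token as an environment variable before running the examples (`APIFY_TOKEN` is the variable mentioned below; replace the placeholder value with your own token):

```shell
# Placeholder value; substitute your real Apify API token
export APIFY_TOKEN=your_apify_api_token
```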

## Basic Usage

The Agno framework makes it easy to integrate Apify Actors into your agents. Here's a simple example:

```python
from agno.agent import Agent
from agno.tools.apify import ApifyTools

# Create an agent with ApifyTools
agent = Agent(
    tools=[
        ApifyTools(
            api_key="your_apify_api_key"  # Or set the APIFY_TOKEN environment variable
        )
    ],
    show_tool_calls=True,
    markdown=True
)

# Use the agent to get website content
agent.print_response("What information can you find on https://docs.agno.com/introduction ?", markdown=True)
```

## Available Apify Tools

You can integrate any Apify Actor as a tool; the following tools come ready to use:

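Under the hood, each of these tools runs an Actor through the `apify-client` library. The following is a rough sketch of that flow, not the actual `ApifyTools` implementation: the helper names are illustrative, and the input fields assume the `apify/rag-web-browser` Actor.

```python
# Illustrative sketch: running an arbitrary Apify Actor via apify-client.
# `build_run_input` and `run_actor` are hypothetical helpers, not part of
# ApifyTools.

def build_run_input(query: str, max_results: int = 4) -> dict:
    """Build an input payload (fields assumed for apify/rag-web-browser)."""
    return {"query": query, "maxResults": max_results}

def run_actor(token: str, actor_id: str, run_input: dict) -> list:
    """Run an Actor and return its dataset items (needs a valid API token)."""
    from apify_client import ApifyClient  # imported lazily; requires apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

# Example (requires APIFY_TOKEN to be set and network access):
# items = run_actor(os.environ["APIFY_TOKEN"], "apify/rag-web-browser",
#                   build_run_input("agno framework"))
```
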
### 1. RAG Web Browser

The RAG Web Browser Actor is designed specifically for AI and LLM applications. It searches the web for a query or processes a URL, then cleans and formats the content for your agent. This tool is enabled by default.

```python
from agno.agent import Agent
from agno.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools()  # RAG web search is enabled by default
    ],
    show_tool_calls=True,
    markdown=True
)

# Search for information and process the results
agent.print_response("What are the latest developments in large language models?", markdown=True)
```

### 2. Website Content Crawler

This tool uses Apify's Website Content Crawler Actor to extract text content from websites, making it well suited for RAG applications.

```python
from agno.agent import Agent
from agno.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(
            use_website_content_crawler=True  # Disabled by default
        )
    ],
    markdown=True
)

# Ask the agent to process web content
agent.print_response("Summarize the content from https://docs.agno.com/introduction", markdown=True)
```

### 3. Web Scraper

The Web Scraper tool uses Apify's Web Scraper Actor to extract structured data from websites.

```python
from agno.agent import Agent
from agno.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(
            use_web_scraper=True  # Disabled by default
        )
    ],
    show_tool_calls=True
)

# Extract specific elements from a webpage
agent.print_response("Extract the main heading and first paragraph from https://www.example.com", markdown=True)
```

### 4. Instagram Scraper

The Instagram Scraper tool lets your agent extract information from Instagram profiles, hashtags, or places.

```python
from agno.agent import Agent
from agno.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(
            use_instagram_scraper=True  # Enabled by default
        )
    ],
    show_tool_calls=True
)

# Extract information from Instagram
agent.print_response("Find trending posts for the hashtag #AI", markdown=True)
agent.print_response("Get information about the Instagram user 'Instagram'", markdown=True)
```

### 5. Google Places Crawler

This tool extracts business data from Google Maps and Google Places.

```python
from agno.agent import Agent
from agno.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(
            use_google_places_crawler=True  # Enabled by default
        )
    ],
    show_tool_calls=True
)

# Find business information in a specific location
agent.print_response("What are the top-rated restaurants in San Francisco?", markdown=True)
agent.print_response("Find coffee shops in Prague", markdown=True)
```

## Example Scenarios

### RAG Web Browser + Google Places Crawler

This example combines web search with local business data to provide comprehensive information about a topic:

```python
from agno.agent import Agent
from agno.tools.apify import ApifyTools

agent = Agent(
    tools=[
        ApifyTools(
            use_rag_web_search=True,
            use_google_places_crawler=True
        )
    ],
    show_tool_calls=True
)

# Get general information and local businesses
agent.print_response(
    """
    I'm traveling to Tokyo next month.
    1. Research the best time to visit and major attractions
    2. Find highly-rated sushi restaurants near Shinjuku
    Compile a comprehensive travel guide with this information.
    """,
    markdown=True
)
```

## Implementation Details

Below is a simplified reference for the `ApifyTools` class; see the source code for the complete implementation.

```python
from agno.tools import Toolkit
from apify_client import ApifyClient

class ApifyTools(Toolkit):
    def __init__(
        self,
        api_key=None,
        max_results=4,
        use_rag_web_search=True,
        use_website_content_crawler=False,
        use_web_scraper=False,
        use_instagram_scraper=True,
        use_google_places_crawler=True,
    ):
        # Setup code...
        ...

    def rag_web_search(self, query, timeout=45):
        # Implementation...
        ...

    def website_content_crawler(self, urls, timeout=60):
        # Implementation...
        ...

    def web_scraper(self, urls, timeout=60):
        # Implementation...
        ...

    def instagram_scraper(self, search, search_type="user", search_limit=10, timeout=180):
        # Implementation...
        ...

    def google_places_crawler(self, location_query, search_terms=None, max_crawled_places=30, timeout=45):
        # Implementation...
        ...
```
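The `use_*` flags in the constructor typically translate into conditional tool registration. The following is a minimal, self-contained stand-in illustrating that pattern; `MiniToolkit` and `MiniApifyTools` are hypothetical classes, not Agno's actual `Toolkit` implementation.

```python
# Illustrative stand-in for the enable-flag registration pattern.
class MiniToolkit:
    def __init__(self):
        self.tools = {}

    def register(self, fn):
        # Expose a bound method as a callable tool, keyed by its name
        self.tools[fn.__name__] = fn

class MiniApifyTools(MiniToolkit):
    def __init__(self, use_rag_web_search=True, use_web_scraper=False):
        super().__init__()
        # Only register the tools whose flags are enabled
        if use_rag_web_search:
            self.register(self.rag_web_search)
        if use_web_scraper:
            self.register(self.web_scraper)

    def rag_web_search(self, query):
        return f"searching: {query}"

    def web_scraper(self, url):
        return f"scraping: {url}"

toolkit = MiniApifyTools()
print(sorted(toolkit.tools))  # only rag_web_search is registered by default
```

This keeps the agent's tool list small: the model only sees the tools you explicitly enabled, which reduces prompt size and the chance of the wrong tool being called.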

## Toolkit Params

| Parameter                     | Type   | Default | Description                                                   |
| ----------------------------- | ------ | ------- | ------------------------------------------------------------- |
| `api_key`                     | `str`  | `None`  | Apify API key (or set the `APIFY_TOKEN` environment variable) |
| `max_results`                 | `int`  | `4`     | Maximum number of results for web searches                    |
| `use_rag_web_search`          | `bool` | `True`  | Enable the RAG web search tool                                |
| `use_website_content_crawler` | `bool` | `False` | Enable the website content crawler tool                       |
| `use_web_scraper`             | `bool` | `False` | Enable the general web scraper tool                           |
| `use_instagram_scraper`       | `bool` | `True`  | Enable the Instagram scraper tool                             |
| `use_google_places_crawler`   | `bool` | `True`  | Enable the Google Places crawler tool                         |
|
39 | 239 | ## Toolkit Functions |
40 | 240 |
|
41 | | -| Function | Description | |
42 | | -| ------------------------- | ------------------------------------------------------------- | |
43 | | -| `website_content_crawler` | Crawls a website using Apify's website-content-crawler actor. | |
44 | | -| `web_scrapper` | Scrapes a website using Apify's web-scraper actor. | |
| 241 | +| Function | Description | |
| 242 | +| -------------------------- | ---------------------------------------------------------------- | |
| 243 | +| `rag_web_search` | Searches the web for information using the RAG Web Browser actor | |
| 244 | +| `website_content_crawler` | Crawls websites using Apify's website-content-crawler actor | |
| 245 | +| `web_scraper` | Scrapes websites using Apify's web-scraper actor | |
| 246 | +| `instagram_scraper` | Scrapes Instagram profiles, hashtags, or places | |
| 247 | +| `google_places_crawler` | Crawls Google Places for business information | |
45 | 248 |

## Developer Resources

- View [Tools](https://github.com/agno-agi/agno/blob/main/libs/agno/agno/tools/apify.py)
- View [Cookbook](https://github.com/agno-agi/agno/blob/main/cookbook/tools/apify_tools.py)

## Resources

- [Apify Platform Documentation](https://docs.apify.com)
- [Apify Actor Documentation](https://docs.apify.com/actors)
- [Apify Store - Browse available Actors](https://apify.com/store)
- [Agno Framework Documentation](https://docs.agno.com)