|
| 1 | +--- |
| 2 | +title: Overview |
| 3 | +sidebar_label: Overview |
| 4 | +--- |
| 5 | + |
| 6 | +import Tabs from '@theme/Tabs'; |
| 7 | +import TabItem from '@theme/TabItem'; |
| 8 | +import CodeBlock from '@theme/CodeBlock'; |
| 9 | + |
| 10 | +The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python. |
| 11 | + |
| 12 | +```py |
| 13 | +from apify import Actor |
| 14 | +from bs4 import BeautifulSoup |
| 15 | +import requests |
| 16 | + |
| 17 | +async def main(): |
| 18 | +    async with Actor: |
| 19 | +        actor_input = await Actor.get_input() |
| 20 | +        response = requests.get(actor_input['url']) |
| 21 | +        soup = BeautifulSoup(response.content, 'html.parser') |
| 22 | +        await Actor.push_data({'url': actor_input['url'], 'title': soup.title.string}) |
| 23 | +``` |
| 24 | + |
| 25 | +## Requirements |
| 26 | + |
| 27 | +The Apify SDK requires Python version 3.10 or above to run Python Actors locally. |
| 28 | + |
| 29 | +## Installation |
| 30 | + |
| 31 | +The Apify SDK for Python is available as the [`apify`](https://pypi.org/project/apify/) package on PyPI. To install it, run: |
| 32 | + |
| 33 | +```bash |
| 34 | +pip install apify |
| 35 | +``` |
| 36 | + |
| 37 | +When you create an Actor using the Apify CLI, the Apify SDK for Python is installed for you automatically. |
| 38 | + |
| 39 | +If you are not developing Apify Actors and you just need to access the Apify API from Python, |
| 40 | +consider using the [Apify API client for Python](/api/client/python) directly. |
| 41 | + |
| 42 | +## Quick start |
| 43 | + |
| 44 | +### Creating Actors |
| 45 | + |
| 46 | +To create and run Actors in Apify Console, refer to the [Console documentation](/platform/actors/development/quick-start/web-ide). |
| 47 | + |
| 48 | +To create a new Apify Actor on your computer, you can use the [Apify CLI](/cli), and select one of the [Python Actor templates](https://apify.com/templates?category=python). |
| 49 | + |
| 50 | +For example, to create an Actor from the "Getting started with Python" template, use the [`apify create` command](/cli/docs/reference#apify-create-actorname): |
| 51 | + |
| 52 | +```bash |
| 53 | +apify create my-first-actor --template python-start |
| 54 | +``` |
| 55 | + |
| 56 | +This will create a new folder called `my-first-actor`, download and extract the "Getting started with Python" Actor template there, create a virtual environment in `my-first-actor/.venv`, and install the Actor dependencies in it. |
| 57 | + |
| 58 | + |
| 59 | + |
| 60 | +#### Running the Actor |
| 61 | + |
| 62 | +To run the Actor, you can use the [`apify run` command](/cli/docs/reference#apify-run): |
| 63 | + |
| 64 | +```bash |
| 65 | +cd my-first-actor |
| 66 | +apify run |
| 67 | +``` |
| 68 | + |
| 69 | +This command: |
| 70 | + |
| 71 | +- Activates the virtual environment in `.venv` (if no other virtual environment is activated yet) |
| 72 | +- Starts the Actor with the appropriate environment variables for local running |
| 73 | +- Configures it to use local storages from the `storage` folder |
| 74 | + |
| 75 | +The Actor input, for example, will be in `storage/key_value_stores/default/INPUT.json`. |
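For local runs, it can be convenient to prepare the input file from a script. The following is a minimal sketch using only the Python standard library. `write_local_input` is a hypothetical helper, not part of the Apify SDK or CLI, and the `url` key is only an example matching the field read by the sample Actor at the top of this page.

```python
import json
from pathlib import Path


def write_local_input(actor_dir: Path, payload: dict) -> Path:
    """Write `payload` as the Actor input for local runs and return the file path."""
    # Location taken from the text above: storage/key_value_stores/default/INPUT.json
    input_path = actor_dir / "storage" / "key_value_stores" / "default" / "INPUT.json"
    # Create the storage folders if this is the first local run.
    input_path.parent.mkdir(parents=True, exist_ok=True)
    input_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return input_path
```

Calling `write_local_input(Path("my-first-actor"), {"url": "https://example.com"})` before `apify run` would place the input where the locally running Actor is expected to read it.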
| 76 | + |
| 77 | +## Actor structure |
| 78 | + |
| 79 | +All Python Actor templates follow the same structure. |
| 80 | + |
| 81 | +The `.actor` directory contains the [Actor configuration](/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform. |
| 82 | + |
| 83 | +The Actor's runtime dependencies are specified in the `requirements.txt` file, which follows the [standard requirements file format](https://pip.pypa.io/en/stable/reference/requirements-file-format/). |
| 84 | + |
| 85 | +The Actor's source code is in the `src` folder. This folder contains two important files: |
| 86 | + |
| 87 | +- `main.py`, which contains the main function of the Actor |
| 88 | +- `__main__.py`, which is the entrypoint of the Actor package. It sets up the Actor [logger](../concepts/logging) and executes the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run). |
| 89 | + |
| 90 | +<Tabs> |
| 91 | + <TabItem value="main.py" label="main.py" default> |
| 92 | + <CodeBlock language="python">{ |
| 93 | +`from apify import Actor |
| 94 | +${''} |
| 95 | +async def main(): |
| 96 | +    async with Actor: |
| 97 | +        Actor.log.info('Actor input: %s', await Actor.get_input()) |
| 98 | +        await Actor.set_value('OUTPUT', 'Hello, world!')` |
| 99 | + }</CodeBlock> |
| 100 | + </TabItem> |
| 101 | + <TabItem value="__main__.py" label="__main__.py"> |
| 102 | + <CodeBlock language="python">{ |
| 103 | +`import asyncio |
| 104 | +import logging |
| 105 | +${''} |
| 106 | +from apify.log import ActorLogFormatter |
| 107 | +${''} |
| 108 | +from .main import main |
| 109 | +${''} |
| 110 | +handler = logging.StreamHandler() |
| 111 | +handler.setFormatter(ActorLogFormatter()) |
| 112 | +${''} |
| 113 | +apify_logger = logging.getLogger('apify') |
| 114 | +apify_logger.setLevel(logging.DEBUG) |
| 115 | +apify_logger.addHandler(handler) |
| 116 | +${''} |
| 117 | +asyncio.run(main())` |
| 118 | + }</CodeBlock> |
| 119 | + </TabItem> |
| 120 | +</Tabs> |
| 121 | + |
| 122 | +If you modify the Actor structure, make sure that your Actor can still be executed as a module via `python -m src`, because that is the command `apify run` executes in the Apify CLI. |
| 123 | +We recommend keeping the entrypoint for the Actor in the `src/__main__.py` file. |
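One way to check that a modified layout is still runnable as a module is to run `python -m src` from the Actor directory yourself. The sketch below does this from Python using only the standard library. `make_demo_package` and `run_package_as_module` are hypothetical helpers, and the demo package is a stand-in that only prints a message, so the example does not require the Apify SDK to be installed.

```python
import subprocess
import sys
from pathlib import Path


def make_demo_package(project_dir: Path) -> None:
    """Create a stand-in `src` package with `main.py` and `__main__.py`."""
    src = project_dir / "src"
    src.mkdir(parents=True)
    (src / "__init__.py").write_text("")
    (src / "main.py").write_text(
        "async def main():\n"
        "    print('Hello from main()')\n"
    )
    # Same shape as the template entrypoint, minus the logging setup.
    (src / "__main__.py").write_text(
        "import asyncio\n"
        "\n"
        "from .main import main\n"
        "\n"
        "asyncio.run(main())\n"
    )


def run_package_as_module(project_dir: Path, package: str = "src") -> subprocess.CompletedProcess:
    """Run `python -m <package>` from `project_dir` and capture its output."""
    return subprocess.run(
        [sys.executable, "-m", package],
        cwd=project_dir,
        capture_output=True,
        text=True,
        check=False,
    )
```

A zero return code from `run_package_as_module(Path("my-first-actor"))` suggests the package is still executable as a module. A non-zero code, with the error in `stderr`, usually points to a missing or misplaced `__main__.py`.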
| 124 | + |
| 125 | +## Adding dependencies |
| 126 | + |
| 127 | +First, add the dependencies to the [`requirements.txt`](https://pip.pypa.io/en/stable/reference/requirements-file-format/) file in the Actor source folder. |
| 128 | + |
| 129 | +Then activate the virtual environment in `.venv`: |
| 130 | + |
| 131 | +<Tabs groupId="operating-systems"> |
| 132 | + <TabItem value="unix" label="Linux / macOS" default> |
| 133 | + <CodeBlock language="bash">{ |
| 134 | +`source .venv/bin/activate` |
| 135 | + }</CodeBlock> |
| 136 | + </TabItem> |
| 137 | + <TabItem value="win" label="Windows"> |
| 138 | + <CodeBlock language="powershell">{ |
| 139 | +`.venv\\Scripts\\activate` |
| 140 | + }</CodeBlock> |
| 141 | + </TabItem> |
| 142 | +</Tabs> |
| 143 | + |
| 144 | +Finally, install the dependencies: |
| 145 | + |
| 146 | +```bash |
| 147 | +python -m pip install -r requirements.txt |
| 148 | +``` |
| 149 | + |
| 150 | +## Next steps |
| 151 | + |
| 152 | +### Guides |
| 153 | + |
| 154 | +To see how you can integrate the Apify SDK with some of the most popular web scraping libraries, check out our guides for working with: |
| 155 | + |
| 156 | +- [Requests or HTTPX](../guides/requests-and-httpx) |
| 157 | +- [Beautiful Soup](../guides/beautiful-soup) |
| 158 | +- [Playwright](../guides/playwright) |
| 159 | +- [Selenium](../guides/selenium) |
| 160 | +- [Scrapy](../guides/scrapy) |
| 161 | + |
| 162 | +### Usage concepts |
| 163 | + |
| 164 | +To learn more about the features of the Apify SDK and how to use them, check out the Usage Concepts section in the sidebar, especially the guides for: |
| 165 | + |
| 166 | +- [Actor lifecycle](../concepts/actor-lifecycle) |
| 167 | +- [Working with storages](../concepts/storages) |
| 168 | +- [Handling Actor events](../concepts/actor-events) |
| 169 | +- [How to use proxies](../concepts/proxy-management) |