Merged
Changes from 3 commits
59 changes: 0 additions & 59 deletions docs/01_overview/01_introduction.mdx

This file was deleted.

66 changes: 0 additions & 66 deletions docs/01_overview/02_running_actors_locally.mdx

This file was deleted.

35 changes: 0 additions & 35 deletions docs/01_overview/03_actor_structure.mdx

This file was deleted.

163 changes: 163 additions & 0 deletions docs/01_overview/overview.mdx
@@ -0,0 +1,163 @@
---
title: Overview
sidebar_label: Overview
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python.

```python
from apify import Actor
from bs4 import BeautifulSoup
import requests

async def main():
    async with Actor:
        # Read the Actor input, fetch the page, and store its title in the default dataset.
        actor_input = await Actor.get_input()
        response = requests.get(actor_input['url'])
        soup = BeautifulSoup(response.content, 'html.parser')
        await Actor.push_data({'url': actor_input['url'], 'title': soup.title.string})
```
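The example above assumes the input is a JSON object with a `url` key. A minimal sketch of defensive input handling, assuming that `Actor.get_input()` can return `None` or an object without that key when no input has been set; `require_url` is a hypothetical helper for illustration, not part of the SDK:

```python
from typing import Any, Optional


def require_url(actor_input: Optional[Any]) -> str:
    """Return the 'url' field of the Actor input, or raise a clear error.

    Hypothetical helper for illustration; not part of the Apify SDK.
    """
    if not isinstance(actor_input, dict):
        raise ValueError('Actor input must be a JSON object with a "url" field')
    url = actor_input.get('url')
    if not isinstance(url, str) or not url:
        raise ValueError('Actor input is missing a non-empty "url" field')
    return url
```

Inside `main()`, this could be called as `url = require_url(await Actor.get_input())` before making the request.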

## Requirements

The Apify SDK requires Python version 3.8 or above to run Python Actors locally.

## Installation

The Apify SDK for Python is available as the [`apify`](https://pypi.org/project/apify/) package on PyPI. To install it, run:

```bash
pip install apify
```

When you create an Actor using the Apify CLI, the Apify SDK for Python is installed for you automatically.

If you are not developing Apify Actors and you just need to access the Apify API from Python,
consider using the [Apify API client for Python](https://docs.apify.com/api/client/python) directly.

## Quick start

### Creating Actors

To create and run Actors through Apify Console, refer to the [Console documentation](https://docs.apify.com/academy/getting-started/creating-actors#choose-your-template).

To create a new Apify Actor on your computer, use the [Apify CLI](https://docs.apify.com/cli) and select one of the [Python Actor templates](https://apify.com/templates?category=python).

For example, to create an Actor from the `python-start` template, use the [`apify create` command](https://docs.apify.com/cli/docs/reference#apify-create-actorname):

```bash
apify create my-first-actor --template python-start
```

This will create a new folder called `my-first-actor`, download and extract the "Getting started with Python" Actor template there, create a virtual environment in `my-first-actor/.venv`, and install the Actor dependencies in it.

### Running the Actor

To run the Actor, you can use the [`apify run` command](https://docs.apify.com/cli/docs/reference#apify-run):

```bash
cd my-first-actor
apify run
```

This activates the virtual environment in `.venv` (unless another virtual environment is already active), starts the Actor with the environment variables needed for a local run, and configures it to use local storage in the `storage` folder.

The Actor input, for example, will be in `storage/key_value_stores/default/INPUT.json`.
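To prepare input for a local run, you can write that file yourself. A minimal sketch using only the standard library; `write_local_input` is a hypothetical helper (not part of the SDK or the CLI), and the path is the one given above:

```python
import json
from pathlib import Path


def write_local_input(actor_dir: str, actor_input: dict) -> Path:
    """Write actor_input to the default key-value store used for local runs.

    Hypothetical helper for illustration; not part of the Apify SDK.
    """
    input_path = Path(actor_dir) / 'storage' / 'key_value_stores' / 'default' / 'INPUT.json'
    # Create the storage folders if the Actor has not been run yet.
    input_path.parent.mkdir(parents=True, exist_ok=True)
    input_path.write_text(json.dumps(actor_input, indent=2), encoding='utf-8')
    return input_path
```

For example, calling `write_local_input('my-first-actor', {'url': 'https://example.com'})` before `apify run` would make that object available to `Actor.get_input()`.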

## Actor structure

All Python Actor templates follow the same structure.

The `.actor` directory contains the [Actor configuration](https://docs.apify.com/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.

The Actor's runtime dependencies are specified in the `requirements.txt` file, which follows the [standard requirements file format](https://pip.pypa.io/en/stable/reference/requirements-file-format/).

The Actor's source code is in the `src` folder. This folder contains two important files:

- `main.py`, which contains the main function of the Actor
- `__main__.py`, which is the entrypoint of the Actor package; it sets up the Actor [logger](../concepts/logging) and executes the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).

<Tabs>
<TabItem value="main.py" label="main.py" default>
<CodeBlock language="python">{
`from apify import Actor
${''}
async def main():
    async with Actor:
        Actor.log.info('Actor input: %s', await Actor.get_input())
        await Actor.set_value('OUTPUT', 'Hello, world!')`
}</CodeBlock>
</TabItem>
    <TabItem value="__main__.py" label="__main__.py">
<CodeBlock language="python">{
`import asyncio
import logging
${''}
from apify.log import ActorLogFormatter
${''}
from .main import main
${''}
handler = logging.StreamHandler()
handler.setFormatter(ActorLogFormatter())
${''}
apify_logger = logging.getLogger('apify')
apify_logger.setLevel(logging.DEBUG)
apify_logger.addHandler(handler)
${''}
asyncio.run(main())`
}</CodeBlock>
</TabItem>
</Tabs>

If you want to modify the Actor structure, you need to make sure that your Actor is executable as a module, via `python -m src`, as that is the command started by `apify run` in the Apify CLI.
We recommend keeping the entrypoint for the Actor in the `src/__main__.py` file.
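
After restructuring an Actor, you can check that it is still executable as a module by running the same command from Python. A minimal sketch using only the standard library; `run_as_module` is a hypothetical helper, and it does not set the environment variables that `apify run` passes, so treat it as a structural check only:

```python
import subprocess
import sys


def run_as_module(actor_dir: str, package: str = 'src') -> int:
    """Run `python -m <package>` inside actor_dir and return the exit code.

    Hypothetical helper for illustration; not part of the Apify SDK or CLI.
    """
    # sys.executable is the interpreter running this script, so an activated
    # virtual environment is reused for the child process.
    completed = subprocess.run([sys.executable, '-m', package], cwd=actor_dir)
    return completed.returncode
```

A non-zero return value from `run_as_module('my-first-actor')` suggests that the package or its `__main__.py` entrypoint cannot be found or failed to run.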

## Adding dependencies

To add dependencies to your Actor, first list them in the [`requirements.txt`](https://pip.pypa.io/en/stable/reference/requirements-file-format/) file in the Actor source folder.

Then activate the virtual environment in `.venv`:

<Tabs groupId="operating-systems">
<TabItem value="unix" label="Linux / macOS" default>
<CodeBlock language="bash">{
`source .venv/bin/activate`
}</CodeBlock>
</TabItem>
<TabItem value="win" label="Windows">
<CodeBlock language="powershell">{
`.venv\\Scripts\\activate`
}</CodeBlock>
</TabItem>
</Tabs>

Then install the dependencies:

```bash
python -m pip install -r requirements.txt
```
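
If you are unsure whether the virtual environment is active before installing, Python itself can tell you. A minimal sketch; `in_virtualenv` is a hypothetical helper that relies on `sys.prefix` differing from `sys.base_prefix` inside an environment created by `venv`:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


if __name__ == '__main__':
    print('virtual environment active:', in_virtualenv())
```

If this reports `False`, the `pip install` above would install the packages outside the Actor's `.venv`.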

## Next steps

### Guides

To see how you can integrate the Apify SDK with some of the most popular web scraping libraries, check out our guides for working with:

- [Requests or HTTPX](../guides/requests-and-httpx)
- [Beautiful Soup](../guides/beautiful-soup)
- [Playwright](../guides/playwright)
- [Selenium](../guides/selenium)
- [Scrapy](../guides/scrapy)

### Usage concepts

To learn more about the features of the Apify SDK and how to use them, check out the Usage Concepts section in the sidebar, especially the guides for:

- [Actor lifecycle](../concepts/actor-lifecycle)
- [Working with storages](../concepts/storages)
- [Handling Actor events](../concepts/actor-events)
- [How to use proxies](../concepts/proxy-management)