59 changes: 0 additions & 59 deletions docs/01_overview/01_introduction.mdx

This file was deleted.

66 changes: 0 additions & 66 deletions docs/01_overview/02_running_actors_locally.mdx

This file was deleted.

35 changes: 0 additions & 35 deletions docs/01_overview/03_actor_structure.mdx

This file was deleted.

163 changes: 163 additions & 0 deletions docs/01_overview/overview.mdx
@@ -0,0 +1,163 @@
---

General note:
Is it possible to move the overview page one level up, so that it is not in the Overview section? It's kinda weird UX-wise.
image

Contributor Author

Yeah, I was afraid that might happen. I think if we rename it to index.md, it should no longer be a separate openable section.

title: Overview
sidebar_label: Overview
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';

The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python.

```python
from apify import Actor
from bs4 import BeautifulSoup
import requests

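async def main():
    async with Actor:
        # Read the Actor input, which is expected to contain a 'url' field.
        actor_input = await Actor.get_input()
        response = requests.get(actor_input['url'])
        soup = BeautifulSoup(response.content, 'html.parser')
        # Store the page URL and title in the default dataset.
        await Actor.push_data({'url': actor_input['url'], 'title': soup.title.string})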
```

## Requirements

The Apify SDK requires Python version 3.8 or above to run Python Actors locally.

The docs previously said that the required version is 3.10 or above. Is this updated version correct?

Contributor Author

Good point, I'll reach out and confirm the required version as client & sdk diverge here


## Installation

The Apify SDK for Python is available as the [`apify`](https://pypi.org/project/apify/) package on PyPI. To install it, run:

```bash
pip install apify
```

When you create an Actor using the Apify CLI, the Apify SDK for Python is installed for you automatically.

If you are not developing Apify Actors and you just need to access the Apify API from Python,
consider using the [Apify API client for Python](https://docs.apify.com/api/client/python) directly.

## Quick start

### Creating Actors

To create and run Actors through Apify Console, refer to the [Console documentation](https://docs.apify.com/academy/getting-started/creating-actors#choose-your-template).

"through Apify Console"
Is through used in this context? I'd rather just use "in".

Suggested change
To create and run Actors through Apify Console, refer to the [Console documentation](https://docs.apify.com/academy/getting-started/creating-actors#choose-your-template).
To create and run Actors in Apify Console, refer to the [Console documentation](https://docs.apify.com/academy/getting-started/creating-actors#choose-your-template).


To create a new Apify Actor on your computer, you can use the [Apify CLI](https://docs.apify.com/cli) and select one of the [Python Actor templates](https://apify.com/templates?category=python).

For example, to create an Actor from the "Getting started with Python" template, you can use the [`apify create` command](https://docs.apify.com/cli/docs/reference#apify-create-actorname).

```bash
apify create my-first-actor --template python-start
```

This will create a new folder called `my-first-actor`, download and extract the "Getting started with Python" Actor template there, create a virtual environment in `my-first-actor/.venv`, and install the Actor dependencies in it.

### Running the Actor

To run the Actor, you can use the [`apify run` command](https://docs.apify.com/cli/docs/reference#apify-run):

```bash
cd my-first-actor
apify run
```

This will activate the virtual environment in `.venv` (if no other virtual environment is activated yet), start the Actor with the environment variables needed for a local run, and configure it to use local storages from the `storage` folder.

The Actor input, for example, will be in `storage/key_value_stores/default/INPUT.json`.
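
As a minimal sketch of preparing that input before a local run, the snippet below writes a JSON file to the path mentioned above. The helper name `write_local_input` and the `url` field used in the usage note are illustrative assumptions, not part of the SDK or the template.

```python
import json
from pathlib import Path


def write_local_input(actor_dir, actor_input):
    """Write the Actor input to the file that a local run reads it from."""
    # Path taken from the text above: storage/key_value_stores/default/INPUT.json
    input_path = Path(actor_dir) / 'storage' / 'key_value_stores' / 'default' / 'INPUT.json'
    # Create the storage folders if they do not exist yet.
    input_path.parent.mkdir(parents=True, exist_ok=True)
    input_path.write_text(json.dumps(actor_input, indent=2), encoding='utf-8')
    return input_path
```

For example, calling `write_local_input('my-first-actor', {'url': 'https://example.com'})` before `apify run` should make that dictionary available through `Actor.get_input()`.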

## Actor structure

All Python Actor templates follow the same structure.

The `.actor` directory contains the [Actor configuration](https://docs.apify.com/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.

The Actor's runtime dependencies are specified in the `requirements.txt` file, which follows the [standard requirements file format](https://pip.pypa.io/en/stable/reference/requirements-file-format/).

The Actor's source code is in the `src` folder. This folder contains two important files:

- `main.py` - which contains the main function of the Actor.
- `__main__.py` - which is the entrypoint of the Actor package. It sets up the Actor [logger](../concepts/logging) and executes the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).

<Tabs>
<TabItem value="main.py" label="main.py" default>
<CodeBlock language="python">{
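`from apify import Actor
${''}
async def main():
    # Entering the context manager initializes the Actor; leaving it exits the Actor.
    async with Actor:
        # Pass the input as a logging argument so that it is formatted into the message.
        Actor.log.info('Actor input: %s', await Actor.get_input())
        # Save the result to the default key-value store under the 'OUTPUT' key.
        await Actor.set_value('OUTPUT', 'Hello, world!')`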
}</CodeBlock>
</TabItem>
    <TabItem value="__main__.py" label="__main__.py">
<CodeBlock language="python">{
`import asyncio
import logging
${''}
from apify.log import ActorLogFormatter
${''}
from .main import main
${''}
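# Format log records with the Apify log formatter.
handler = logging.StreamHandler()
handler.setFormatter(ActorLogFormatter())
${''}
# Show messages from the 'apify' logger down to the DEBUG level.
apify_logger = logging.getLogger('apify')
apify_logger.setLevel(logging.DEBUG)
apify_logger.addHandler(handler)
${''}
# Run the Actor's main coroutine.
asyncio.run(main())`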
}</CodeBlock>
</TabItem>
</Tabs>
Comment on lines +84 to +114

Consider adding some code comments?


If you want to modify the Actor structure, you need to make sure that your Actor is executable as a module, via `python -m src`, as that is the command started by `apify run` in the Apify CLI.
We recommend keeping the entrypoint for the Actor in the `src/__main__.py` file.

## Adding dependencies

First, add them in the [`requirements.txt`](https://pip.pypa.io/en/stable/reference/requirements-file-format/) file in the Actor source folder.

Suggested change
First, add them in the [`requirements.txt`](https://pip.pypa.io/en/stable/reference/requirements-file-format/) file in the Actor source folder.
First, add the dependencies in the [`requirements.txt`](https://pip.pypa.io/en/stable/reference/requirements-file-format/) file in the Actor source folder.


Then activate the virtual environment in `.venv`:

<Tabs groupId="operating-systems">
<TabItem value="unix" label="Linux / macOS" default>
<CodeBlock language="bash">{
`source .venv/bin/activate`
}</CodeBlock>
</TabItem>
<TabItem value="win" label="Windows">
<CodeBlock language="powershell">{
`.venv\\Scripts\\activate`
}</CodeBlock>
</TabItem>
</Tabs>

Then install the dependencies:

Suggested change
Then install the dependencies:
Finally, install the dependencies:


```bash
python -m pip install -r requirements.txt
```
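
To check that a dependency actually ended up in the active virtual environment, one option is to query the installed package metadata from Python. This is only a sketch: the helper name `installed_version` is hypothetical, and it relies on the standard-library `importlib.metadata` module rather than anything in the Apify SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        # The package is not present in the current environment.
        return None
```

Calling `installed_version('apify')` inside the activated `.venv` should return a version string once the installation has succeeded, and `None` otherwise.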

## Next steps

### Guides

To see how you can integrate the Apify SDK with some of the most popular web scraping libraries, check out our guides for working with:

- [Requests or HTTPX](../guides/requests-and-httpx)
- [Beautiful Soup](../guides/beautiful-soup)
- [Playwright](../guides/playwright)
- [Selenium](../guides/selenium)
- [Scrapy](../guides/scrapy)

### Usage concepts

To learn more about the features of the Apify SDK and how to use them, check out the Usage Concepts section in the sidebar, especially the guides for:

- [Actor lifecycle](../concepts/actor-lifecycle)
- [Working with storages](../concepts/storages)
- [Handling Actor events](../concepts/actor-events)
- [How to use proxies](../concepts/proxy-management)