diff --git a/docs/01_overview/01_introduction.mdx b/docs/01_overview/01_introduction.mdx
deleted file mode 100644
index 7e5bc56e..00000000
--- a/docs/01_overview/01_introduction.mdx
+++ /dev/null
@@ -1,59 +0,0 @@
----
-id: introduction
-title: Introduction
----
-
-import RunnableCodeBlock from '@site/src/components/RunnableCodeBlock';
-
-import IntroductionExample from '!!raw-loader!roa-loader!./code/01_introduction.py';
-
-The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) using Python.
-
-
- {IntroductionExample}
-
-
-## What are Actors?
-
-Actors are serverless cloud programs capable of performing tasks in a web browser, similar to what a human can do. These tasks can range from simple operations, such as filling out forms or unsubscribing from services, to complex jobs like scraping and processing large numbers of web pages.
-
-Actors can be executed locally or on the [Apify platform](https://docs.apify.com/platform/), which provides features for running them at scale, monitoring, scheduling, and even publishing and monetizing them.
-
-If you're new to Apify, refer to the Apify platform documentation to learn [what Apify is](https://docs.apify.com/platform/about).
-
-## Quick start
-
-This section provides a quick start guide for creating and running Actors.
-
-### Creating Actors
-
-To create and run Actors using the Apify Console, see the [Console documentation](https://docs.apify.com/platform/console).
-
-For creating and running Python Actors locally, refer to the documentation for [creating and running Python Actors locally](./running-actors-locally).
-
-### Guides
-
-Integrate the Apify SDK with popular web scraping libraries by following these guides:
-- [BeautifulSoup with HTTPX](../guides/beautifulsoup-httpx)
-- [Crawlee](../guides/crawlee)
-- [Playwright](../guides/playwright)
-- [Selenium](../guides/selenium)
-- [Scrapy](../guides/scrapy)
-
-### Usage concepts
-
-For a deeper understanding of the Apify SDK's features, refer to the **Usage concepts** section in the sidebar. Key topics include:
-- [Actor lifecycle](../concepts/actor-lifecycle)
-- [Working with storages](../concepts/storages)
-- [Handling Actor events](../concepts/actor-events)
-- [Using proxies](../concepts/proxy-management)
-
-## Installing the Apify SDK separately
-
-When creating an Actor using the Apify CLI, the Apify SDK for Python is installed automatically. If you want to install it independently, use the following command:
-
-```bash
-pip install apify
-```
-
-If your goal is not to develop Apify Actors but to interact with the Apify API from Python, consider using the [Apify API client for Python](https://docs.apify.com/api/client/python) directly.
diff --git a/docs/01_overview/02_running_actors_locally.mdx b/docs/01_overview/02_running_actors_locally.mdx
deleted file mode 100644
index 40a795a7..00000000
--- a/docs/01_overview/02_running_actors_locally.mdx
+++ /dev/null
@@ -1,66 +0,0 @@
----
-id: running-actors-locally
-title: Running Actors locally
----
-
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-import CodeBlock from '@theme/CodeBlock';
-
-In this page, you'll learn how to create and run Apify Actors locally on your computer.
-
-## Requirements
-
-The Apify SDK requires Python version 3.10 or above to run Python Actors locally.
-
-## Creating your first Actor
-
-To create a new Apify Actor on your computer, you can use the [Apify CLI](https://docs.apify.com/cli), and select one of the [Python Actor templates](https://apify.com/templates/categories/python).
-
-For example, to create an Actor from the Python SDK template, you can use the [`apify create`](https://docs.apify.com/cli/docs/reference#apify-create-actorname) command.
-
-```bash
-apify create my-first-actor --template python-start
-```
-
-This will create a new folder called `my-first-actor`, download and extract the "Getting started with Python" Actor template there, create a virtual environment in `my-first-actor/.venv`, and install the Actor dependencies in it.
-
-## Running the Actor
-
-To run the Actor, you can use the [`apify run`](https://docs.apify.com/cli/docs/reference#apify-run) command:
-
-```bash
-cd my-first-actor
-apify run
-```
-
-This will activate the virtual environment in `.venv` (if no other virtual environment is activated yet), then start the Actor, passing the right environment variables for local running, and configure it to use local storages from the `storage` folder.
-
-The Actor input, for example, will be in `storage/key_value_stores/default/INPUT.json`.
-
-## Adding dependencies
-
-Adding dependencies into the Actor is simple.
-
-First, add them in the [`requirements.txt`](https://pip.pypa.io/en/stable/reference/requirements-file-format/) file in the Actor source folder.
-
-Then activate the virtual environment in `.venv`:
-
-
- {
-`source .venv/bin/activate`
- }
-
-
- {
-`.venv\\Scripts\\activate`
- }
-
-
-
-Then install the dependencies:
-
-```bash
-python -m pip install -r requirements.txt
-```
diff --git a/docs/01_overview/03_actor_structure.mdx b/docs/01_overview/03_actor_structure.mdx
deleted file mode 100644
index 1cff2661..00000000
--- a/docs/01_overview/03_actor_structure.mdx
+++ /dev/null
@@ -1,35 +0,0 @@
----
-id: actor-structure
-title: Actor structure
----
-
-import Tabs from '@theme/Tabs';
-import TabItem from '@theme/TabItem';
-import CodeBlock from '@theme/CodeBlock';
-
-import UnderscoreMainExample from '!!raw-loader!./code/actor_structure/__main__.py';
-import MainExample from '!!raw-loader!./code/actor_structure/main.py';
-
-All Python Actor templates follow the same structure.
-
-The `.actor/` directory contains the [Actor configuration](https://docs.apify.com/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.
-
-The Actor's runtime dependencies are specified in the `requirements.txt` file,
-which follows the [standard requirements file format](https://pip.pypa.io/en/stable/reference/requirements-file-format/).
-
-The Actor's source code is in the `src/` folder. This folder contains two important files: `main.py`, which contains the main function of the Actor, and `__main__.py`, which is the entrypoint of the Actor package, setting up the Actor [logger](../concepts/logging) and executing the Actor's main function via [`asyncio.run`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).
-
-
-
- {UnderscoreMainExample}
-
-
-
-
- {MainExample}
-
-
-
-
-If you want to modify the Actor structure, you need to make sure that your Actor is executable as a module, via `python -m src`, as that is the command started by `apify run` in the Apify CLI.
-We recommend keeping the entrypoint for the Actor in the `src/__main__.py` file.
diff --git a/docs/01_overview/overview.mdx b/docs/01_overview/overview.mdx
new file mode 100644
index 00000000..b64c74d5
--- /dev/null
+++ b/docs/01_overview/overview.mdx
@@ -0,0 +1,163 @@
+---
+title: Overview
+sidebar_label: Overview
+---
+
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+import CodeBlock from '@theme/CodeBlock';
+
+The Apify SDK for Python is the official library for creating [Apify Actors](https://docs.apify.com/platform/actors) in Python.
+
+```python
+from apify import Actor
+from bs4 import BeautifulSoup
+import requests
+
+async def main():
+    async with Actor:
+        input = await Actor.get_input()
+        response = requests.get(input['url'])
+        soup = BeautifulSoup(response.content, 'html.parser')
+        await Actor.push_data({ 'url': input['url'], 'title': soup.title.string })
+```
+
+## Requirements
+
+The Apify SDK requires Python version 3.8 or above to run Python Actors locally.
+
+## Installation
+
+The Apify Python SDK is available as the [`apify`](https://pypi.org/project/apify/) package on PyPI. To install it, run:
+
+```bash
+pip install apify
+```
+
+When you create an Actor using the Apify CLI, the Apify SDK for Python is installed for you automatically.
+
+If you are not developing Apify Actors and you just need to access the Apify API from Python,
+consider using the [Apify API client for Python](https://docs.apify.com/api/client/python) directly.
+
+## Quick start
+
+### Creating Actors
+
+To create and run Actors through Apify Console, refer to the [Console documentation](https://docs.apify.com/academy/getting-started/creating-actors#choose-your-template).
+
+To create a new Apify Actor on your computer, you can use the [Apify CLI](https://docs.apify.com/cli), and select one of the [Python Actor templates](https://apify.com/templates?category=python).
+
+For example, to create an Actor from the "Getting started with Python" template, you can use the [`apify create` command](https://docs.apify.com/cli/docs/reference#apify-create-actorname).
+
+```bash
+apify create my-first-actor --template python-start
+```
+
+This will create a new folder called `my-first-actor`, download and extract the "Getting started with Python" Actor template there, create a virtual environment in `my-first-actor/.venv`, and install the Actor dependencies in it.
+
+### Running the Actor
+
+To run the Actor, you can use the [`apify run` command](https://docs.apify.com/cli/docs/reference#apify-run):
+
+```bash
+cd my-first-actor
+apify run
+```
+
+This will activate the virtual environment in `.venv` (if no other virtual environment is activated yet), then start the Actor, passing the right environment variables for local running, and configure it to use local storages from the `storage` folder.
+
+The Actor input, for example, will be in `storage/key_value_stores/default/INPUT.json`.
+
+## Actor structure
+
+All Python Actor templates follow the same structure.
+
+The `.actor` directory contains the [Actor configuration](https://docs.apify.com/platform/actors/development/actor-config), such as the Actor's definition and input schema, and the Dockerfile necessary to run the Actor on the Apify platform.
+
+The Actor's runtime dependencies are specified in the `requirements.txt` file, which follows the [standard requirements file format](https://pip.pypa.io/en/stable/reference/requirements-file-format/).
+
+The Actor's source code is in the `src` folder. This folder contains two important files:
+
+- `main.py` - which contains the main function of the Actor
+- `__main__.py` - which is the entrypoint of the Actor package, setting up the Actor [logger](../concepts/logging) and executing the Actor's main function via [`asyncio.run()`](https://docs.python.org/3/library/asyncio-runner.html#asyncio.run).
+
+
+ {
+`from apify import Actor
+${''}
+async def main():
+    async with Actor:
+        Actor.log.info('Actor input: %s', await Actor.get_input())
+        await Actor.set_value('OUTPUT', 'Hello, world!')`
+ }
+
+
+ {
+`import asyncio
+import logging
+${''}
+from apify.log import ActorLogFormatter
+${''}
+from .main import main
+${''}
+handler = logging.StreamHandler()
+handler.setFormatter(ActorLogFormatter())
+${''}
+apify_logger = logging.getLogger('apify')
+apify_logger.setLevel(logging.DEBUG)
+apify_logger.addHandler(handler)
+${''}
+asyncio.run(main())`
+ }
+
+
+
+If you want to modify the Actor structure, you need to make sure that your Actor is executable as a module, via `python -m src`, as that is the command started by `apify run` in the Apify CLI.
+We recommend keeping the entrypoint for the Actor in the `src/__main__.py` file.
+
+## Adding dependencies
+
+First, add the dependencies to the [`requirements.txt`](https://pip.pypa.io/en/stable/reference/requirements-file-format/) file in the Actor source folder.
+
+Then activate the virtual environment in `.venv`:
+
+
+ {
+`source .venv/bin/activate`
+ }
+
+
+ {
+`.venv\\Scripts\\activate`
+ }
+
+
+
+Then install the dependencies:
+
+```bash
+python -m pip install -r requirements.txt
+```
+
+## Next steps
+
+### Guides
+
+To see how you can integrate the Apify SDK with some of the most popular web scraping libraries, check out our guides for working with:
+
+- [Requests or HTTPX](../guides/requests-and-httpx)
+- [Beautiful Soup](../guides/beautiful-soup)
+- [Playwright](../guides/playwright)
+- [Selenium](../guides/selenium)
+- [Scrapy](../guides/scrapy)
+
+### Usage concepts
+
+To learn more about the features of the Apify SDK and how to use them, check out the Usage Concepts section in the sidebar, especially the guides for:
+
+- [Actor lifecycle](../concepts/actor-lifecycle)
+- [Working with storages](../concepts/storages)
+- [Handling Actor events](../concepts/actor-events)
+- [How to use proxies](../concepts/proxy-management)