2 changes: 1 addition & 1 deletion sources/academy/platform/getting_started/actors.md
Original file line number Diff line number Diff line change
@@ -37,7 +37,7 @@ After the Actor has completed its run (you'll know this when you see **SEO audit

## The "Actors" tab {#actors-tab}

While still on the platform, click on the tab with the **< >** icon which says **Actors**. This tab is your one-stop-shop for seeing which Actors you've used recently, and which ones you've developed yourself. You will be frequently using this tab when developing and testing on the Apify platform.
While still on the platform, click on the tab with the **&lt;/&gt;** icon which says **Actors**. This tab is your one-stop-shop for seeing which Actors you've used recently, and which ones you've developed yourself. You will be frequently using this tab when developing and testing on the Apify platform.

![The "Actors" tab on the Apify platform](./images/actors-tab.jpg)

2 changes: 1 addition & 1 deletion sources/academy/platform/getting_started/apify_client.md
Original file line number Diff line number Diff line change
@@ -218,7 +218,7 @@ print(items)

## Updating an Actor {#updating-actor}

If you check the **Settings** tab within your **adding-actor**, you'll notice that the default memory being allocated to the Actor is **2048 MB**. This is a bit overkill considering the fact that the Actor is only adding two numbers together - **256 MB** would be much more reasonable. Also, we can safely say that the run should never take more than 20 seconds (even this is a generous number) and that the default of 3600 seconds is also overkill.
If you check the **Settings** tab within your **adding-actor**, you'll notice that the Actor's default timeout is set to **360 seconds**. This is overkill, considering that the Actor only adds two numbers together: the run should never take more than 20 seconds (and even that is generous). The default memory allocated to the Actor is **256 MB**, which is reasonable for our purposes.

Let's change these two Actor settings via the Apify client using the [`actor.update()`](/api/client/js/reference/class/ActorClient#update) function. This function will call the **update Actor** endpoint, which can take `defaultRunOptions` as an input property. You can find the shape of the `defaultRunOptions` in the [API documentation](/api/v2/act-put). Perfect!
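
As a rough illustration of what the **update Actor** request body could look like, here is a minimal Python sketch that only builds the payload and sends nothing. The field names `timeoutSecs` and `memoryMbytes` are assumptions about the shape of `defaultRunOptions`, and `build_update_payload` is a hypothetical helper, so check both against the linked API documentation before relying on them.

```python
import json


def build_update_payload(timeout_secs: int, memory_mbytes: int) -> dict:
    """Build a request body for the update Actor endpoint (hypothetical helper).

    The keys inside "defaultRunOptions" are assumed, not confirmed by this lesson.
    """
    if timeout_secs <= 0 or memory_mbytes <= 0:
        raise ValueError("timeout and memory must be positive")
    return {
        "defaultRunOptions": {
            "timeoutSecs": timeout_secs,
            "memoryMbytes": memory_mbytes,
        }
    }


# 20 seconds and 256 MB are the values discussed in the text above.
payload = build_update_payload(timeout_secs=20, memory_mbytes=256)
print(json.dumps(payload, indent=2))
```

Keeping the payload construction separate from the network call makes it easy to inspect the values before they are sent with whichever client you use.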

Original file line number Diff line number Diff line change
@@ -15,7 +15,7 @@ You can create an Actor in several ways. You can create one from your own source

## Choose the source {#choose-the-source}

Once you're in Apify Console, go to [Development](https://console.apify.com/actors/development/my-actors), and click on the **Develop new** button in the top right-hand corner.
Once you're in Apify Console, go to [Actors](https://console.apify.com/actors), and click on the **Develop new** button in the top right-hand corner.

![Develop an Actor button](./images/develop-new-actor.png)

@@ -169,12 +169,6 @@ After the Actor finishes, you can preview or download the extracted data by clic

And that's it! You've just created your first Actor and extracted data from a website 🎉.

## Getting stuck? Check out the tips 💡 {#get-help-with-tips}

If you ever get stuck, you can always click on the **Tips** button in the top right corner of the page. It will show you a list of tips that are relevant to the Actor development.

![Tips](./images/actor-tips.png)

## Next up {#next}

We've created an Actor, but how can we give it more complex inputs and make it do stuff based on these inputs? This is exactly what we'll be discussing in the [next lesson](./inputs_outputs.md)'s activity.
9 changes: 5 additions & 4 deletions sources/academy/platform/getting_started/inputs_outputs.md
Original file line number Diff line number Diff line change
@@ -71,23 +71,24 @@ Finally, **Save** and **Build** the Actor just as you did in the previous lesson

## Configuring an Actor with inputs {#configuring}

If you scroll down a bit, you'll find the **Developer console** located under the multifile editor. By default, after running a build, the **Last build** tab will be selected, where you can see all of the logs related to building the Actor. Inputs can be configured within the **Input** tab.
By default, after running a build, the **Last build** tab will be selected, where you can see all of the logs related to building the Actor. Inputs can be configured within the **Input** tab.

![Configuring inputs](./images/configure-inputs.jpg)

Enter any two numbers you'd like, then press **Start**. The Actor's run should be completed almost immediately.

## View Actor results {#view-results}

Since we've pushed the result into the default dataset, it, and some info about it can be viewed by clicking this box, which will take you to the results tab:
Since we've pushed the result into the default dataset, both the result and some info about it can be viewed in two places inside the Last Run tab:

![Result box](./images/result-box.png)
1. **Export** button
2. **Storage** &rarr; **Dataset** (scroll below the main view)

On the results tab, there are a whole lot of options for which format to view/download the data in. Keep the default of **JSON** selected, and click on **Preview**.

![Dataset preview](./images/dataset-preview.png)

There's our solution! Did it work for you as well? Now, we can download the data right from the results tab to be used elsewhere, or even programmatically retrieve it by using [Apify's API](/api/v2) (we'll be discussing how to do this in the next lesson).
There's our solution! Did it work for you as well? Now, we can download the data right from the Dataset tab to be used elsewhere, or even programmatically retrieve it by using [Apify's API](/api/v2) (we'll be discussing how to do this in the next lesson).

It's important to note that the default dataset of the Actor, which we pushed our solution to, will be retained for 7 days. If we wanted the data to be retained for an indefinite period of time, we'd have to use a named dataset. For more information about named storages vs unnamed storages, read a bit about [data retention on the Apify platform](/platform/storage/usage#data-retention).

Original file line number Diff line number Diff line change
@@ -73,7 +73,7 @@ Remember that each key can only be used once per Git hosting service (GitHub, Bi

### Actor monorepos

To manage multiple Actors in a single repository, use the `dockerContextDix` property in the [Actor definition](/platform/actors/development/actor-definition/actor-json) to set the Docker context directory (if not provided then the repository root is used). In the Dockerfile, copy both the Actor's source and any shared code into the Docker image.
To manage multiple Actors in a single repository, use the `dockerContextDir` property in the [Actor definition](/platform/actors/development/actor-definition/actor-json) to set the Docker context directory (if not provided then the repository root is used). In the Dockerfile, copy both the Actor's source and any shared code into the Docker image.

To enable sharing Dockerfiles between multiple Actors, the Actor build process passes the `ACTOR_PATH_IN_DOCKER_CONTEXT` build argument to the Docker build.
It contains the relative path from `dockerContextDir` to the directory selected as the root of the Actor in the Apify Console (the "directory" part of the Actor's git URL).
6 changes: 3 additions & 3 deletions sources/platform/actors/running/usage_and_resources.md
Original file line number Diff line number Diff line change
@@ -54,7 +54,7 @@ The Actor has hard disk space limited by twice the amount of memory. For example

## Requirements

Actors built with [Crawlee](https://crawlee.dev/) use autoscaling. This means that they will always run as efficiently as they can based on the allocated memory. If you double the allocated memory, the run should be twice as fast and consume the same amount of compute units (1 * 1 = 0.5 * 2).
Actors built with [Crawlee](https://crawlee.dev/) use autoscaling. This means that they will always run as efficiently as they can based on the allocated memory. If you double the allocated memory, the run should be twice as fast and consume the same amount of [compute units](#what-is-a-compute-unit) (1 * 1 = 0.5 * 2).

A good middle ground is `4096MB`. If you need the results faster, increase the memory (bear in mind the [next point](#maximum-memory), though). You can also try decreasing it to lower the pressure on the target site.

@@ -65,7 +65,7 @@ Autoscaling only applies to solutions that run multiple tasks (URLs) for at leas
[//]: # (If you read that you can scrape 1000 pages of data for 1 CU and you want to scrape approximately 2 million of them monthly, that means you need 2000 CUs monthly and should [subscribe to the Business plan]&#40;https://console.apify.com/billing-new#/subscription&#41;.)


If the Actor doesn't have this information, or you want to use your own solution, just run your solution like you want to use it long term. Let's say that you want to scrape the data **every hour for the whole month**. You set up a reasonable memory allocation like `4096MB`, and the whole run takes 15 minutes. That should consume 1 CU (4 \* 0.25 = 1). Now, you just need to multiply that by the number of hours in the day and by the number of days in the month, and you get an estimated usage of 720 (1 \* 24 \* 30) CUs monthly.
If the Actor doesn't have this information, or you want to use your own solution, just run your solution like you want to use it long term. Let's say that you want to scrape the data **every hour for the whole month**. You set up a reasonable memory allocation like `4096MB`, and the whole run takes 15 minutes. That should consume 1 CU (4 \* 0.25 = 1). Now, you just need to multiply that by the number of hours in the day and by the number of days in the month, and you get an estimated usage of 720 (1 \* 24 \* 30) [compute units](#what-is-a-compute-unit) monthly.
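
The estimate above can be reproduced with a short Python sketch. It assumes one compute unit corresponds to 1 GB of memory used for 1 hour, which is what the `4 * 0.25 = 1` arithmetic implies; `compute_units` is a hypothetical helper name, not part of any Apify library.

```python
def compute_units(memory_mb: int, duration_hours: float) -> float:
    """Estimate compute units as memory in GB multiplied by run time in hours."""
    return (memory_mb / 1024) * duration_hours


# One run: 4096 MB for 15 minutes (0.25 hours).
per_run = compute_units(4096, 0.25)

# Scheduled every hour for a 30-day month.
monthly = per_run * 24 * 30

print(per_run, monthly)

# Doubling memory while halving run time leaves the estimate unchanged.
print(compute_units(1024, 1.0) == compute_units(2048, 0.5))
```

Under this assumption a single run costs 1 CU and the monthly schedule costs 720 CUs, matching the figures in the text.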

:::tip Estimating usage

@@ -97,7 +97,7 @@ It's possible to [use multiple threads in Node.js-based Actor](https://dev.to/re

When you run an Actor it generates platform usage that's charged to the user account. Platform usage comprises four main parts:

- **Compute units**: CPU and memory resources consumed by the Actor.
- **[Compute units](#what-is-a-compute-unit)**: CPU and memory resources consumed by the Actor.
- **Data transfer**: The amount of data transferred between the web, Apify platform, and other external systems.
- **Proxy costs**: Residential or SERP proxy usage.
- **Storage operations**: Read, write, and other operations performed on the Key-value store, Dataset, and Request queue.