Merged
30 commits
be7dcf3  refactor: put old course to legacy folder (honzajavorek, Oct 15, 2025)
4e40c1f  refactor: update links to target the legacy folder (honzajavorek, Oct 15, 2025)
e244386  refactor: put new course to JS folder (honzajavorek, Oct 15, 2025)
586a3e7  chore: prepare redirects (honzajavorek, Sep 8, 2025)
faec8c6  feat: put the new course to the /scraping-basics-javascript/ URL (honzajavorek, Sep 8, 2025)
8cc5d97  fix: edit links to the JS course (honzajavorek, Sep 8, 2025)
d5be426  feat: change URLs of the legacy JS course (honzajavorek, Sep 8, 2025)
5d7613d  fix: leftover, correct URL for the course root (honzajavorek, Sep 8, 2025)
89153f5  feat: redirects from original URLs to the new JS course's lessons, wi… (honzajavorek, Sep 8, 2025)
d1449e3  feat: set new sidebar position (honzajavorek, Sep 8, 2025)
9944a4a  feat: unlist the legacy JS course (honzajavorek, Sep 8, 2025)
49bc198  feat: implement and use the LegacyJsCourseAdmonition component (honzajavorek, Oct 14, 2025)
17d4cda  fix: pretend that this is an updated example output (honzajavorek, Oct 14, 2025)
46b760e  fix: update various links leading to the old course (honzajavorek, Oct 15, 2025)
b9e1c0b  feat: denote that the old course is old (honzajavorek, Oct 15, 2025)
477a6dc  fix: do not set the old JS course as unlisted, set as noindex instead (honzajavorek, Oct 15, 2025)
5d3e89f  feat: publish the new JS course (honzajavorek, Oct 15, 2025)
edc0f60  style: make linters happier (honzajavorek, Oct 15, 2025)
a4e9a76  feat: add admonition to all old course pages (honzajavorek, Oct 15, 2025)
694df54  style: make Vale happy about Gzip (honzajavorek, Oct 16, 2025)
60fe380  style: make Vale happy about H1s (honzajavorek, Oct 16, 2025)
88d2b3e  style: make Vale happy about H1s (honzajavorek, Oct 16, 2025)
f863858  style: make Markdown lint happy (honzajavorek, Oct 16, 2025)
0319233  fix: typo (honzajavorek, Nov 21, 2025)
a02247e  style: raw markdown clarity (honzajavorek, Nov 21, 2025)
071a6f7  style: raw markdown clarity (honzajavorek, Nov 21, 2025)
b521a2b  style: raw markdown clarity (honzajavorek, Nov 21, 2025)
d4d9790  style: don't use question marks in headings (honzajavorek, Nov 21, 2025)
99c7c23  style: don't title case headings (consistency) (honzajavorek, Nov 21, 2025)
0164c83  fix: remove leftover from git rebase (honzajavorek, Nov 21, 2025)
32 changes: 30 additions & 2 deletions nginx.conf
@@ -534,8 +534,36 @@ server {
rewrite ^/platform/actors/development/source-code$ /platform/actors/development/deployment/source-types redirect;

# Academy restructuring
rewrite ^academy/advanced-web-scraping/scraping-paginated-sites$ /academy/advanced-web-scraping/crawling/crawling-with-search permanent;
rewrite ^academy/php$ /academy/php/use-apify-from-php redirect; # not permanent in case we want to reuse /php in the future
rewrite ^/academy/advanced-web-scraping/scraping-paginated-sites$ /academy/advanced-web-scraping/crawling/crawling-with-search permanent;
rewrite ^/academy/php$ /academy/php/use-apify-from-php redirect; # not permanent in case we want to reuse /php in the future

# Academy: replacing the 'Web Scraping for Beginners' course
rewrite ^/academy/web-scraping-for-beginners/best-practices$ /academy/scraping-basics-javascript?legacy-js-course=/best-practices permanent;
rewrite ^/academy/web-scraping-for-beginners/introduction$ /academy/scraping-basics-javascript?legacy-js-course=/introduction permanent;
rewrite ^/academy/web-scraping-for-beginners/challenge/initializing-and-setting-up$ /academy/scraping-basics-javascript?legacy-js-course=/challenge/initializing-and-setting-up permanent;
rewrite ^/academy/web-scraping-for-beginners/challenge/modularity$ /academy/scraping-basics-javascript?legacy-js-course=/challenge/modularity permanent;
rewrite ^/academy/web-scraping-for-beginners/challenge/scraping-amazon$ /academy/scraping-basics-javascript?legacy-js-course=/challenge/scraping-amazon permanent;
rewrite ^/academy/web-scraping-for-beginners/challenge$ /academy/scraping-basics-javascript?legacy-js-course=/challenge permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/exporting-data$ /academy/scraping-basics-javascript/framework?legacy-js-course=/crawling/exporting-data permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/filtering-links$ /academy/scraping-basics-javascript/getting-links?legacy-js-course=/crawling/filtering-links permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/finding-links$ /academy/scraping-basics-javascript/getting-links?legacy-js-course=/crawling/finding-links permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/first-crawl$ /academy/scraping-basics-javascript/crawling?legacy-js-course=/crawling/first-crawl permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/headless-browser$ /academy/scraping-basics-javascript?legacy-js-course=/crawling/headless-browser permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/pro-scraping$ /academy/scraping-basics-javascript/framework?legacy-js-course=/crawling/pro-scraping permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/recap-extraction-basics$ /academy/scraping-basics-javascript/extracting-data?legacy-js-course=/crawling/recap-extraction-basics permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/relative-urls$ /academy/scraping-basics-javascript/getting-links?legacy-js-course=/crawling/relative-urls permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling/scraping-the-data$ /academy/scraping-basics-javascript/scraping-variants?legacy-js-course=/crawling/scraping-the-data permanent;
rewrite ^/academy/web-scraping-for-beginners/crawling$ /academy/scraping-basics-javascript/crawling?legacy-js-course=/crawling permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/browser-devtools$ /academy/scraping-basics-javascript/devtools-inspecting?legacy-js-course=/data-extraction/browser-devtools permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/computer-preparation$ /academy/scraping-basics-javascript/downloading-html?legacy-js-course=/data-extraction/computer-preparation permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/devtools-continued$ /academy/scraping-basics-javascript/devtools-extracting-data?legacy-js-course=/data-extraction/devtools-continued permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/node-continued$ /academy/scraping-basics-javascript/extracting-data?legacy-js-course=/data-extraction/node-continued permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/node-js-scraper$ /academy/scraping-basics-javascript/downloading-html?legacy-js-course=/data-extraction/node-js-scraper permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/project-setup$ /academy/scraping-basics-javascript/downloading-html?legacy-js-course=/data-extraction/project-setup permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/save-to-csv$ /academy/scraping-basics-javascript/saving-data?legacy-js-course=/data-extraction/save-to-csv permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction/using-devtools$ /academy/scraping-basics-javascript/devtools-locating-elements?legacy-js-course=/data-extraction/using-devtools permanent;
rewrite ^/academy/web-scraping-for-beginners/data-extraction$ /academy/scraping-basics-javascript/devtools-inspecting?legacy-js-course=/data-extraction permanent;
rewrite ^/academy/web-scraping-for-beginners$ /academy/scraping-basics-javascript?legacy-js-course=/ permanent;

# Removed pages
# GPT plugins were discontinued April 9th, 2024 - https://help.openai.com/en/articles/8988022-winding-down-the-chatgpt-plugins-beta
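
For reference, one way to sanity-check a few of these redirects once the config is deployed is to request an old URL without following redirects and inspect the response. The snippet below is a rough sketch only: it assumes the site is served at https://docs.apify.com and Node.js 18+ with built-in fetch (run it as an ES module); the exact Location value depends on how nginx resolves the rewrite.

```js
// Hedged sketch: check that an old course URL answers with a permanent redirect.
// The docs.apify.com host is an assumption about where this nginx.conf is deployed.
const oldUrl = 'https://docs.apify.com/academy/web-scraping-for-beginners/crawling/pro-scraping';

const response = await fetch(oldUrl, { redirect: 'manual' });

console.log(response.status);
// expected: 301, because the rewrite above uses `permanent`
console.log(response.headers.get('location'));
// expected to end with:
// /academy/scraping-basics-javascript/framework?legacy-js-course=/crawling/pro-scraping
```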
@@ -1,5 +1,5 @@
---
title: Robotic process automation
title: What is robotic process automation (RPA)
description: Learn the basics of robotic process automation. Make your processes on the web and other software more efficient by automating repetitive tasks.
sidebar_position: 8.7
slug: /concepts/robotic-process-automation
@@ -29,7 +29,7 @@ With the advance of [machine learning](https://en.wikipedia.org/wiki/Machine_lea

## Is RPA the same as web scraping? {#is-rpa-the-same-as-web-scraping}

While [web scraping](../../webscraping/scraping_basics_javascript/index.md) is a kind of RPA, it focuses on extracting structured data. RPA focuses on the other tasks in browsers - everything except for extracting information.
While web scraping is a kind of RPA, it focuses on extracting structured data. RPA focuses on the other tasks in browsers - everything except for extracting information.

## Additional resources {#additional-resources}

2 changes: 1 addition & 1 deletion sources/academy/glossary/tools/apify_cli.md
@@ -13,7 +13,7 @@ The [Apify CLI](/cli) helps you create, develop, build and run Apify Actors, and

## Installing {#installing}

To install the Apify CLI, you'll first need npm, which comes preinstalled with Node.js. If you haven't yet installed Node, [learn how to do that](../../webscraping/scraping_basics_javascript/data_extraction/computer_preparation.md). Additionally, make sure you've got an Apify account, as you will need to log in to the CLI to gain access to its full potential.
To install the Apify CLI, you'll first need npm, which comes preinstalled with Node.js. Additionally, make sure you've got an Apify account, as you will need to log in to the CLI to gain access to its full potential.

Open up a terminal instance and run the following command:

2 changes: 1 addition & 1 deletion sources/academy/homepage_content.json
@@ -2,7 +2,7 @@
"Beginner courses": [
{
"title": "Web scraping basics with JS",
"link": "/academy/web-scraping-for-beginners",
"link": "/academy/scraping-basics-javascript",
"description": "Learn how to use JavaScript to extract information from websites in this practical course, starting from the absolute basics.",
"imageUrl": "/img/academy/scraping-basics-javascript.svg"
},
@@ -1,19 +1,26 @@
---
title: I - Webhooks & advanced Actor overview
title: Webhooks & advanced Actor overview
description: Learn more advanced details about Actors, how they work, and the default configurations they can take. Also, learn how to integrate your Actor with webhooks.
sidebar_position: 6.1
sidebar_label: I - Webhooks & advanced Actor overview
slug: /expert-scraping-with-apify/actors-webhooks
---

**Learn more advanced details about Actors, how they work, and the default configurations they can take. Also, learn how to integrate your Actor with webhooks.**

:::caution Updates coming

This lesson is subject to change because it currently relies on code from our archived **Web scraping basics for JavaScript devs** course. For now you can still access the archived course, but we plan to completely retire it in a few months. This lesson will be updated to remove the dependency.

:::

---

Thus far, you've run Actors on the platform and written an Actor of your own, which you published to the platform yourself using the Apify CLI; therefore, it's fair to say that you are becoming more familiar and comfortable with the concept of **Actors**. Within this lesson, we'll take a more in-depth look at Actors and what they can do.

## Advanced Actor overview {#advanced-actors}

In this course, we'll be working out of the Amazon scraper project from the **Web scraping basics for JavaScript devs** course. If you haven't already built that project, you can do it in [three short lessons](../../webscraping/scraping_basics_javascript/challenge/index.md). We've made a few small modifications to the project with the Apify SDK, but 99% of the code is still the same.
In this course, we'll be working out of the Amazon scraper project from the **Web scraping basics for JavaScript devs** course. If you haven't already built that project, you can do it in [three short lessons](../../webscraping/scraping_basics_legacy/challenge/index.md). We've made a few small modifications to the project with the Apify SDK, but 99% of the code is still the same.

Take another look at the files within your Amazon scraper project. You'll notice that there is a **Dockerfile**. Every single Actor has a Dockerfile (the Actor's **Image**) which tells Docker how to spin up a container on the Apify platform which can successfully run the Actor's code. "Apify Actors" is a serverless platform that runs multiple Docker containers. For a deeper understanding of Actor Dockerfiles, refer to the [Apify Actor Dockerfile docs](/sdk/js/docs/guides/docker-images#example-dockerfile).

@@ -39,7 +46,7 @@ Prior to moving forward, please read over these resources:

## Our task {#our-task}

In this task, we'll be building on top of what we already created in the [Web scraping basics for JavaScript devs](/academy/web-scraping-for-beginners/challenge) course's final challenge, so keep those files safe!
In this task, we'll be building on top of what we already created in the [Web scraping basics for JavaScript devs](../../webscraping/scraping_basics_legacy/challenge/index.md) course's final challenge, so keep those files safe!

Once our Amazon Actor has completed its run, we will, rather than sending an email to ourselves, call an Actor through a webhook. The Actor called will be a new Actor that we will create together, which will take the dataset ID as input, then subsequently filter through all of the results and return only the cheapest one for each product. All of the results of the Actor will be pushed to its default dataset.
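
As a rough illustration of the filtering step described above, the sketch below keeps only the cheapest dataset item per product with the Apify SDK. The `datasetId` input field and the `asin`/`price` item fields are assumptions made for the example, not the exact schema used in the course.

```js
import { Actor } from 'apify';

await Actor.init();

// Assumed input shape: { "datasetId": "<ID of the Amazon scraper's dataset>" }
const { datasetId } = await Actor.getInput();
const dataset = await Actor.openDataset(datasetId);
const { items } = await dataset.getData();

// Keep only the cheapest offer per product (keyed here by an assumed `asin` field).
const cheapest = new Map();
for (const item of items) {
    const current = cheapest.get(item.asin);
    if (!current || item.price < current.price) cheapest.set(item.asin, item);
}

await Actor.pushData([...cheapest.values()]);
await Actor.exit();
```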

@@ -1,7 +1,8 @@
---
title: IV - Apify API & client
title: Apify API & client
description: Gain an in-depth understanding of the two main ways of programmatically interacting with the Apify platform - through the API, and through a client.
sidebar_position: 6.4
sidebar_label: IV - Apify API & client
slug: /expert-scraping-with-apify/apify-api-and-client
---

@@ -1,7 +1,8 @@
---
title: VI - Bypassing anti-scraping methods
title: Bypassing anti-scraping methods
description: Learn about bypassing anti-scraping methods using proxies and proxy/session rotation together with Crawlee and the Apify SDK.
sidebar_position: 6.6
sidebar_label: VI - Bypassing anti-scraping methods
slug: /expert-scraping-with-apify/bypassing-anti-scraping
---

@@ -18,13 +18,9 @@ Before developing a pro-level Apify scraper, there are some important things you

> If you've already gone through the [Web scraping basics for JavaScript devs](../../webscraping/scraping_basics_javascript/index.md) and the first courses of the [Apify platform category](../apify_platform.md), you will be more than well equipped to continue on with the lessons in this course.

<!-- ### Puppeteer/Playwright {#puppeteer-playwright}

[Puppeteer](https://pptr.dev/) is a library for running and controlling a [headless browser](../../webscraping/scraping_basics_javascript/crawling/headless_browser.md) in Node.js, and was developed at Google. The team working on it was hired by Microsoft to work on the [Playwright](https://playwright.dev/) project; therefore, many parallels can be seen between both the `puppeteer` and `playwright` packages. Proficiency in at least one of these will be good enough. -->

### Crawlee, Apify SDK, and the Apify CLI {#crawlee-apify-sdk-and-cli}

If you're feeling ambitious, you don't need to have any prior experience with Crawlee to get started with this course; however, at least 5–10 minutes of exposure is recommended. If you haven't yet tried out Crawlee, you can refer to [this lesson](../../webscraping/scraping_basics_javascript/crawling/pro_scraping.md) in the **Web scraping basics for JavaScript devs** course (and ideally follow along). To familiarize yourself with the Apify SDK, you can refer to the [Apify Platform](../apify_platform.md) category.
If you're feeling ambitious, you don't need to have any prior experience with Crawlee to get started with this course; however, at least 5–10 minutes of exposure is recommended. If you haven't yet tried out Crawlee, you can refer to the [Using a scraping framework with Node.js](../../webscraping/scraping_basics_javascript/12_framework.md) lesson of the **Web scraping basics for JavaScript devs** course. To familiarize yourself with the Apify SDK, you can refer to the [Apify Platform](../apify_platform.md) category.

The Apify CLI will play a core role in the running and testing of the Actor you will build, so if you haven't gotten it installed already, please refer to [this short lesson](../../glossary/tools/apify_cli.md).

@@ -1,7 +1,8 @@
---
title: II - Managing source code
title: Managing source code
description: Learn how to manage your Actor's source code more efficiently by integrating it with a GitHub repository. This is standard on the Apify platform.
sidebar_position: 6.2
sidebar_label: II - Managing source code
slug: /expert-scraping-with-apify/managing-source-code
---

@@ -1,7 +1,8 @@
---
title: V - Migrations & maintaining state
title: Migrations & maintaining state
description: Learn about what Actor migrations are and how to handle them properly so that the state is not lost and runs can safely be resurrected.
sidebar_position: 6.5
sidebar_label: V - Migrations & maintaining state
slug: /expert-scraping-with-apify/migrations-maintaining-state
---

@@ -1,7 +1,8 @@
---
title: VII - Saving useful run statistics
title: Saving useful run statistics
description: Understand how to save statistics about an Actor's run, what types of statistics you can save, and why you might want to save them for a large-scale scraper.
sidebar_position: 6.7
sidebar_label: VII - Saving useful run statistics
slug: /expert-scraping-with-apify/saving-useful-stats
---

@@ -1,7 +1,8 @@
---
title: V - Handling migrations
title: Handling migrations
description: Get real-world experience of maintaining a stateful object stored in memory, which will be persisted through migrations and even graceful aborts.
sidebar_position: 5
sidebar_label: V - Handling migrations
slug: /expert-scraping-with-apify/solutions/handling-migrations
---

@@ -1,12 +1,19 @@
---
title: I - Integrating webhooks
title: Integrating webhooks
description: Learn how to integrate webhooks into your Actors. Webhooks are a super powerful tool, and can be used to do almost anything!
sidebar_position: 1
sidebar_label: I - Integrating webhooks
slug: /expert-scraping-with-apify/solutions/integrating-webhooks
---

**Learn how to integrate webhooks into your Actors. Webhooks are a super powerful tool, and can be used to do almost anything!**

:::caution Updates coming

This lesson is subject to change because it currently relies on code from our archived **Web scraping basics for JavaScript devs** course. For now you can still access the archived course, but we plan to completely retire it in a few months. This lesson will be updated to remove the dependency.

:::

---

In this lesson we'll be writing a new Actor and integrating it with our beloved Amazon scraping Actor. First, we'll navigate to the same directory where our **demo-actor** folder lives, and run `apify create filter-actor` _(once again, you can name the Actor whatever you want, but for this lesson, we'll be calling the new Actor **filter-actor**)_. When prompted about the programming language, select **JavaScript**:
@@ -1,7 +1,8 @@
---
title: II - Managing source
title: Managing source
description: View in-depth answers for all three of the quiz questions that were provided in the corresponding lesson about managing source code.
sidebar_position: 2
sidebar_label: II - Managing source
slug: /expert-scraping-with-apify/solutions/managing-source
---

@@ -1,7 +1,8 @@
---
title: VI - Rotating proxies/sessions
title: Rotating proxies/sessions
description: Learn firsthand how to rotate proxies and sessions in order to avoid the majority of the most common anti-scraping protections.
sidebar_position: 6
sidebar_label: VI - Rotating proxies/sessions
slug: /expert-scraping-with-apify/solutions/rotating-proxies
---

@@ -1,7 +1,8 @@
---
title: VII - Saving run stats
title: Saving run stats
description: Implement the saving of general statistics about an Actor's run, as well as adding request-specific statistics to dataset items.
sidebar_position: 7
sidebar_label: VII - Saving run stats
slug: /expert-scraping-with-apify/solutions/saving-stats
---

@@ -1,7 +1,8 @@
---
title: IV - Using the Apify API & JavaScript client
title: Using the Apify API & JavaScript client
description: Learn how to interact with the Apify API directly through the well-documented RESTful routes, or by using the proprietary Apify JavaScript client.
sidebar_position: 4
sidebar_label: IV - Using the Apify API & JavaScript client
slug: /expert-scraping-with-apify/solutions/using-api-and-client
---

@@ -1,7 +1,8 @@
---
title: III - Using storage & creating tasks
title: Using storage & creating tasks
description: Get quiz answers and explanations for the lesson about using storage and creating tasks on the Apify platform.
sidebar_position: 3
sidebar_label: III - Using storage & creating tasks
slug: /expert-scraping-with-apify/solutions/using-storage-creating-tasks
---

@@ -1,7 +1,8 @@
---
title: III - Tasks & storage
title: Tasks & storage
description: Understand how to save the configurations for Actors with Actor tasks. Also, learn about storage and the different types Apify offers.
sidebar_position: 6.3
sidebar_label: III - Tasks & storage
slug: /expert-scraping-with-apify/tasks-and-storage
---

@@ -69,8 +69,6 @@ try {
}
```

Read more information about logging and error handling in our developer [best practices](../../webscraping/scraping_basics_javascript/best_practices.md) section.

### Saving snapshots {#saving-snapshots}

By snapshots, we mean **screenshots** if you use a [browser with Puppeteer/Playwright](../../webscraping/puppeteer_playwright/index.md) and HTML saved into a [key-value store](https://crawlee.dev/api/core/class/KeyValueStore) that you can display in your own browser. Snapshots are useful throughout your code but especially important in error handling.
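
As a minimal sketch of that idea (the helper name and key naming are illustrative, and it assumes it runs inside an initialized Actor with a Playwright or Puppeteer `page` at hand):

```js
import { Actor } from 'apify';

// Save a screenshot and the raw HTML of the current page into the default
// key-value store, so a problematic page can be inspected later in a browser.
async function saveSnapshot(page, key) {
    const screenshot = await page.screenshot({ fullPage: true });
    await Actor.setValue(`${key}.png`, screenshot, { contentType: 'image/png' });
    await Actor.setValue(`${key}.html`, await page.content(), { contentType: 'text/html' });
}
```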