
Commit 1056e02 (1 parent: b8fee01)

Switch proxies folder to mitigation instead

13 files changed: +39 −26 lines
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+---
+title: Mitigation
+description: After learning about the various different anti-scraping techniques websites use, learn how to mitigate them with a few different techniques.
+menuWeight: 4.2
+paths:
+    - anti-scraping/mitigation
+---
+
+# [](#anti-scraping-mitigation) Anti-scraping mitigation
+
+In the [techniques]({{@link anti_scraping/techniques.md}}) section of this course, you learned about multiple methods websites use to prevent bots from accessing their content. This **Mitigation** section will be all about how to circumvent these protections using various different techniques.
+
+<!-- Here there should -->
+
+## [](#next) Next up
+
+In the [first lesson]({{@link anti_scraping/mitigation/proxies.md}}) of this section, you'll be learning about what proxies are and how to use them in your own crawler.

content/academy/anti_scraping/proxies.md renamed to content/academy/anti_scraping/mitigation/proxies.md

Lines changed: 3 additions & 3 deletions
@@ -1,9 +1,9 @@
 ---
 title: Proxies
 description: Learn all about proxies, how they work, and how they can be leveraged in a scraper to avoid blocking and other anti-scraping tactics.
-menuWeight: 4.2
+menuWeight: 1
 paths:
-    - anti-scraping/proxies
+    - anti-scraping/mitigation/proxies
 ---

 # [](#about-proxies) Proxies
@@ -45,4 +45,4 @@ Web scrapers can implement a method called "proxy rotation" to **rotate** the IP

 ## [](#next) Next up

-This module's first lesson will be teaching you how to configure your crawler in the Apify SDK to use and automatically rotate proxies. [Let's get right into it!]({{@link anti_scraping/proxies/using_proxies.md}})
+Proxies are one of the most important things to understand when it comes to mitigating anti-scraping techniques in a scraper. Now that you're familiar with what they are, the next lesson will be teaching you how to configure your crawler in the Apify SDK to use and automatically rotate proxies. [Let's get right into it!]({{@link anti_scraping/mitigation/using_proxies.md}})
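The "proxy rotation" mentioned in this file can be sketched as a simple round-robin over a pool of proxy URLs. This is a hypothetical standalone helper for illustration only, not the Apify SDK's implementation (the SDK's `ProxyConfiguration` handles rotation for you):

```javascript
// Minimal round-robin proxy rotator — an illustrative sketch of the
// "proxy rotation" concept, not the Apify SDK's actual implementation.
class ProxyRotator {
  constructor(proxyUrls) {
    if (!proxyUrls || proxyUrls.length === 0) {
      throw new Error('At least one proxy URL is required');
    }
    this.proxyUrls = proxyUrls;
    this.nextIndex = 0;
  }

  // Return the next proxy URL, cycling back to the start of the pool.
  nextUrl() {
    const url = this.proxyUrls[this.nextIndex];
    this.nextIndex = (this.nextIndex + 1) % this.proxyUrls.length;
    return url;
  }
}

// Example usage (placeholder proxy URLs):
const rotator = new ProxyRotator([
  'http://proxy-1.example.com:8000',
  'http://proxy-2.example.com:8000',
]);

console.log(rotator.nextUrl()); // http://proxy-1.example.com:8000
console.log(rotator.nextUrl()); // http://proxy-2.example.com:8000
console.log(rotator.nextUrl()); // wraps around: http://proxy-1.example.com:8000
```

Because each request exits through a different IP, per-IP rate limits and blocks are spread across the whole pool instead of hitting a single address.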

content/academy/anti_scraping/proxies/using_proxies.md renamed to content/academy/anti_scraping/mitigation/using_proxies.md

Lines changed: 5 additions & 9 deletions
@@ -1,9 +1,9 @@
 ---
 title: Using proxies
 description: Learn how to use and automagically rotate proxies in your scrapers by using the Apify SDK, and a bit about how to easily obtain pools of proxies.
-menuWeight: 1
+menuWeight: 2
 paths:
-    - anti-scraping/proxies/using-proxies
+    - anti-scraping/mitigation/using-proxies
 ---

 # [](#using-proxies) Using proxies
@@ -120,7 +120,7 @@ const crawler = new Apify.CheerioCrawler({

 After modifying your code to log `proxyInfo` to the console and running the scraper, you're going to see some logs which look like this:

-![proxyInfo being logged by the scraper]({{@asset anti_scraping/proxies/images/proxy-info-logs.webp}})
+![proxyInfo being logged by the scraper]({{@asset anti_scraping/mitigation/images/proxy-info-logs.webp}})

 These logs confirm that our proxies are being used and rotated successfully by the Apify SDK, and can also be used to debug slow or broken proxies.

@@ -137,10 +137,6 @@ const proxyConfiguration = await Apify.createProxyConfiguration({

 Notice that we didn't provide it a list of proxy URLs. This is because the `SHADER` group already serves as our proxy pool (courtesy of Apify Proxy).

-## More lessons to come
+## [](#next) Next up

-That's it for the proxy course for now, but be on the lookout for future lessons! We release lessons as we write them, and will be updating the Academy frequently, so be sure to check back every once in a while for new content! Alternatively, you can subscribe to our mailing list to get periodic updates on the Academy, as well as what Apify is up to.
-
-<!-- ## [](#next) Next up
-
-Smth -->
+That's it for the **Mitigation** course for now, but be on the lookout for future lessons! We release lessons as we write them, and will be updating the Academy frequently, so be sure to check back every once in a while for new content! Alternatively, you can subscribe to our mailing list to get periodic updates on the Academy, as well as what Apify is up to.
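As a rough model of the `proxyInfo` logs this lesson refers to, each rotated proxy URL can be broken down into the fields you would expect to see. The helper and field names below are hypothetical (a simplified sketch, not the SDK's real `ProxyInfo` class):

```javascript
// Hypothetical sketch: derive a ProxyInfo-like object from a proxy URL,
// similar in spirit to what gets logged as `proxyInfo`. The exact shape of
// the SDK's object differs; this is for illustration only.
function toProxyInfo(proxyUrl, sessionId = null) {
  const parsed = new URL(proxyUrl);
  return {
    url: proxyUrl,
    hostname: parsed.hostname,
    port: Number(parsed.port),
    username: parsed.username,
    sessionId, // which session (if any) is pinned to this proxy
  };
}

// Example with a placeholder proxy URL:
const info = toProxyInfo('http://user:pass@proxy.example.com:8000', 'session_1');
console.log(info.hostname); // proxy.example.com
console.log(info.port);     // 8000
```

Logging a structure like this per request is what makes it possible to spot which specific proxy in the pool is slow or broken.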

content/academy/anti_scraping/techniques/captchas.md

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ When you've hit a captcha, your first thought should not be how to programmatica

 Have you exhausted all of the possible options to make your scraper appear more human-like? Are you:

-- Using [proxies]({{@link anti_scraping/proxies.md}})?
+- Using [proxies]({{@link anti_scraping/mitigation/proxies.md}})?
 - Making the request with the proper [headers]({{@link concepts/http_headers.md}}) and [cookies]({{@link concepts/http_cookies.md}})?
 - Generating and using a custom [browser fingerprint]({{@link anti_scraping/techniques/fingerprinting.md}})?
 - Trying different general scraping methods (HTTP scraping, browser scraping)? If you are using browser scraping, have you tried using a different browser?
@@ -38,4 +38,4 @@ Another popular captcha is the [Geetest slider captcha](https://www.geetest.com/

 ## Wrap up

-In this course, you've learned about some of the most common (and some of the most advanced) anti-scraping techniques. Keep in mind that as the web (and technology in general) evolves, this section of the **Anti scraping** course will evolve as well. In the [next section]({{@link anti_scraping/proxies.md}}), we'll be discussing one of the most crucial parts of web scraping and web-automation: how to properly leverage proxies to avoid many of the anti-scraping techniques that were discussed in this section.
+In this course, you've learned about some of the most common (and some of the most advanced) anti-scraping techniques. Keep in mind that as the web (and technology in general) evolves, this section of the **Anti scraping** course will evolve as well. In the [next section]({{@link anti_scraping/mitigation.md}}), we'll be discussing how to mitigate the anti-scraping techniques you learned about in this section.

content/academy/anti_scraping/techniques/firewalls.md

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ Since there are multiple providers, it is essential to say that the challenges a

 ## [](#bypassing-firewalls) Bypassing web-application firewalls

-- Using [proxies]({{@link anti_scraping/proxies.md}}).
+- Using [proxies]({{@link anti_scraping/mitigation/proxies.md}}).
 - Mocking [headers]({{@link concepts/http_headers.md}}).
 - Overriding the browser's [fingerprint]({{@link anti_scraping/techniques/fingerprinting.md}}) (most effective).
 - Farming the [cookies]({{@link concepts/http_cookies.md}}) from a website with a headless browser, then using the farmed cookies to do HTTP based scraping (most performant).

content/academy/anti_scraping/techniques/geolocation.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ On targets which are just utilizing cookies and headers to identify the location

 The oldest (and still most common) way of geolocating is based on the IP address used to make the request. Sometimes, country-specific sites block themselves from being accessed from any other country (some Chinese, Indian, Israeli, and Japanese websites do this).

-[Proxies]({{@link anti_scraping/proxies.md}}) can be used in a scraper to bypass restrictions for make requests from a different location. Often times, proxies need to be used in combination with location-specific cookies/headers.
+[Proxies]({{@link anti_scraping/mitigation/proxies.md}}) can be used in a scraper to bypass restrictions and make requests from a different location. Oftentimes, proxies need to be used in combination with location-specific [cookies]({{@link concepts/http_cookies.md}})/[headers]({{@link concepts/http_headers.md}}).

 ## [](#override-emulate-geolocation) Override/emulate geolocation when using a browser-based scraper
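With Apify Proxy, the country a request appears to come from can be selected through the proxy username. The helper below is a sketch under the assumption that the documented `groups-…,country-…` username pattern and the `proxy.apify.com:8000` endpoint apply; check the Apify Proxy documentation for the exact syntax before relying on it:

```javascript
// Sketch: build an Apify Proxy URL that routes traffic through a given
// country. Username syntax and endpoint are assumptions based on Apify
// Proxy's documented pattern — verify against the current docs.
function countryProxyUrl(password, countryCode, group = 'RESIDENTIAL') {
  const username = `groups-${group},country-${countryCode.toUpperCase()}`;
  return `http://${username}:${password}@proxy.apify.com:8000`;
}

console.log(countryProxyUrl('<YOUR_PROXY_PASSWORD>', 'us'));
// http://groups-RESIDENTIAL,country-US:<YOUR_PROXY_PASSWORD>@proxy.apify.com:8000
```

Pairing a URL like this with the matching location-specific cookies/headers covers both IP-based and header-based geolocation checks.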

content/academy/anti_scraping/techniques/rate_limiting.md

Lines changed: 3 additions & 3 deletions
@@ -12,11 +12,11 @@ When crawling a website, a web scraping bot will typically send many more reques

 In the past, most websites had their own anti-scraping solutions, the most common of which was IP address rate-limiting. In recent years, the popularity of third-party specialized anti-scraping providers has dramatically increased, but a lot of websites still use rate-limiting to only allow a certain number of requests per second/minute/hour to be sent from a single IP; therefore, crawler requests have the potential of being blocked entirely quite quickly.

-In cases when a higher number of requests is expected for the crawler, using a [proxy]({{@link anti_scraping/proxies.md}}) and rotating the IPs is essential to let the crawler run as smoothly as possible and avoid being blocked.
+In cases when a higher number of requests is expected for the crawler, using a [proxy]({{@link anti_scraping/mitigation/proxies.md}}) and rotating the IPs is essential to let the crawler run as smoothly as possible and avoid being blocked.

 ## [](#dealing-with-rate-limiting) Dealing with rate limiting by rotating proxies/sessions

-The most popular and effective way of avoiding rate-limiting issues is by rotating [proxies]({{@link anti_scraping/proxies.md}}) after every **n** number of requests, which makes your scraper appear as if it is making requests from various different places. Since the majority of rate-limiting solutions are based on IP addresses, rotating IPs allows a scraper to make large amounts to a website without getting restricted.
+The most popular and effective way of avoiding rate-limiting issues is by rotating [proxies]({{@link anti_scraping/mitigation/proxies.md}}) after every **n** number of requests, which makes your scraper appear as if it is making requests from various different places. Since the majority of rate-limiting solutions are based on IP addresses, rotating IPs allows a scraper to make a large number of requests to a website without getting restricted.

 In the Apify SDK, proxies are automatically rotated for you when you use `proxyConfiguration` and a [**SessionPool**](https://sdk.apify.com/docs/api/session-pool) within a crawler. The SessionPool handles a lot of the nitty-gritty of proxy rotation, especially with [browser based crawlers]({{@link puppeteer_playwright.md}}), by retiring a browser instance after a certain number of requests have been sent from it in order to use a new proxy (a browser instance must be retired in order to use a new proxy).

@@ -44,7 +44,7 @@ const myCrawler = new Apify.PuppeteerCrawler({
 });
 ```

-> Take a look at the [**Using proxies**]({{@link anti_scraping/proxies/using_proxies.md}}) lesson to learn more about how to use proxies and rotate them in the Apify SDK.
+> Take a look at the [**Using proxies**]({{@link anti_scraping/mitigation/using_proxies.md}}) lesson to learn more about how to use proxies and rotate them in the Apify SDK.

 ### [](#configuring-session-pool) Configuring a session pool
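The "retire after a certain number of requests" behavior this lesson describes can be modeled with a tiny stand-in for a session pool. This is purely illustrative (the real SessionPool in the Apify SDK tracks errors, cookies, and much more):

```javascript
// Illustrative stand-in for a session pool: each session (think proxy +
// cookies + browser instance) is retired after maxUsageCount uses, forcing
// rotation to a fresh session — and therefore a new proxy.
class TinySessionPool {
  constructor(maxUsageCount = 3) {
    this.maxUsageCount = maxUsageCount;
    this.sessionCounter = 0;
    this.current = null;
  }

  getSession() {
    if (!this.current || this.current.usageCount >= this.maxUsageCount) {
      // Retire the exhausted session and create a fresh one.
      this.sessionCounter += 1;
      this.current = { id: `session_${this.sessionCounter}`, usageCount: 0 };
    }
    this.current.usageCount += 1;
    return this.current;
  }
}

// With maxUsageCount = 2, every two requests trigger a rotation:
const pool = new TinySessionPool(2);
const used = [];
for (let i = 0; i < 5; i++) used.push(pool.getSession().id);
console.log(used); // [ 'session_1', 'session_1', 'session_2', 'session_2', 'session_3' ]
```

Rotating at the session level rather than per request lets a scraper keep consistent cookies and fingerprint within a session while still spreading load across IPs.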

content/academy/expert_scraping_with_apify/apify_sdk.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ The SDK factors away and manages the hard parts of the scraping/automation devel
 - Request concurrency
 - Queueing requests
 - Data storage
-- Using and rotating [proxies]({{@link anti_scraping/proxies.md}})
+- Using and rotating [proxies]({{@link anti_scraping/mitigation/proxies.md}})
 - Puppeteer/Playwright setup overhead
 - Plus much more!
