Commit f4a99d2

fix: be specific about 'something'

1 parent 2162ca1


sources/academy/webscraping/web_scraping_for_beginners/crawling/index.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ In this section, we will take a look at moving between web pages, which we call
 
 ## How do you crawl? {#how-to-crawl}
 
-Crawling websites is a fairly straightforward process. We'll start by opening the first web page and extracting all the links (URLs) that lead to the other pages we want to visit. To do that, we'll use the skills learned in the [Basics of data extraction](../data_extraction/index.md) course. We'll add some extra filtering to make sure we only get the correct URLs. Then, we'll save those URLs, so in case something happens to our scraper, we won't have to extract them again. And, finally, we will visit those URLs one by one.
+Crawling websites is a fairly straightforward process. We'll start by opening the first web page and extracting all the links (URLs) that lead to the other pages we want to visit. To do that, we'll use the skills learned in the [Basics of data extraction](../data_extraction/index.md) course. We'll add some extra filtering to make sure we only get the correct URLs. Then, we'll save those URLs, so in case our scraper crashes with an error, we won't have to extract them again. And, finally, we will visit those URLs one by one.
 
 At any point, we can extract URLs, data, or both. Crawling can be separate from data extraction, but it's not a requirement and, in most projects, it's actually easier and faster to do both at the same time. To summarize, it goes like this:
 
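The changed paragraph describes a four-step flow: extract links, filter them, save them, then visit them one by one. As a rough illustration only (not code from the course itself), here is a minimal sketch of that flow in JavaScript, assuming Node.js 18+ run as an ES module and the cheerio package used in the data extraction lessons; the start URL and the `/product/` filter are hypothetical placeholders.

```javascript
// Minimal sketch of the crawling flow described in the changed paragraph.
// Assumes Node.js 18+ (built-in fetch), run as an ES module, plus cheerio.
import { writeFileSync } from 'node:fs';
import * as cheerio from 'cheerio';

const START_URL = 'https://example.com/products'; // hypothetical start page

// 1. Open the first page and extract all the links (URLs) from it.
const html = await (await fetch(START_URL)).text();
const $ = cheerio.load(html);

// 2. Filter the links to make sure we only keep the correct URLs.
const urls = [];
for (const link of $('a[href]').toArray()) {
    const url = new URL($(link).attr('href'), START_URL).href; // resolve relative links
    if (url.includes('/product/')) urls.push(url); // hypothetical filter
}

// 3. Save the URLs, so a crash doesn't force us to extract them again.
writeFileSync('urls.json', JSON.stringify(urls, null, 2));

// 4. Visit the saved URLs one by one.
for (const url of urls) {
    const pageHtml = await (await fetch(url)).text();
    // ...extract data from pageHtml with cheerio here...
}
```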
