sources/academy/webscraping/scraping_basics_python/index.md
44 additions & 17 deletions
@@ -12,30 +12,57 @@ import DocCardList from '@theme/DocCardList';
12
12
13
13
---
14
14
15
-
## What you'll learn
15
+
:::danger Work in progress
16
16
17
-
- Something
18
-
- Something
19
-
- Something
20
-
- Something
17
+
This course is incomplete. As we work on adding new lessons, we would love to hear your feedback. You can comment right here under each page or [file a GitHub Issue](https://github.com/apify/apify-docs/issues) to discuss a problem.
21
18
22
-
## Who this course is for
19
+
:::
23
20
24
-
- Somebody
25
-
- Somebody
26
-
- Somebody
21
+
## About this course
27
22
28
-
##Requirements
23
+
### What you'll learn
29
24
30
-
- Must have
31
-
- Must have
32
-
- Must have
33
-
- Must have
34
-
- Must have
25
+
- Inspecting pages using browser DevTools
26
+
- Downloading web pages using the HTTPX library
27
+
- Extracting data from web pages using the Beautiful Soup library
28
+
- Saving extracted data in various formats, e.g. CSV, which MS Excel or Google Sheets can open
29
+
- Following links programmatically (crawling)
30
+
- Saving time and effort with frameworks, such as Scrapy, and scraping platforms, such as Apify
Anyone with basic knowledge of developing programs in Python who wants to start with web scraping can take this course. The course does not expect you to have any prior knowledge of web technologies or scraping.
35
+
36
+
### Requirements
37
+
38
+
- A macOS, Linux, or Windows machine with a web browser and Python installed
- Comfortable using the Python standard library, virtual environments, and installing dependencies with `pip`
41
+
- Comfortable using command line tools (Terminal/Command Prompt)
42
+
43
+
## You may want to know
44
+
45
+
### Why learn scraping
46
+
47
+
The internet is full of useful data, but most of it isn't offered in a structured form that's easy to process programmatically. That's why you need scraping, a set of approaches for downloading websites and extracting data from them.
48
+
49
+
Scraper development is also a fun and challenging way to learn web development and web technologies, and to understand the internet. You will reverse-engineer websites and understand how they work internally, what technologies they use, and how they communicate with their servers. You will also master your chosen programming language and core programming concepts. Understanding web scraping gives you a head start in learning web technologies: HTML, CSS, JavaScript, frontend frameworks (such as React or Next.js), HTTP, REST APIs, GraphQL APIs, and more.
50
+
51
+
### Why build your own scrapers
52
+
53
+
Point-and-click or no-code scraping solutions take you only so far. While simple to use, they're not flexible or optimized enough to deal with advanced cases. Those can be tackled only by scrapers, custom-built programs specializing in mining data from the internet. And unlike ready-made solutions, your own scraper programs can always be fine-tuned to do more, less, or the same, but faster or cheaper.
54
+
55
+
### Why become a scraper dev
56
+
57
+
As a scraper developer, you are not limited by whether certain data is available programmatically through an official API. The whole web is your API. What can you do with scraping?
58
+
59
+
- Improve your productivity by building small tools, such as your own real estate or sneaker watchdog.
60
+
- Companies can hire you to build custom scrapers mining data important for their business.
61
+
- You can publish your scrapers to a platform, such as the [Apify Store](https://apify.com/store), and let others pay you rent for using them.
62
+
63
+
### Why learn with Apify
64
+
65
+
We are [Apify](https://apify.com), a web scraping and automation platform, but we built this course on top of open source technologies. The skills you will learn are applicable to any scraping project, and you'll be able to run your scrapers on any computer. We will show you how scraping platforms can simplify your life, but those lessons are optional and they're designed to fit into our [free tier](https://apify.com/pricing).