Commit 8dea575

fix: use the new <Details /> component
I'd prefer using plain HTML, but I'm doing this to prevent the error "Expected component `Details` to be defined: you likely forgot to import, pass, or provide it". As of now, the component provides a worse user experience than plain HTML, but that's a separate issue: #1199
1 parent 553c8c9 commit 8dea575
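
For reference, a minimal sketch of the pattern this commit applies in each lesson file — the slug and placeholder text here are illustrative, only the `import Details from '@theme/Details';` line and the `<Details>`/`<summary>` markup come from the diffs below:

```mdx
---
slug: /scraping-basics-python/example-lesson
---

import Details from '@theme/Details';

Exercise text goes here.

<Details>
  <summary>Solution</summary>

  Solution content goes here.
</Details>
```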

File tree: 4 files changed, 26 additions and 22 deletions

- sources/academy/webscraping/scraping_basics_python/04_downloading_html.md
- sources/academy/webscraping/scraping_basics_python/05_parsing_html.md
- sources/academy/webscraping/scraping_basics_python/06_locating_elements.md
- sources/academy/webscraping/scraping_basics_python/07_extracting_data.md


sources/academy/webscraping/scraping_basics_python/04_downloading_html.md

Lines changed: 7 additions & 6 deletions
@@ -7,6 +7,7 @@ slug: /scraping-basics-python/downloading-html
 ---
 
 import Exercises from './_exercises.mdx';
+import Details from '@theme/Details';
 
 **In this lesson we'll start building a Python application for watching prices. As a first step, we'll use the HTTPX library to download HTML code of a product listing page.**
 
@@ -148,7 +149,7 @@ Download HTML of a product listing page, but this time from a real world e-comme
 https://www.amazon.com/s?k=darth+vader
 ```
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -161,7 +162,7 @@ https://www.amazon.com/s?k=darth+vader
 ```
 
 If you get `Server error '503 Service Unavailable'`, that's just Amazon's anti-scraping protections. You can learn about how to overcome those in our [Anti-scraping protections](../anti_scraping/index.md) course.
-</details>
+</Details>
 
 ### Save downloaded HTML as a file
 
@@ -171,7 +172,7 @@ Download HTML, then save it on your disk as a `products.html` file. You can use
 https://warehouse-theme-metal.myshopify.com/collections/sales
 ```
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 Right in your Terminal or Command Prompt, you can create files by _redirecting output_ of command line programs:
@@ -192,7 +193,7 @@ https://warehouse-theme-metal.myshopify.com/collections/sales
 Path("products.html").write_text(response.text)
 ```
 
-</details>
+</Details>
 
 ### Download an image as a file
 
@@ -202,7 +203,7 @@ Download a product image, then save it on your disk as a file. While HTML is _te
 https://warehouse-theme-metal.myshopify.com/cdn/shop/products/sonyxbr55front_f72cc8ff-fcd6-4141-b9cc-e1320f867785.jpg
 ```
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 Python offers several ways how to create files. The solution below uses [pathlib](https://docs.python.org/3/library/pathlib.html):
@@ -217,4 +218,4 @@ https://warehouse-theme-metal.myshopify.com/cdn/shop/products/sonyxbr55front_f72
 Path("tv.jpg").write_bytes(response.content)
 ```
 
-</details>
+</Details>

sources/academy/webscraping/scraping_basics_python/05_parsing_html.md

Lines changed: 5 additions & 4 deletions
@@ -7,6 +7,7 @@ slug: /scraping-basics-python/parsing-html
 ---
 
 import Exercises from './_exercises.mdx';
+import Details from '@theme/Details';
 
 **In this lesson we'll look for products in the downloaded HTML. We'll use BeautifulSoup to turn the HTML into objects which we can work with in our Python program.**
 
@@ -120,7 +121,7 @@ Print a total count of F1 teams listed on this page:
 https://www.formula1.com/en/teams
 ```
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -136,13 +137,13 @@ https://www.formula1.com/en/teams
 print(len(soup.select(".outline")))
 ```
 
-</details>
+</Details>
 
 ### Scrape F1 drivers
 
 Use the same URL as in the previous exercise, but this time print a total count of F1 drivers.
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -158,4 +159,4 @@ Use the same URL as in the previous exercise, but this time print a total count
 print(len(soup.select(".f1-grid")))
 ```
 
-</details>
+</Details>

sources/academy/webscraping/scraping_basics_python/06_locating_elements.md

Lines changed: 7 additions & 6 deletions
@@ -7,6 +7,7 @@ slug: /scraping-basics-python/locating-elements
 ---
 
 import Exercises from './_exercises.mdx';
+import Details from '@theme/Details';
 
 **In this lesson we'll locate product data in the downloaded HTML. We'll use BeautifulSoup to find those HTML elements which contain details about each product, such as title or price.**
 
@@ -214,7 +215,7 @@ Botswana
 ...
 ```
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -239,7 +240,7 @@ Botswana
 
 Because some rows contain [table headers](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th), we skip processing a row if `table_row.select("td")` doesn't find any [table data](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td) cells.
 
-</details>
+</Details>
 
 ### Use CSS selectors to their max
 
@@ -248,7 +249,7 @@ Simplify the code from previous exercise. Use a single for loop and a single CSS
 - [Descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator)
 - [`:nth-child()` pseudo-class](https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-child)
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -266,7 +267,7 @@ Simplify the code from previous exercise. Use a single for loop and a single CSS
 print(name_cell.select_one("a").text)
 ```
 
-</details>
+</Details>
 
 ### Scrape F1 news
 
@@ -285,7 +286,7 @@ Max Verstappen wins Canadian Grand Prix: F1 – as it happened
 ...
 ```
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -303,4 +304,4 @@ Max Verstappen wins Canadian Grand Prix: F1 – as it happened
 print(title.text)
 ```
 
-</details>
+</Details>

sources/academy/webscraping/scraping_basics_python/07_extracting_data.md

Lines changed: 7 additions & 6 deletions
@@ -7,6 +7,7 @@ slug: /scraping-basics-python/extracting-data
 ---
 
 import Exercises from './_exercises.mdx';
+import Details from '@theme/Details';
 
 **In this lesson we'll finish extracting product data from the downloaded HTML. With help of basic string manipulation we'll focus on cleaning and correctly representing the product price.**
 
@@ -224,7 +225,7 @@ Denon AH-C720 In-Ear Headphones 236
 ...
 ```
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -259,13 +260,13 @@ Denon AH-C720 In-Ear Headphones 236
 print(title, units)
 ```
 
-</details>
+</Details>
 
 ### Use regular expressions
 
 Simplify the code from previous exercise. Use [regular expressions](https://docs.python.org/3/library/re.html) to parse the number of units. You can match digits using a range like `[0-9]` or by a special sequence `\d`. To match more characters of the same type you can use `+`.
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -292,7 +293,7 @@ Simplify the code from previous exercise. Use [regular expressions](https://docs
 print(title, units)
 ```
 
-</details>
+</Details>
 
 ### Scrape publish dates of F1 news
 
@@ -318,7 +319,7 @@ Hints:
 - In Python you can create `datetime` objects using `datetime.fromisoformat()`, a [built-in method for parsing ISO 8601 strings](https://docs.python.org/3/library/datetime.html#datetime.datetime.fromisoformat).
 - To get just the date part, you can call `.date()` on any `datetime` object.
 
-<details>
+<Details>
 <summary>Solution</summary>
 
 ```py
@@ -343,4 +344,4 @@ Hints:
 print(title, published_on)
 ```
 
-</details>
+</Details>
