Merged
2 changes: 1 addition & 1 deletion apify-docs-theme/src/theme/MDXComponents/Details.js
@@ -6,7 +6,7 @@ export default function MDXDetails(props) {
// Split summary item from the rest to pass it as a separate prop to the
// Details theme component
const summary = items.find(
-(item) => React.isValidElement(item) && item.props?.mdxType === 'summary',
+(item) => React.isValidElement(item) && item.type === 'summary',
);
const children = <>{items.filter((item) => item !== summary)}</>;
return (
2 changes: 1 addition & 1 deletion apify-docs-theme/src/theme/MDXComponents/index.js
@@ -17,7 +17,7 @@ const MDXComponents = {
code: MDXCode,
a: MDXA,
pre: MDXPre,
-details: MDXDetails,
+Details: MDXDetails,
ul: MDXUl,
img: MDXImg,
h1: (props) => <MDXHeading as="h1" {...props} />,
@@ -7,7 +7,6 @@ slug: /scraping-basics-python/downloading-html
---

import Exercises from './_exercises.mdx';
-import Details from '@theme/Details';

**In this lesson we'll start building a Python application for watching prices. As a first step, we'll use the HTTPX library to download the HTML code of a product listing page.**

@@ -149,7 +148,7 @@ Download HTML of a product listing page, but this time from a real world e-comme
https://www.amazon.com/s?k=darth+vader
```

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -162,7 +161,7 @@ https://www.amazon.com/s?k=darth+vader
```

If you get `Server error '503 Service Unavailable'`, that's just Amazon's anti-scraping protection. You can learn how to overcome such protections in our [Anti-scraping protections](../anti_scraping/index.md) course.
-</Details>
+</details>
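
A common way to soften intermittent 503 responses is to retry the request a few times before giving up. The sketch below uses a hypothetical `get_with_retries` helper (not part of the course code) that takes the fetch function as a parameter, so the retry logic can be shown without touching the network; a real scraper might instead use HTTPX's transport-level retries or a dedicated retry library:

```python
import time

def get_with_retries(fetch, attempts=3, delay=1.0):
    """Call fetch() until it returns a non-503 status or attempts run out."""
    for attempt in range(attempts):
        status, body = fetch()
        if status != 503:
            return status, body
        if attempt < attempts - 1:
            time.sleep(delay)  # back off briefly before the next try
    return status, body

# Stand-in fetch function that fails twice, then succeeds:
responses = iter([(503, ""), (503, ""), (200, "<html>...</html>")])
status, body = get_with_retries(lambda: next(responses), delay=0)
print(status)  # → 200
```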

### Save downloaded HTML as a file

@@ -172,7 +171,7 @@ Download HTML, then save it on your disk as a `products.html` file. You can use
https://warehouse-theme-metal.myshopify.com/collections/sales
```

-<Details>
+<details>
<summary>Solution</summary>

Right in your Terminal or Command Prompt, you can create files by _redirecting the output_ of command-line programs:
@@ -193,7 +192,7 @@ https://warehouse-theme-metal.myshopify.com/collections/sales
Path("products.html").write_text(response.text)
```

-</Details>
+</details>

### Download an image as a file

@@ -203,7 +202,7 @@ Download a product image, then save it on your disk as a file. While HTML is _te
https://warehouse-theme-metal.myshopify.com/cdn/shop/products/sonyxbr55front_f72cc8ff-fcd6-4141-b9cc-e1320f867785.jpg
```

-<Details>
+<details>
<summary>Solution</summary>

Python offers several ways to create files. The solution below uses [pathlib](https://docs.python.org/3/library/pathlib.html):
@@ -218,4 +217,4 @@ https://warehouse-theme-metal.myshopify.com/cdn/shop/products/sonyxbr55front_f72
Path("tv.jpg").write_bytes(response.content)
```

-</Details>
+</details>
@@ -7,7 +7,6 @@ slug: /scraping-basics-python/parsing-html
---

import Exercises from './_exercises.mdx';
-import Details from '@theme/Details';

**In this lesson we'll look for products in the downloaded HTML. We'll use BeautifulSoup to turn the HTML into objects which we can work with in our Python program.**

@@ -121,7 +120,7 @@ Print a total count of F1 teams listed on this page:
https://www.formula1.com/en/teams
```

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -137,13 +136,13 @@ https://www.formula1.com/en/teams
print(len(soup.select(".outline")))
```

-</Details>
+</details>

### Scrape F1 drivers

Use the same URL as in the previous exercise, but this time print a total count of F1 drivers.

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -159,4 +158,4 @@ Use the same URL as in the previous exercise, but this time print a total count
print(len(soup.select(".f1-grid")))
```

-</Details>
+</details>
@@ -7,7 +7,6 @@ slug: /scraping-basics-python/locating-elements
---

import Exercises from './_exercises.mdx';
-import Details from '@theme/Details';

**In this lesson we'll locate product data in the downloaded HTML. We'll use BeautifulSoup to find those HTML elements which contain details about each product, such as title or price.**

@@ -215,7 +214,7 @@ Botswana
...
```

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -240,7 +239,7 @@ Botswana

Because some rows contain [table headers](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/th), we skip processing a row if `table_row.select("td")` doesn't find any [table data](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/td) cells.

-</Details>
+</details>

### Use CSS selectors to their max

@@ -249,7 +248,7 @@ Simplify the code from previous exercise. Use a single for loop and a single CSS
- [Descendant combinator](https://developer.mozilla.org/en-US/docs/Web/CSS/Descendant_combinator)
- [`:nth-child()` pseudo-class](https://developer.mozilla.org/en-US/docs/Web/CSS/:nth-child)

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -267,7 +266,7 @@ Simplify the code from previous exercise. Use a single for loop and a single CSS
print(name_cell.select_one("a").text)
```

-</Details>
+</details>

### Scrape F1 news

@@ -286,7 +285,7 @@ Max Verstappen wins Canadian Grand Prix: F1 – as it happened
...
```

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -304,4 +303,4 @@ Max Verstappen wins Canadian Grand Prix: F1 – as it happened
print(title.text)
```

-</Details>
+</details>
@@ -7,7 +7,6 @@ slug: /scraping-basics-python/extracting-data
---

import Exercises from './_exercises.mdx';
-import Details from '@theme/Details';

**In this lesson we'll finish extracting product data from the downloaded HTML. With the help of basic string manipulation, we'll focus on cleaning and correctly representing the product price.**

@@ -225,7 +224,7 @@ Denon AH-C720 In-Ear Headphones 236
...
```

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -260,13 +259,13 @@ Denon AH-C720 In-Ear Headphones 236
print(title, units)
```

-</Details>
+</details>

### Use regular expressions

Simplify the code from the previous exercise. Use [regular expressions](https://docs.python.org/3/library/re.html) to parse the number of units. You can match digits with a range like `[0-9]` or with the special sequence `\d`. To match one or more characters of the same type, add `+`.

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -293,7 +292,7 @@ Simplify the code from previous exercise. Use [regular expressions](https://docs
print(title, units)
```

-</Details>
+</details>
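
To make the regex hints above concrete, here is a standalone sketch (the example strings are made up, not the shop's actual markup) showing how `re.search` with `\d+` pulls the number of units out of a string:

```python
import re

def parse_units(text):
    # Match the first run of one or more digits, e.g. "24" in "24 units"
    match = re.search(r"\d+", text)
    # Fall back to 0 when the string contains no digits at all
    return int(match.group()) if match else 0

print(parse_units("24 units"))  # → 24
print(parse_units("Sold out"))  # → 0
```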

### Scrape publish dates of F1 news

@@ -319,7 +318,7 @@ Hints:
- In Python you can create `datetime` objects using `datetime.fromisoformat()`, a [built-in method for parsing ISO 8601 strings](https://docs.python.org/3/library/datetime.html#datetime.datetime.fromisoformat).
- To get just the date part, you can call `.date()` on any `datetime` object.
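
The two hints can be tried in isolation with a made-up ISO 8601 timestamp (the value an actual `<time>` element carries will differ):

```python
from datetime import datetime

# Hypothetical attribute value, e.g. from a <time datetime="..."> element
iso_timestamp = "2024-06-09T14:30:00+00:00"

published_at = datetime.fromisoformat(iso_timestamp)
published_on = published_at.date()  # keep just the date part
print(published_on)  # → 2024-06-09
```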

-<Details>
+<details>
<summary>Solution</summary>

```py
@@ -344,4 +343,4 @@ Hints:
print(title, published_on)
```

-</Details>
+</details>
6 changes: 2 additions & 4 deletions sources/platform/console/index.md
@@ -10,8 +10,6 @@ slug: /console

---

-import Details from '@theme/Details';

## Sign-up

To use Apify Console, you first need to create an account. To create one, go to the [sign-up page](https://console.apify.com/sign-up).
@@ -95,7 +93,7 @@ Use the side menu to navigate other parts of Apify Console easily.

You can also navigate Apify Console via keyboard shortcuts.

-<Details>
+<details>
<summary>Keyboard Shortcuts</summary>

|Shortcut| Tab |
@@ -113,7 +111,7 @@ You can also navigate Apify Console via keyboard shortcuts.
|Settings| GS |
|Billing| GB |

-</Details>
+</details>

| Tab name | Description |
|:---|:---|