Commit 031839d

Lukas comments
1 parent 677e1ca commit 031839d

1 file changed: +9 -1 lines changed

content/academy/api_scraping/general_api_scraping/handling_pagination.md

Lines changed: 9 additions & 1 deletion
@@ -36,6 +36,8 @@ If we were to make a request with the **limit** set to **5** and the **offset**
3636

3737
Cursor-based pagination is becoming more and more common. As with offset-based pagination, a **limit** parameter is usually present; however, a **cursor** parameter takes the place of **offset**. A cursor is just a marker (sometimes a token, a date, or just a number) for an item in the dataset. The API returns only the records that come after the item matching the provided **cursor**.
3838

39+
One of the most painful things about scraping APIs with cursor pagination is that you can't skip to, for example, the 5th page. You have to paginate through each page one by one.
40+
3941
> Note: SoundCloud [migrated](https://developers.soundcloud.com/blog/pagination-updates-on-our-api) over to using cursor-based pagination; however, they did not change the parameter name from **offset** to **cursor**. Always be on the lookout for this type of stuff!
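
To make the cursor idea concrete, here is a minimal sketch that pages through a mock API. The dataset, the `fetchPage` function, and the `nextCursor` property are assumptions made for illustration; a real API will name and shape these differently.

```JavaScript
// Hypothetical in-memory dataset standing in for a real API.
const DATASET = Array.from({ length: 12 }, (_, i) => ({ id: i + 1 }));

// Mock endpoint (an assumption, not a real API): returns up to `limit` records
// that come after the record whose id matches `cursor`, plus the cursor to use
// for the next page (or null when there are no more records).
const fetchPage = async ({ cursor = null, limit = 5 }) => {
    const start = cursor === null ? 0 : DATASET.findIndex((item) => item.id === cursor) + 1;
    const items = DATASET.slice(start, start + limit);
    const hasMore = start + limit < DATASET.length;
    return { items, nextCursor: hasMore ? items[items.length - 1].id : null };
};

// Walks the pages one by one. Each request needs the cursor returned by the
// previous response, which is why jumping straight to a later page is not possible.
const collectAll = async (limit = 5) => {
    const results = [];
    let cursor = null;
    do {
        const page = await fetchPage({ cursor, limit });
        results.push(...page.items);
        cursor = page.nextCursor;
    } while (cursor !== null);
    return results;
};
```

Calling `await collectAll(5)` on this mock makes three sequential requests, because each cursor is only known once the previous page has arrived.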
4042
4143
## [](#using-next-page) Using "next page"
@@ -134,7 +136,9 @@ while (items.flat().length < 100) {
134136
}
135137
```
136138
137-
All that's left to do now is flesh out this `while` loop with pagination logic and finally return the **items** array once the loop has finished.:
139+
All that's left to do now is flesh out this `while` loop with pagination logic and finally return the **items** array once the loop has finished.
140+
141+
> Note that it's better to add requests to a request queue rather than processing them in memory. The crawlers offered by the [Apify SDK](https://sdk.apify.com) provide this functionality out of the box.
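
The queue idea in the note above can be sketched roughly as follows. `SimpleRequestQueue` and `processQueue` are hypothetical names invented for this illustration and are not the Apify SDK's API. This toy version also still lives in memory for brevity, whereas a real request queue would typically persist its state.

```JavaScript
// Hypothetical helper for illustration only; this is NOT the Apify SDK's API.
class SimpleRequestQueue {
    constructor() {
        this.pending = []; // URLs waiting to be processed
        this.seen = new Set(); // every URL ever enqueued, used for deduplication
    }

    // Adds a URL unless it was enqueued before. Returns true if it was added.
    add(url) {
        if (this.seen.has(url)) return false;
        this.seen.add(url);
        this.pending.push(url);
        return true;
    }

    // Removes and returns the oldest pending URL, or null when the queue is empty.
    next() {
        return this.pending.length > 0 ? this.pending.shift() : null;
    }
}

// Drains the queue, letting the handler enqueue follow-up pages as it goes.
// Returns the number of URLs that were handled.
const processQueue = async (queue, handler) => {
    let handled = 0;
    let url = queue.next();
    while (url !== null) {
        await handler(url, queue);
        handled++;
        url = queue.next();
    }
    return handled;
};
```

With this shape, the handler for one page can call `queue.add()` with the URL of the next page, so the pagination logic only ever decides what to enqueue rather than holding all in-flight work itself.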
138142
139143
```JavaScript
140144
// index.js
@@ -188,6 +192,10 @@ Here's what the output of this code looks like:
188192
105
189193
```
190194
195+
## [](#final-note) Final note
196+
197+
Sometimes, APIs have limited pagination: they cap either the total number of results reachable through a set of pages or the number of pages you can request. To learn how to handle these cases, take a look at [this short article](https://docs.apify.com/tutorials/scrape-paginated-sites).
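
One possible workaround, sketched here under assumptions rather than taken from the linked article, is to split a query into narrower filter ranges until each range fits under the cap, then paginate each range separately. The price data, the `MAX_RESULTS` cap, and both function names are hypothetical.

```JavaScript
// Hypothetical cap: the mock API exposes at most this many results per query.
const MAX_RESULTS = 100;

// Hypothetical dataset: 450 items with prices 0..449.
const PRICES = Array.from({ length: 450 }, (_, i) => i);

// Stand-in for asking the API how many results match a price filter [min, max).
const countInRange = (min, max) => PRICES.filter((p) => p >= min && p < max).length;

// Recursively halves a range until every sub-range fits under the cap
// (or cannot be split any further).
const splitRange = (min, max) => {
    if (countInRange(min, max) <= MAX_RESULTS || max - min <= 1) return [[min, max]];
    const mid = Math.floor((min + max) / 2);
    return [...splitRange(min, mid), ...splitRange(mid, max)];
};
```

Each `[min, max)` pair returned by `splitRange(0, 450)` can then be paginated on its own, since none of them exceeds the assumed cap.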
198+
191199
## [](#next) Next up
192200
193201
<!-- In this lesson, you learned about how to use API parameters and properties returned in an API response to paginate through results. [Next up](link api_scraping/general_api_scraping/using_api_filters.md), you'll gain a solid understanding of using API filtering parameters. -->
