Long Apify API calls can block crawler #1132

@Pijukatel

Description

BasicCrawler calls fetch_next_request, which under normal circumstances should fetch a request within milliseconds, but it sometimes freezes and can take tens of seconds.

Example of normal request processing, with injected timestamps for each task:

Image

Example of long wait for fetch_next_request:

Image

How to reproduce:
Use the crawlee branch with injected timestamps.
Create a BeautifulSoup crawler from the crawlee templates, push it to the Apify platform, and run it with the start page "https://warehouse-theme-metal.myshopify.com/" and max_requests_per_crawl=3000. The problem is not deterministic, and I had to do several runs to capture it. It is not a super rare glitch either, as it can be reproduced after a few attempts.
Example run: https://console.apify.com/actors/C0lWh1UCQvgdArp6R/runs/xsqRrdd2s1f6HRY9N#log

Either fetch_next_request is taking too much time or something is blocking the event loop.
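One way to tell these two cases apart might be to time the awaited call while a background task measures how late the event loop wakes it up. The following is only a minimal sketch using plain asyncio: `monitor_loop_lag`, `timed`, `diagnose`, and the two stand-in coroutines are hypothetical helpers, not part of crawlee or the Apify client. A long call with low loop lag would point at the call itself being slow, while high loop lag would suggest that something synchronous is blocking the loop.

```python
import asyncio
import time


async def monitor_loop_lag(interval: float, samples: list[float]) -> None:
    """Record how late each wake-up is; large values mean the loop was blocked."""
    loop = asyncio.get_running_loop()
    while True:
        start = loop.time()
        await asyncio.sleep(interval)
        samples.append(loop.time() - start - interval)


async def timed(awaitable):
    """Await something and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = await awaitable
    return result, time.perf_counter() - start


async def slow_but_non_blocking() -> str:
    # Hypothetical stand-in for a slow API call that yields to the loop.
    await asyncio.sleep(0.3)
    return "request"


async def blocking() -> str:
    # Hypothetical stand-in for synchronous work that blocks the loop.
    time.sleep(0.3)
    return "request"


async def diagnose(coro_fn, interval: float = 0.05):
    """Return (elapsed seconds of the call, worst loop lag seen meanwhile)."""
    samples: list[float] = []
    monitor = asyncio.create_task(monitor_loop_lag(interval, samples))
    await asyncio.sleep(0)  # let the monitor start its first sleep
    _, elapsed = await timed(coro_fn())
    await asyncio.sleep(interval * 2)  # let the monitor record a sample
    monitor.cancel()
    try:
        await monitor
    except asyncio.CancelledError:
        pass
    return elapsed, max(samples, default=0.0)


if __name__ == "__main__":
    for fn in (slow_but_non_blocking, blocking):
        elapsed, lag = asyncio.run(diagnose(fn))
        print(f"{fn.__name__}: call took {elapsed:.3f}s, max loop lag {lag:.3f}s")
```

In a real crawler the measured call would be fetch_next_request instead of the stand-ins, and the monitor would run for the whole crawl. asyncio's debug mode (for example `asyncio.run(main(), debug=True)`) also logs callbacks that take longer than `loop.slow_callback_duration`, which could help locate a blocking call.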

Labels

bug: Something isn't working.
t-tooling: Issues with this label are in the ownership of the tooling team.