
Autoscaling based on multiple cpu utilization for single process crawlers? #1119

@Pijukatel

Description


Currently the AutoscaledPool will try to scale up if the CPU utilization is low. The problem can arise when, for example, an HTTP-based crawler (essentially a single-process crawler) runs in an environment with multiple CPUs. The other CPUs will be underutilized, and since this is what gets reported to the AutoscaledPool, it may try to scale up even though the one relevant core is already fully utilized. A minimal sketch of the mismatch is shown below.
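A quick way to see the mismatch (a minimal sketch using psutil; this is an assumption for illustration, not necessarily what Crawlee's snapshotting code uses):

```python
import psutil

# Overall utilization is averaged across all logical CPUs, so a single busy
# core barely moves the number on a machine with many cores.
overall = psutil.cpu_percent(interval=1)

# Per-core utilization exposes the core the single-process crawler is saturating.
per_core = psutil.cpu_percent(interval=1, percpu=True)

print(f"overall: {overall}%")            # e.g. ~13% on an 8-core machine
print(f"per core: {per_core}")           # e.g. [98.0, 3.0, 2.0, ...]
print(f"busiest core: {max(per_core)}%")
```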

This is probably not such a problem for browser-based crawlers, as the browsers run in their own processes and can be scheduled on different cores.

Mentioned here: apify/apify-sdk-python#447 (comment)

Maybe we need more detailed utilization information so that each crawler can decide what is relevant for it; see the sketch after this paragraph.
(Or possibly make Crawlee in general capable of scaling across multiple CPUs?)
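One possible direction (a hedged sketch, again psutil-based and not tied to Crawlee's current Snapshotter API): a single-process crawler could judge overload by its own process utilization relative to a single core rather than the system-wide average.

```python
import psutil

def single_core_utilization(interval: float = 1.0) -> float:
    """Return the current process's CPU usage as a fraction of one core.

    Hypothetical helper for illustration only. Process.cpu_percent() is
    measured against one CPU, so the value can exceed 1.0 if the process
    has threads running on several cores.
    """
    return psutil.Process().cpu_percent(interval=interval) / 100.0

if __name__ == "__main__":
    # A value near 1.0 means the crawler's core is saturated even if the
    # system-wide average still looks low.
    print(f"crawler process vs. one core: {single_core_utilization():.2f}")
```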
