Commit 7ba0221

fix: allow calling Actor.reboot() from migrating handler, align reboot behavior with JS SDK (#361)
This fixes several issues with the `Actor.reboot()` behavior:

- `Actor.reboot()` waits for all event handlers to finish, but if it was itself called from an event handler, it would wait for itself and deadlock.
- `Actor.reboot()` in the JS SDK triggers event handlers for the `migrating` and `persistState` events, but in the Python SDK it triggered only the `persistState` handlers.

This aligns the behavior with the JS SDK, and prevents reboot from getting into an infinite loop by allowing it to be called only once.

Related PR in JS SDK: apify/apify-sdk-js#345
1 parent 2d4b8d0 commit 7ba0221
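
The deadlock described in the commit message can be reproduced in miniature with plain asyncio. The sketch below is not the crawlee event manager: `TinyEventManager`, `emit`, and `wait_for_all_handlers` are hypothetical names, and a timeout is used so the example terminates instead of hanging. It shows that a handler which waits for "all handlers" ends up waiting on its own task.

```python
import asyncio


class TinyEventManager:
    """Hypothetical stand-in for an event manager; not the crawlee API."""

    def __init__(self):
        self._tasks = set()

    def emit(self, handler):
        # Run the handler as a task and remember it, like a real emitter would.
        task = asyncio.ensure_future(handler())
        self._tasks.add(task)
        return task

    async def wait_for_all_handlers(self, timeout):
        # Return the handler tasks that are still pending after `timeout` seconds.
        # Without the timeout, this would wait forever if called from a handler.
        if not self._tasks:
            return set()
        _done, pending = await asyncio.wait(self._tasks, timeout=timeout)
        return pending


async def demo():
    manager = TinyEventManager()
    result = {}

    async def migrating_handler():
        # A "reboot" that waits for every handler, called from inside a handler.
        pending = await manager.wait_for_all_handlers(timeout=0.05)
        result['waited_on_itself'] = asyncio.current_task() in pending

    await manager.emit(migrating_handler)
    return result['waited_on_itself']


waited_on_itself = asyncio.run(demo())
```

The handler's own task is among the pending ones, so an unbounded wait at that point could never complete.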

4 files changed, +30 -5 lines changed

docs/03-concepts/04-actor-events.mdx

Lines changed: 2 additions & 0 deletions
@@ -40,6 +40,8 @@ During its runtime, the Actor receives Actor events sent by the Apify platform o
 {' '}to another worker server soon.</p>
 You can use it to persist the state of the Actor so that once it is executed again on the new server,
 it doesn't have to start over from the beginning.
+Once you have persisted the state of your Actor, you can call <a href="../../reference/class/Actor#reboot"><code>Actor.reboot()</code></a>
+to reboot the Actor and trigger the migration immediately, to speed up the process.
 </td>
 </tr>
 <tr>
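
The documentation change above describes a "persist, then reboot" pattern for a migrating handler. A minimal sketch of that ordering follows. `StubActor`, `make_migrating_handler`, and the `CRAWL_STATE` key are hypothetical and exist only to make the example runnable without the SDK. In a real Actor the handler would presumably be registered for the migrating event and would call the SDK's own `Actor.reboot()`.

```python
import asyncio


class StubActor:
    """Hypothetical stand-in for the SDK's Actor; it only records calls."""

    def __init__(self):
        self.calls = []
        self.store = {}

    async def set_value(self, key, value):
        # Pretend to write to the key-value store.
        self.store[key] = value
        self.calls.append('set_value')

    async def reboot(self):
        # Pretend to ask the platform for a reboot.
        self.calls.append('reboot')


def make_migrating_handler(actor, state):
    async def on_migrating(_event_data=None):
        # Persist first, so the rebooted run can resume from this state.
        await actor.set_value('CRAWL_STATE', dict(state))
        # Then trigger the migration immediately instead of waiting for it.
        await actor.reboot()

    return on_migrating


actor = StubActor()
handler = make_migrating_handler(actor, {'processed': 42})
asyncio.run(handler())
```

The point of the ordering is that the state is saved before the reboot is requested, so nothing is lost when the run restarts on the new server.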

poetry.lock

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -51,6 +51,7 @@ cryptography = ">=42.0.0"
 # https://github.com/apify/apify-sdk-python/issues/348
 httpx = "~0.27.0"
 lazy-object-proxy = ">=1.10.0"
+more_itertools = ">=10.2.0"
 scrapy = { version = ">=2.11.0", optional = true }
 typing-extensions = ">=4.1.0"
 websockets = ">=10.0 <14.0.0"

src/apify/_actor.py

Lines changed: 25 additions & 3 deletions
@@ -7,13 +7,14 @@
 from typing import TYPE_CHECKING, Any, Callable, TypeVar, cast
 
 from lazy_object_proxy import Proxy
+from more_itertools import flatten
 from pydantic import AliasChoices
 
 from apify_client import ApifyClientAsync
 from apify_shared.consts import ActorEnvVars, ActorExitCodes, ApifyEnvVars
 from apify_shared.utils import ignore_docs, maybe_extract_enum_member_value
 from crawlee import service_container
-from crawlee.events._types import Event, EventPersistStateData
+from crawlee.events._types import Event, EventMigratingData, EventPersistStateData
 
 from apify._configuration import Configuration
 from apify._consts import EVENT_LISTENERS_TIMEOUT
@@ -48,6 +49,7 @@ class _ActorType:
     _apify_client: ApifyClientAsync
     _configuration: Configuration
     _is_exiting = False
+    _is_rebooting = False
 
     def __init__(
         self,
@@ -839,12 +841,32 @@ async def reboot(
             self.log.error('Actor.reboot() is only supported when running on the Apify platform.')
             return
 
+        if self._is_rebooting:
+            self.log.debug('Actor is already rebooting, skipping the additional reboot call.')
+            return
+
+        self._is_rebooting = True
+
         if not custom_after_sleep:
             custom_after_sleep = self._configuration.metamorph_after_sleep
 
-        self._event_manager.emit(event=Event.PERSIST_STATE, event_data=EventPersistStateData(is_migrating=True))
+        # Call all the listeners for the PERSIST_STATE and MIGRATING events, and wait for them to finish.
+        # PERSIST_STATE listeners are called to allow the Actor to persist its state before the reboot.
+        # MIGRATING listeners are called to allow the Actor to gracefully stop in-progress tasks before the reboot.
+        # Typically, crawlers are listening for the MIGRATING event to stop processing new requests.
+        # We can't just emit the events and wait for all listeners to finish,
+        # because this method might be called from an event listener itself, and we would deadlock.
+        persist_state_listeners = flatten(
+            (self._event_manager._listeners_to_wrappers[Event.PERSIST_STATE] or {}).values()  # noqa: SLF001
+        )
+        migrating_listeners = flatten(
+            (self._event_manager._listeners_to_wrappers[Event.MIGRATING] or {}).values()  # noqa: SLF001
+        )
 
-        await self._event_manager.__aexit__(None, None, None)
+        await asyncio.gather(
+            *[listener(EventPersistStateData(is_migrating=True)) for listener in persist_state_listeners],
+            *[listener(EventMigratingData()) for listener in migrating_listeners],
+        )
 
         if not self._configuration.actor_run_id:
             raise RuntimeError('actor_run_id cannot be None when running on the Apify platform.')
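
The approach in this hunk (guard against re-entry, flatten the registered listener wrappers, call them directly with `asyncio.gather`) can be sketched without the SDK. Everything below is an assumption made for illustration: `StubActor`, its attribute names, the string event names, and the registry shape (event name mapped to `{listener: [wrapper, ...]}`) are hypothetical. `itertools.chain.from_iterable` stands in for `more_itertools.flatten`, which flattens one level of nesting in the same way.

```python
import asyncio
from itertools import chain


class StubActor:
    """Hypothetical stand-in for the SDK's Actor; names are illustrative only."""

    def __init__(self):
        # Assumed registry shape: event name -> {listener: [wrapper, ...]}
        self.listeners_to_wrappers = {'persistState': {}, 'migrating': {}}
        self.is_rebooting = False
        self.reboots_performed = 0
        self.calls = []

    def on(self, event, listener):
        # In this sketch the "wrapper" is just the listener itself.
        self.listeners_to_wrappers[event].setdefault(listener, []).append(listener)

    async def reboot(self):
        # Guard: a second call (e.g. from a migrating listener) is skipped.
        if self.is_rebooting:
            return
        self.is_rebooting = True

        # Flatten the per-listener wrapper lists into one iterable per event.
        persist = chain.from_iterable(self.listeners_to_wrappers['persistState'].values())
        migrating = chain.from_iterable(self.listeners_to_wrappers['migrating'].values())

        # Call the listeners directly instead of emitting events and waiting
        # for every handler, which would include the caller itself.
        await asyncio.gather(
            *[wrapper({'is_migrating': True}) for wrapper in persist],
            *[wrapper({}) for wrapper in migrating],
        )
        self.reboots_performed += 1


async def main():
    actor = StubActor()

    async def save_state(data):
        actor.calls.append(('persistState', data['is_migrating']))

    async def on_migrating(_data):
        actor.calls.append(('migrating', None))
        await actor.reboot()  # re-entrant call, skipped by the guard

    actor.on('persistState', save_state)
    actor.on('migrating', on_migrating)
    await actor.reboot()
    return actor


actor = asyncio.run(main())
```

Without the `is_rebooting` guard in this sketch, the migrating listener's call to `reboot()` would invoke the migrating listener again, and so on without end.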
