content/academy/expert_scraping_with_apify/migrations_maintaining_state.md
+2 -2 (2 additions & 2 deletions)
@@ -8,7 +8,7 @@ paths:

 # [](#migrations-maintaining-state) Migrations & maintaining state

-We already know that actors are basically just Docker containers that can be run on any server. This means that they can be allocated anywhere there is space available, making them very efficient. Unfortunately, there is one big caveat: actors move - a lot. When an actor moves, it is called **migration**.
+We already know that actors are basically just Docker containers that can be run on any server. This means that they can be allocated anywhere there is space available, making them very efficient. Unfortunately, there is one big caveat: actors move - a lot. When an actor moves, it is called a **migration**.

 On migration, the process inside of an actor is completely restarted and everything in its memory is lost, meaning that any values stored within variables or classes are lost.

@@ -24,7 +24,7 @@ Before moving forward, read about actor [events](https://sdk.apify.com/docs/api/

 1. Actors have an option the **Settings** tab to **Restart on error**. Would you use this feature for regular actors? When would you use this feature?
 2. Migrations happen randomly, but by [aborting **gracefully**](https://docs.apify.com/actors/running#aborting-runs), you can simulate a similar situation. Try this out on the platform and observe what happens. What changes occur, and what remains the same for the restarted actor's run?
-3. Why don't you (usually) need to add any special migration handling code for a standard crawling/scraping actor? Are there any features in the Apify SDK that handle this under the hood?
+3. Why don't you (usually) need to add any special migration handling code for a standard crawling/scraping actor? Are there any features in the Crawlee/Apify SDK that handle this under the hood?
 4. How can you intercept the migration event? How much time do you have after this event happens and before the actor migrates?
 5. When would you persist data to the default key-value store instead of to a named key-value store?
 The **persistState** event is automatically fired (by default) every 60 seconds by the Apify SDK while the actor is running, and is also fired when the **migrating** event occurs.

-In order to persist our ASIN tracker object, let's use the `Apify.events.on` function to listen for the **persistState** event and store it in the key-value store each time it is emitted.
+In order to persist our ASIN tracker object, let's use the `Actor.on` function to listen for the **persistState** event and store it in the key-value store each time it is emitted.

 ```JavaScript
 // asinTracker.js
-const Apify = require('apify');
+import { Actor } from 'apify';
 // We've updated our constants.js file to include the name
 // of this new key in the key-value store
 const { ASIN_TRACKER } = require('./constants');
@@ -135,8 +139,8 @@ class ASINTracker {
     constructor() {
         this.state = {};

-        Apify.events.on('persistState', async () => {
-            await Apify.setValue(ASIN_TRACKER, this.state);
+        Actor.on('persistState', async () => {
+            await Actor.setValue(ASIN_TRACKER, this.state);
         });

         setInterval(() => console.log(this.state), 10000);
@@ -163,15 +167,15 @@ In order to fix this, let's create a method called `initialize` which will be ca

 ```JavaScript
 // asinTracker.js
-const Apify = require('apify');
-const { ASIN_TRACKER } = require('./constants');
+import { Actor } from 'apify';
+import { ASIN_TRACKER } from './constants';

 class ASINTracker {
     constructor() {
         this.state = {};

-        Apify.events.on('persistState', async () => {
-            await Apify.setValue(ASIN_TRACKER, this.state);
+        Actor.on('persistState', async () => {
+            await Actor.setValue(ASIN_TRACKER, this.state);
         });

         setInterval(() => console.log(this.state), 10000);
@@ -180,7 +184,7 @@ class ASINTracker {
     async initialize() {
         // Read the data from the key-value store. If it
         // doesn't exist, it will be undefined
-        const data = await Apify.getValue(ASIN_TRACKER);
+        const data = await Actor.getValue(ASIN_TRACKER);

         // If the data does exist, replace the current state
         // (initialized as an empty object) with the data
@@ -200,18 +204,19 @@ class ASINTracker {
 module.exports = new ASINTracker();
 ```

-We'll now call this function at the top level of the `Apify.main` function in **main.js** to ensure it is the first thing that gets called when the actor starts up:
+We'll now call this function at the top level of the **main.js** file to ensure it is the first thing that gets called when the actor starts up:

 ```JavaScript
 // main.js

 // ...
-const tracker = require('./src/asinTracker');
+import tracker from './asinTracker';

-const { log } = Apify.utils;
+// The Actor.init() function should be executed before
+// the tracker's initialization
+await Actor.init();

-Apify.main(async () => {
-    await tracker.initialize();
+await tracker.initialize();
 // ...
 ```

@@ -227,13 +232,13 @@ That's everything! Now, even if the actor migrates (or is gracefully aborted the

 **A:** After aborting or throwing an error mid-process, it manages to start back from where it was upon resurrection.

-**Q: Why don't you (usually) need to add any special migration handling code for a standard crawling/scraping actor? Are there any features in the Apify SDK that handle this under the hood?**
+**Q: Why don't you (usually) need to add any special migration handling code for a standard crawling/scraping actor? Are there any features in the Crawlee/Apify SDK that handle this under the hood?**

-**A:** Because Apify SDK handles all of the migration handling code for us. If you want to add custom migration-handling code, you can use `Apify.events` to listen for the `migrating` or `persistState` events to save the current state in key-value store (or elsewhere).
+**A:** Because Apify SDK handles all of the migration handling code for us. If you want to add custom migration-handling code, you can use `Actor.events` to listen for the `migrating` or `persistState` events to save the current state in key-value store (or elsewhere).

 **Q: How can you intercept the migration event? How much time do you have after this event happens and before the actor migrates?**

-**A:** By using the `Apify.events.on` function. You have a maximum of a few seconds before shutdown after the `migrating` event has been fired.
+**A:** By using the `Actor.on` function. You have a maximum of a few seconds before shutdown after the `migrating` event has been fired.

 **Q: When would you persist data to the default key-value store instead of to a named key-value store?**
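
Note for readers of this diff: the persist-and-restore pattern that the changed snippets rely on can be tried without the platform. The sketch below is not the SDK's implementation. It swaps in Node's built-in `EventEmitter` for the actor event source and a plain `Map` for the key-value store, and the names `fakeEvents`, `fakeStore`, `incrementASIN`, and the `'ASIN-TRACKER'` key are all hypothetical stand-ins chosen for illustration.

```javascript
// Platform-free sketch (assumption: `fakeEvents` and `fakeStore` merely imitate
// the SDK's event emitter and the default key-value store).
const { EventEmitter } = require('node:events');

const fakeEvents = new EventEmitter();
const fakeStore = new Map();

// Hypothetical key name; the tutorial keeps its own value in constants.js.
const ASIN_TRACKER = 'ASIN-TRACKER';

class ASINTracker {
    constructor(events, store) {
        this.state = {};
        this.store = store;

        // Save a snapshot of the state every time 'persistState' fires.
        events.on('persistState', async () => {
            await this.persist();
        });
    }

    async persist() {
        // Store a JSON copy so later mutations don't leak into the snapshot.
        this.store.set(ASIN_TRACKER, JSON.stringify(this.state));
    }

    async initialize() {
        // Restore the previous snapshot if one exists; otherwise keep {}.
        const data = this.store.get(ASIN_TRACKER);
        if (data !== undefined) this.state = JSON.parse(data);
    }

    // Hypothetical helper: count one more offer for the given ASIN.
    incrementASIN(asin) {
        this.state[asin] = (this.state[asin] ?? 0) + 1;
    }
}
```

Emitting `persistState` on `fakeEvents` and then constructing a second tracker over the same `fakeStore` imitates a restart after migration: once `initialize()` has run, the second instance holds the counts the first one saved.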