README.md: 27 additions & 0 deletions
@@ -202,6 +202,33 @@ Number, maximum amount of concurrent requests. Defaults to `Infinity`.
#### plugins

Plugins allow you to extend scraper behavior.

* [Existing plugins](#existing-plugins)
* [Create plugin](#create-plugin)
* Create action
* [beforeStart](#beforestart)
* [afterFinish](#afterfinish)
* [error](#error)
* [beforeRequest](#beforerequest)
* [afterResponse](#afterresponse)
* [onResourceSaved](#onresourcesaved)
* [onResourceError](#onresourceerror)
* [saveResource](#saveresource)
* [generateFilename](#generatefilename)
* [getReference](#getreference)

##### Existing plugins
* [website-scraper-puppeteer](https://github.com/website-scraper/website-scraper-puppeteer) - downloads dynamic websites (rendered with JavaScript) using Puppeteer
* [website-scraper-phantom](https://github.com/website-scraper/node-website-scraper-phantom) - downloads dynamic websites (rendered with JavaScript) using PhantomJS
* [website-scraper-existing-directory](https://github.com/website-scraper/website-scraper-existing-directory) - saves files to an existing directory
* [request throttle](https://benjaminhorn.io/code/request-throttle-for-npm-package-website-scraper/) - adds a random timeout between requests

##### Create plugin

Note! Before creating a new plugin, consider using or extending one of the [existing plugins](#existing-plugins).

A plugin is an object with an `.apply` method that can be used to change scraper behavior.

The `.apply` method takes one argument: a `registerAction` function, which lets you add handlers for different actions. Action handlers are functions that the scraper calls at different stages of downloading a website. For example, `generateFilename` is called to generate a filename for a resource based on its URL, and `onResourceError` is called when an error occurs while requesting, handling, or saving a resource.
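
As a rough illustration of the description above, a plugin might look like the sketch below. The handler argument shapes (`{ resource }`, `{ resource, error }`), the `{ filename }` return value, and the `resource.url` property are assumptions made for this example, and the `registerAction` defined at the bottom is a hypothetical stand-in for the function the scraper would pass in; check the per-action sections for the actual contracts.

```javascript
// Minimal plugin sketch: an object (here, a class instance) with an `.apply` method.
class MyPlugin {
  apply(registerAction) {
    // Assumption: the handler receives { resource } and returns { filename }.
    registerAction('generateFilename', ({ resource }) => {
      // Use the last URL path segment, falling back to index.html for "/".
      const name = resource.url.split('/').pop() || 'index.html';
      return { filename: name };
    });

    // Assumption: the handler receives { resource, error }.
    registerAction('onResourceError', ({ resource, error }) => {
      console.error(`Failed to process ${resource.url}: ${error.message}`);
    });
  }
}

// Hypothetical stand-in for the scraper, only to show how `.apply` is driven:
// it stores each handler under its action name.
const handlers = {};
const registerAction = (actionName, handler) => {
  handlers[actionName] = handler;
};

new MyPlugin().apply(registerAction);
```

In real use you would not call `.apply` yourself; presumably an instance is passed in the `plugins` option and the scraper invokes `.apply` with its own `registerAction`.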