## Usage

```javascript
var scraper = require('website-scraper');

var options = {
  url: 'http://nodejs.org/',
  directory: '/path/to/save/'
};

// with callback
scraper.scrape(options, function (error, result) {
  /* some code here */
});

// or with promise
scraper.scrape(options).then(function (result) {
  /* some code here */
});
```
## API

### scrape(options, callback)
Makes a request to `url` and saves all files found with `srcToLoad` to `directory`.

**options** - object containing the following options:

- `url:` url to load *(required)*
- `directory:` path to save loaded files *(required)*
- `paths:` array of objects containing urls or relative paths to load and filenames for them (if not set, only `url` will be loaded) *(optional, see example below)*
- `log:` boolean indicating whether to write the log to console *(optional, default: false)*
- `indexFile:` filename for the index page *(optional, default: 'index.html')*
- `srcToLoad:` array of objects to load, specifying selectors and attribute values to select files for loading *(optional, see default value in `lib/defaults.js`)*
- `subdirectories:` array of objects specifying subdirectories for extensions. If `null`, all files will be saved to `directory` *(optional, see example below)*
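Putting the scalar options together, a minimal configuration might look like the sketch below; the values are illustrative:

```javascript
var options = {
  url: 'http://nodejs.org/',    // page to load (required)
  directory: '/path/to/save/',  // where loaded files are saved (required)
  indexFile: 'myIndex.html',    // filename for the index page (default: 'index.html')
  log: true                     // write the log to console (default: false)
};
```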
**callback** - callback function *(optional)*, includes the following parameters:

- `error:` if error - `Error` object, if success - `null`
- `result:` if error - `null`, if success - array of objects containing:
  - `url:` url of loaded page
  - `filename:` absolute filename where page was saved
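Since `result` is an array on success, a callback typically either handles `error` or iterates over the loaded pages. A minimal sketch; the helper name and the sample data are hypothetical, but the data follows the documented result shape:

```javascript
// Hypothetical helper: formats each loaded page as "url -> filename".
function formatResult(error, result) {
  if (error) {
    throw error; // or handle the error however you prefer
  }
  return result.map(function (page) {
    return page.url + ' -> ' + page.filename;
  });
}

// Sample data in the documented result shape:
var lines = formatResult(null, [
  { url: 'http://nodejs.org/', filename: '/path/to/save/index.html' }
]);
console.log(lines.join('\n'));
```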
## Examples
Let's scrape some pages from [http://nodejs.org/](http://nodejs.org/) with images, css, js files and save them to `/path/to/save/`.

Imagine we want to load:

- [Home page](http://nodejs.org/) to `index.html`
- [About page](http://nodejs.org/about/) to `about.html`
- [Blog](http://blog.nodejs.org/) to `blog.html`

and separate files into directories:

- `img` for .jpg, .png, .svg (full path `/path/to/save/img`)
- `js` for .js (full path `/path/to/save/js`)
- `css` for .css (full path `/path/to/save/css`)
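The options for the scrape described above might be sketched as follows. The exact field names inside the `paths` and `subdirectories` entries are assumptions (the text above does not spell them out), so treat this as a sketch rather than the module's confirmed API:

```javascript
// Sketch only: the 'path', 'filename', 'directory' and 'extensions'
// field names are assumptions, not confirmed by the text above.
var options = {
  url: 'http://nodejs.org/',
  directory: '/path/to/save/',
  paths: [
    { path: '/', filename: 'index.html' },
    { path: '/about/', filename: 'about.html' },
    { path: 'http://blog.nodejs.org/', filename: 'blog.html' }
  ],
  subdirectories: [
    { directory: 'img', extensions: ['.jpg', '.png', '.svg'] },
    { directory: 'js', extensions: ['.js'] },
    { directory: 'css', extensions: ['.css'] }
  ]
};
```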