martinsbalodis · jwillmer · Jan 25, 2015 · Apr 2, 2015 · Jun 26, 2015 · Jun 26, 2015
diff --git a/.gitignore b/.gitignore
@@ -2,3 +2,9 @@
 projectFilesBackup
 extension.zip
 
+/.vs/web-scraper-chrome-extension/v15/.suo
+/.vs/web-scraper-chrome-extension/v15
+/.vs/VSWorkspaceState.json
+/.vs/slnx.sqlite
+/.vs/ProjectSettings.json
+/.vs/config/applicationhost.config
diff --git a/README.md b/README.md
@@ -5,35 +5,28 @@ should be traversed and what should be extracted. Using these sitemaps the
 Web Scraper will navigate the site accordingly and extract all data. Scraped 
 data later can be exported as CSV.
 
-Install the extension from [Chrome store] [chrome-store]
-
-### Features
-
- 1. Scrape multiple pages
- 2. Sitemaps and scraped data are stored in browsers local storage or in CouchDB
- 3. Multiple data selection types
- 4. Extract data from dynamic pages (JavaScript+AJAX)
- 5. Browse scraped data
- 6. Export scraped data as CSV
- 7. Import, Export sitemaps
- 8. Depends only on Chrome browser
-
-### Help
-
- Documentation and tutorials are available on [webscraper.io] [webscraper.io]
-
- Ask for help, submit bugs, suggest features on [google groups] [google-groups]
+#### Latest Version
+To run the latest version, [download the project][latest-releases] to your system and load it as an unpacked extension by [following the description on Google][get-started-chrome] (select the `extension` folder).
 
- Submit bugs and suggest features on [bug tracker] [github-issues]
-
-#### Bugs
-When submitting a bug please attach an exported sitemap if possible.
-
-## License
-LGPLv3
-
 ## Changelog
 
+### v0.3
+ * Enabled pasting of multiple start URLs (by [@jwillmer](https://github.com/jwillmer))
+ * Added scraping of dynamic table columns (by [@jwillmer](https://github.com/jwillmer))
+ * Added style extraction type (by [@jwillmer](https://github.com/jwillmer))
+ * Added text manipulation (trim, replace, prefix, suffix, remove HTML) (by [@jwillmer](https://github.com/jwillmer))
+ * Added image improvements to find images in div background (by [@jwillmer](https://github.com/jwillmer))
+ * Added support for vertical tables (by [@jwillmer](https://github.com/jwillmer))
+ * Added random delay function between requests (by [@Euphorbium](https://github.com/Euphorbium))
+ * Start URL can now also be a local URL (by [@3flex](https://github.com/3flex))
+ * Added CSV export options (by [@mohamnag](https://github.com/mohamnag))
+ * Added Regex group for select (by [@RuneHL](https://github.com/RuneHL))
+ * JSON export/import of settings (by [@haisi](https://github.com/haisi))
+ * Added date and number pattern in URL (by [@codoff](https://github.com/codoff))
+ * Added pagination selector limit (by [@codoff](https://github.com/codoff))
+ * Improved CSV export (by [@haisi](https://github.com/haisi))
+ * Added click limit option (by [@panna-ahmed](https://github.com/panna-ahmed))
+
 ### v0.2
  * Added Element click selector
  * Added Element scroll down selector
@@ -55,7 +48,19 @@ LGPLv3
  * Added ranged start urls
  * Fixed bug which made selector tree not to show on some operating systems
 
+#### Bugs
+When submitting a bug please attach an exported sitemap if possible.
+
+#### Development
+Read the [Development Instructions](/docs/Development.md) before you start.
+
+## License
+LGPLv3
+
  [chrome-store]: https://chrome.google.com/webstore/detail/web-scraper/jnhgnonknehpejjnehehllkliplmbmhn
  [webscraper.io]: http://webscraper.io/
  [google-groups]: https://groups.google.com/forum/#!forum/web-scraper
  [github-issues]: https://github.com/martinsbalodis/web-scraper-chrome-extension/issues
+ [get-started-chrome]: https://developer.chrome.com/extensions/getstarted#unpacked
+ [issue-14]: https://github.com/jwillmer/web-scraper-chrome-extension/issues/14
+ [latest-releases]: https://github.com/jwillmer/web-scraper-chrome-extension/releases
diff --git a/docs/Development.md b/docs/Development.md
@@ -0,0 +1,56 @@
+# Development Instructions
+
+## Selector Development
+
+This section walks through the steps needed to create or extend a selector for the web scraper. The example creates a "Select All" selector.
+
+### Create Selector Logic
+You can skip the file creation steps if you only intend to add functionality to an existing selector.
+
+- Duplicate the file `SelectorElementStyle.js` in `scripts/Selector/`
+- Rename the duplicated file to `SelectorAll.js`
+- Modify the `getData` method to return all content
+- Specify which features you would like to have enabled in the `getFeatures` function
+- Implement the logic for the enabled features (Feature `textmanipulation` will work out of the box)
+
+### Create Selector Controls
+
+- Add a section into the `SelectorEdit.html` file in `devtools/views/`
+- Give the section the class `form-group feature feature-AllSelector`
+- You can wrap content in `{{#selectorName}}` and `{{/selectorName}}` to prevent it from displaying (used for checkbox controls)
+- Use `{{selector.selectorAll}}` to define a variable
+
+
+### Set references to your selector
+
+#### Controller
+
+- Open `Controller.js` in `scripts/`
+- Add a variable in the function `getCurrentlyEditedSelector` to select your HTML section value
+- Add the variable to the `newSelector` object (every selector in `scripts/Selector/` that references this feature can access the value)
+- Add validation rules to your variable in the function `initSelectorValidation`
+
+
+#### File reference
+
+- Add a reference in `extension/manifest.json` in the sections `content_scripts` and `scripts`
+- Add a reference to `extension/devtools/devtools_scraper_panel.html`
+- Add a reference to `playgrounds/extension/index.html`
+- Add a reference to `tests/SpecRunner.html`
+
+
+### Testing
+
+For testing you need to run a web server. Personally I use [Web Server for Chrome](https://chrome.google.com/webstore/detail/web-server-for-chrome/ofhbbkphhbklhfoeikjpcbhemlocgigb) and point it at the working directory of the project.
+
+- Duplicate a test file in `tests/Selector` and rename it
+- Write your tests for your selector
+- Run the tests by opening `tests/SpecRunner.html`
+- Try your implementation by opening `playgrounds/extension/index.html`
+- Extend the playground if it does not cover your scenario
+
+### Documentation
+
+- Create an `.md` file in `docs/Selectors`
+- Describe the usage, options, etc
+
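The "Create Selector Logic" steps above can be pictured with a minimal sketch. It is not the extension's actual code: the object shape, the feature names other than `textmanipulation`, and the plain objects standing in for DOM elements are all assumptions made for illustration.

```javascript
// Hypothetical sketch only: the real selectors in scripts/Selector/ may be
// structured differently (for example, they may work asynchronously).
var SelectorAll = {
  // Features this selector wants enabled in the edit form.
  // 'multiple' and 'delay' are illustrative names, not confirmed ones.
  getFeatures: function () {
    return ['multiple', 'delay', 'textmanipulation'];
  },

  // Return one record per matched element, keyed by the selector id.
  // `elements` stands in for the DOM nodes matched by the CSS selector.
  getData: function (elements) {
    var id = this.id;
    return elements.map(function (element) {
      var record = {};
      record[id] = element.textContent;
      return record;
    });
  }
};

// Usage: configure an instance and run it on stand-in elements.
var selector = Object.create(SelectorAll);
selector.id = 'all';
var records = selector.getData([
  { textContent: 'first' },
  { textContent: 'second' }
]);
console.log(records);
```

Keeping `getFeatures` as a plain list makes it easy to see at a glance which controls in `SelectorEdit.html` a selector expects to be visible.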
diff --git a/docs/Selectors/Element attribute selector.md b/docs/Selectors/Element attribute selector.md
@@ -8,6 +8,11 @@ this link: `<a href="#" title="my title">link<a>`.
  * multiple - multiple records are being extracted.
  * attribute name - the attribute that is going to be extracted. For example
  `title`, `data-id`.
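+ * remove HTML - strip HTML tags from the extracted value
+ * trim text - remove leading and trailing whitespace
+ * replace text - the replace field accepts a regular expression
+ * text prefix/suffix - text added before/after the extracted value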
+ * delay - delay the extraction
 
 ## Use cases
 See [Text selector] [text-selector] use cases.
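
The text manipulation options listed above (remove HTML, trim text, replace text, prefix/suffix) could be combined roughly as follows. This is a sketch under assumptions: the option names, the order in which the steps run, and the naive tag-stripping regular expression are illustrative and not taken from the extension's source.

```javascript
// Hypothetical helper: applies the documented text options to one value.
// Option names and the processing order are assumptions.
function manipulateText(value, options) {
  var result = String(value);
  if (options.removeHtml) {
    // Naive tag removal; good enough for simple markup only.
    result = result.replace(/<[^>]*>/g, '');
  }
  if (options.trimText) {
    result = result.trim();
  }
  if (options.replaceText) {
    // The replace field is treated as a regular expression source string.
    var pattern = new RegExp(options.replaceText, 'g');
    result = result.replace(pattern, options.replacementText || '');
  }
  return (options.textPrefix || '') + result + (options.textSuffix || '');
}

var cleaned = manipulateText('  <b>20</b> EUR ', {
  removeHtml: true,
  trimText: true,
  replaceText: 'EUR$',
  replacementText: 'euro',
  textPrefix: 'Price: '
});
console.log(cleaned); // prints "Price: 20 euro"
```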

diff --git a/docs/Selectors/Element click selector.md b/docs/Selectors/Element click selector.md
@@ -18,6 +18,7 @@ events triggered by the button.
  be clicked to load more elements.
  * click type - type of how the selector knows when there will be no new
  elements and clicking should stop.
+ * pagination limit - the maximum number of clicks the selector should perform.
  * click element uniqueness - type of how selector knows which buttons are 
  already clicked.
  * multiple - multiple records are being extracted (almost always should be

diff --git a/docs/Selectors/Element scroll down selector.md b/docs/Selectors/Element scroll down selector.md
@@ -16,6 +16,7 @@ infinitely then this selector will be stuck in an infinite loop.
  should usually be specified because the data won't be loaded immediately from
  the server after scrolling down. More than 2000 ms might be a good choice if
  you you don't want to loose data because the server didn't respond fast enough.
+ * pagination limit - the maximum number of scroll-downs the selector should perform.
 
 ## Use cases
 See [Element selector] [element-selector] use cases.
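
The pagination limit option caps how often the selector repeats its action, which also guards against the infinite loop mentioned for endlessly scrolling pages. A simplified, synchronous sketch of that idea follows; the function name and the callback contract are assumptions, and the real selector presumably works asynchronously with delays.

```javascript
// Hypothetical sketch: repeat `step` until it reports no new content
// or the optional limit is reached. Returns how many steps were performed.
function repeatWithLimit(step, paginationLimit) {
  var performed = 0;
  while (paginationLimit === undefined || performed < paginationLimit) {
    var loadedMore = step(); // true if the scroll/click loaded new elements
    performed += 1;
    if (!loadedMore) {
      break;
    }
  }
  return performed;
}

// Usage: a stand-in page that yields new content for ten scrolls.
var remainingPages = 10;
var limitedRuns = repeatWithLimit(function () {
  remainingPages -= 1;
  return remainingPages > 0;
}, 3);
console.log(limitedRuns); // prints 3
```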

diff --git a/docs/Selectors/Element selector.md b/docs/Selectors/Element selector.md
@@ -17,6 +17,7 @@ on a button then you should try these selectors:
  be used as parent elements for child selectors.
  * multiple - multiple records are being extracted (almost always should be
  checked). Multiple option for child selectors usually should not be checked.
+ * delay - delay the extraction
 
 ## Use cases
 

diff --git a/docs/Selectors/Element style selector.md b/docs/Selectors/Element style selector.md
@@ -0,0 +1,21 @@
+# Element style selector
+Element style selector can extract a style value of an HTML element.
+For example you could use this selector to extract the width value from
+this div: `<div style="width: 20px;"></div>`.
+
+## Configuration options
+ * selector - [CSS selector] [css-selector] for the element.
+ * multiple - multiple records are being extracted.
+ * style name - the style property that is going to be extracted. For example
+ `width`, `background-image`.
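+ * remove HTML - strip HTML tags from the extracted value
+ * trim text - remove leading and trailing whitespace
+ * replace text - the replace field accepts a regular expression
+ * text prefix/suffix - text added before/after the extracted value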
+ * delay - delay the extraction
+
+## Use cases
+See [Text selector] [text-selector] use cases.
+
+ [text-selector]: Text%20selector.md
+ [css-selector]: ../CSS%20selector.md
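
To make the style extraction described above concrete, here is a small sketch that reads one property out of an inline `style` attribute string. It is an illustration only: the helper name is hypothetical, the extension may instead read styles through the DOM or jQuery, and splitting on `;` will not handle values that themselves contain semicolons (such as some `data:` URLs).

```javascript
// Hypothetical helper: look up one property in an inline style string.
function extractStyleValue(styleAttribute, styleName) {
  var wanted = styleName.toLowerCase();
  var declarations = styleAttribute.split(';');
  for (var i = 0; i < declarations.length; i++) {
    // Split on the first colon only, so values like url(http://...) survive.
    var separator = declarations[i].indexOf(':');
    if (separator === -1) {
      continue;
    }
    var name = declarations[i].slice(0, separator).trim().toLowerCase();
    if (name === wanted) {
      return declarations[i].slice(separator + 1).trim();
    }
  }
  return null; // property not present
}

var width = extractStyleValue('width: 20px;', 'width');
console.log(width); // prints "20px"
```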
diff --git a/docs/Selectors/Grouped selector.md b/docs/Selectors/Grouped selector.md
@@ -9,6 +9,11 @@ The extracted data will be stored as JSON.
  * attribute name - optionally this selector can extract an attribute of the
  selected element. If specified the extractor will also add this attribute to
  the resulting JSON.
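+ * remove HTML - strip HTML tags from the extracted value
+ * trim text - remove leading and trailing whitespace
+ * replace text - the replace field accepts a regular expression
+ * text prefix/suffix - text added before/after the extracted value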
+ * delay - delay the extraction
 
 ## Use cases
 

diff --git a/docs/Selectors/HTML selector.md b/docs/Selectors/HTML selector.md
@@ -6,6 +6,11 @@ inner HTML of the element will be extracted.
  * selector - [CSS selector] [css-selector] for the element whose inner HTML
  will be extracted.
  * multiple - multiple records are being extracted.
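+ * remove HTML - strip HTML tags from the extracted value
+ * trim text - remove leading and trailing whitespace
+ * replace text - the replace field accepts a regular expression
+ * text prefix/suffix - text added before/after the extracted value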
+ * delay - delay the extraction
 
 ## Use cases
 See [Text selector] [text-selector] use cases.

diff --git a/docs/Selectors/Image selector.md b/docs/Selectors/Image selector.md
@@ -15,6 +15,7 @@ report it as a bug.
  checked for Image selector.
  * download image - downloads and store images on local drive. When CouchDB
  storage back end is used the image is also stored locally.
+ * delay - delay the extraction
 
 ## Use cases
 See [Text selector] [text-selector] use cases.

diff --git a/docs/Selectors/Link selector.md b/docs/Selectors/Link selector.md
@@ -23,6 +23,7 @@ link selector is not working for you then you can try these workarounds:
  * selector - [CSS selector] [css-selector] for the link element from which the
  link for navigation will be extracted.
  * multiple - multiple records are being extracted. Usually should be checked.
+ * delay - delay the extraction
 
 ## Use cases
 

diff --git a/docs/Selectors/Table selector.md b/docs/Selectors/Table selector.md
@@ -17,6 +17,7 @@ shows what you should select when extracting data from a table.
  * data rows selector - [CSS selector] [css-selector] for table data rows.
  * multiple - multiple records are being extracted. Usually should be
  checked for Table selector because you are extracting multiple rows.
+ * delay - delay the extraction
 
 ## Use cases
 See [Text selector] [text-selector] use cases.

diff --git a/docs/Selectors/Text selector.md b/docs/Selectors/Text selector.md
@@ -16,6 +16,11 @@ resulting data.
  multiple checked then you might actually need
  [Element selector] [element-selector].
  * regex - regular expression to extract a substring from the result.
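+ * remove HTML - strip HTML tags from the extracted value
+ * trim text - remove leading and trailing whitespace
+ * replace text - the replace field accepts a regular expression
+ * text prefix/suffix - text added before/after the extracted value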
+ * delay - delay the extraction
 
 ### Regex