Scraper

A library for simple web scraping of a search engine's results page for data analysis tasks. (Note: For personal, non-commercial use only. Follow all web scraping guidelines before getting started. Be kind to servers.)

Requires Python version 3.6 or greater.
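As one way to act on the note about scraping guidelines, the sketch below checks a site's robots.txt rules before requesting a page. It is not part of this library: the `allowed` helper, the `my-scraper` user agent, the sample rules, and the example.com URLs are all hypothetical, and only the standard-library `urllib.robotparser` module is used.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /search
Allow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/search?q=news", "https://example.com/about"):
        print(url, allowed(SAMPLE_ROBOTS, "my-scraper", url))
```

In real use the rules would come from the target site's own robots.txt, and pausing between requests (for example with `time.sleep`) helps keep the load on the server low.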

Getting Started

This library is intended for personal use only to get search results from a search engine for downstream analysis.

1. Clone Project Repo

git clone https://github.com/meads2/scraper.git
cd scraper
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pwd

2. Enter search terms to scrape results

python scraper 'my favorite team'

Optional flags enable additional functionality; defaults apply when a flag is omitted.

Parameters

terms - String of search terms to pass to the scraper (e.g. 'Python Tips and Tricks').

--selfie - If present, Selenium takes a screenshot of the browser window showing the search results.

--dest (FUTURE) - If specified, saves results to the given location.

--showme (FUTURE) - If present, opens a browser window at runtime so you can watch execution; useful for debugging.

--engine (FUTURE) - If specified, uses that search engine; defaults to Google. Options: 'Bing' (Microsoft Bing), 'duck' (DuckDuckGo), 'google' (Google), 'Yahoo' (Yahoo).
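To make the parameters above concrete, here is a minimal sketch of how such a command line could be declared with the standard-library `argparse` module. The project's actual parser is not shown in this README, so `build_parser`, the `ENGINES` tuple, the help strings, and the case-insensitive handling of `--engine` are assumptions made for illustration.

```python
import argparse

# Assumed set of accepted --engine values, lowercased for comparison.
ENGINES = ("google", "bing", "duck", "yahoo")


def build_parser():
    """Build a parser mirroring the documented parameters (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="scraper",
        description="Scrape a search engine's results page.",
    )
    parser.add_argument("terms", help="search terms, e.g. 'Python Tips and Tricks'")
    parser.add_argument("--selfie", action="store_true",
                        help="take a screenshot of the results window")
    parser.add_argument("--dest", default=None,
                        help="(FUTURE) location to save results to")
    parser.add_argument("--showme", action="store_true",
                        help="(FUTURE) show the browser window while running")
    # type=str.lower normalizes the value before it is checked against choices.
    parser.add_argument("--engine", default="google", type=str.lower,
                        choices=ENGINES, help="(FUTURE) search engine to use")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Boolean flags use `action="store_true"`, so `--selfie` and `--showme` are off unless passed, which matches the "if present" wording in the parameter list.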

Examples

Basic Example

python scraper 'daily news near me'
### ... running and scraping quietly
### Check your downloads for a surprise!

Screenshot Example

python scraper 'daily news near me' --selfie
### ... running and scraping quietly
### Check your downloads for a surprise!

Verbose Example

python scraper 'daily news near me' --showme
### ... running and scraping right before your eyes
### Check your downloads for a surprise!

Custom Save Example

python scraper 'daily news near me' --dest '../some/location/'
### ... running and scraping quietly to your defined location
### Check your downloads for a surprise!
