Example project that runs the Scrapy framework from code, handles code and request errors, and exports the extracted data to a CSV file.
Execution starts in the app.py file, which initializes the crawler with the options from the spider_nest/settings.py file, loads the spiders, and executes them.
The spiders extend the SpiderHandler class in the spider_nest/spider_handler.py file, which provides methods to handle code and request errors, plus some variables for generating statistics.
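The base class could look roughly like this (method and attribute names are hypothetical; the real SpiderHandler would subclass scrapy.Spider and register these handlers as errbacks):

```python
class SpiderHandler:
    """Sketch of a base class that tracks errors and statistics."""

    def __init__(self):
        # counters used later to build the execution summary
        self.items_scraped = 0
        self.request_errors = 0
        self.code_errors = 0

    def handle_request_error(self, failure):
        # errback for failed requests (HTTP errors, timeouts, DNS, ...)
        self.request_errors += 1

    def handle_code_error(self, exc):
        # called when parsing code raises an exception
        self.code_errors += 1
```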
When a spider returns an item, it is caught by the process_item() function of the spider_nest/pipelines.py file, where it is written to a CSV file in the root of the project.
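Such a pipeline could be sketched as follows (file name and field names are assumptions, not the ones used in the repository):

```python
import csv

class CsvExportPipeline:
    """Sketch of an item pipeline that appends each item to a CSV file."""

    def __init__(self, path="results.csv", fieldnames=("title", "url")):
        self.path = path
        self.fieldnames = fieldnames
        self._header_written = False

    def process_item(self, item, spider=None):
        with open(self.path, "a", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=self.fieldnames)
            if not self._header_written:
                writer.writeheader()
                self._header_written = True
            writer.writerow({k: item.get(k, "") for k in self.fieldnames})
        return item  # Scrapy pipelines must return the item for later stages
```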
If a spider raises an error, it is handled by the SpiderHandler, and all subsequent requests are rejected by the DownloaderMiddleware of the spider_nest/middlewares.py file, to avoid extracting incomplete results (this behavior can change according to the needs of each spider).
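The reject-after-failure logic could look roughly like this (names and the 'failed' flag are hypothetical; a real Scrapy downloader middleware would raise scrapy.exceptions.IgnoreRequest rather than the stand-in exception defined here):

```python
class IgnoreRequest(Exception):
    """Stand-in for scrapy.exceptions.IgnoreRequest."""

class AbortOnErrorMiddleware:
    """Sketch of a downloader middleware that drops every request
    once its spider has flagged a failure."""

    def process_request(self, request, spider):
        # the 'failed' flag is an assumption; the error handler in the
        # spider base class could set it when something goes wrong
        if getattr(spider, "failed", False):
            raise IgnoreRequest("spider already failed, dropping request")
        return None  # let the request continue through the chain
```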
When the spider finishes its execution, the close_spider() function of the spider_nest/pipelines.py file is executed, where the statistics of the spider execution are printed.
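The summary step could be sketched like this (the counter attribute names are assumptions, matching nothing guaranteed in the repository):

```python
class StatsPipeline:
    """Sketch of printing an execution summary when the spider closes."""

    def close_spider(self, spider):
        # read the counters kept by the (hypothetical) spider base class
        print("Items scraped:  %d" % getattr(spider, "items_scraped", 0))
        print("Request errors: %d" % getattr(spider, "request_errors", 0))
        print("Code errors:    %d" % getattr(spider, "code_errors", 0))
```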
Clone the repository
git clone https://github.com/dmarcosl/scrapy-playground
Create a virtual environment and activate it
cd scrapy_playground
python3 -m venv venv
. venv/bin/activate
Install the Scrapy library
pip3 install -r requirements.txt
or
pip3 install scrapy==1.6.0
Execute it
python3 app.py