dirbot-wordpress-link

This project you can scrape the website link as your wordpress link.

This project base on [email protected]:darkrho/dirbot-mysql.git.

##Items

The items scraped by this project are websites, and the item is defined in the class:

daf2e.items.Daf2EItem

link_category = scrapy.Field()
link_name = scrapy.Field()
link_url = scrapy.Field()
link_description = scrapy.Field()

##Spiders See the source code

note: time.sleep(1), if not, it would trigger sql query io error. This affects the wp_terms, wp_term_taxonomy and wp_term_relationships. if you have any good idea, give me a fork. keyword: multiple items per page, yield...

##Pipelines

###Requiring certain item fields

A pipeline to discard items that lack of certain fields. This pipeline is defined in the class::

daf2e.pipelines.RequiredFieldsPipeline

###Storing items in a MySQL database

A pipeline to store (insert or update) scraped items in a MySQL database. This pipeline is defined in the class:

daf2e.pipelines.MySQLStorePipeline

The database table include wp_links, wp_terms, wp_term_taxonomy and wp_term_relationships.

##Command scrapy crawl daqianduan

##Read more

http://doc.scrapy.org/en/latest/

[email protected]:scrapinghub/portia.git

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
daf2e		daf2e
README.MD		README.MD

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

dirbot-wordpress-link

About

Uh oh!

Releases

Packages

Languages

w3cub/dirbot-wordpress-link

Folders and files

Latest commit

History

Repository files navigation

dirbot-wordpress-link

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages