debo024/web_scraping

web_crawler

This code base can be used to scrape any site for occurrences of a keyword or phrase, for example to gauge the neutrality or popularity of an issue or event.

Steps needed:

  1. Download the repository, which contains 3 files. The .json file is not needed.
  2. Run testscrapy.py first. This creates a .json file containing the scraped output of the site.
  3. Run rmv_none_values.py second. This filters the .json file and reports the count of the phrase or keyword you specified.
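
The filtering and counting in step 3 can be sketched roughly as follows. This is not the code in rmv_none_values.py; it is a minimal illustration that assumes the scraped .json output is a list of objects whose values are either strings or null. The function name `count_keyword` and the sample data are hypothetical.

```python
import json
import re


def count_keyword(records, keyword):
    """Count case-insensitive occurrences of keyword in all record values, skipping None."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    total = 0
    for record in records:
        for value in record.values():
            if value is None:
                # Scraped fields that matched nothing come through as null/None.
                continue
            total += len(pattern.findall(str(value)))
    return total


# Hypothetical sample standing in for the scraped .json file (assumed layout).
sample_json = (
    '[{"text": "Climate policy and climate talks"},'
    ' {"text": null},'
    ' {"text": "No match here"}]'
)
records = json.loads(sample_json)
keyword_count = count_keyword(records, "climate")
print(keyword_count)
```

To use a real file instead of the inline sample, the records could be loaded with `json.load(open(path))`, provided the file has the layout assumed above.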

With small modifications, testscrapy.py can be turned into an automated crawler without boundaries, and rmv_none_values.py can be tuned to give more optimized results.
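
One way to picture the "without boundaries" modification is as a switch on which links the crawler follows. The sketch below is not taken from testscrapy.py. It uses only the Python standard library, and the names `LinkCollector`, `extract_links`, and `same_domain_only`, as well as the example URLs, are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url, same_domain_only=True):
    """Return links found in html; optionally keep only those on base_url's domain."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    if not same_domain_only:
        return collector.links  # unbounded: follow links to any site
    domain = urlparse(base_url).netloc
    return [link for link in collector.links if urlparse(link).netloc == domain]


# Hypothetical page content and URLs, used only to show the two modes.
page = '<a href="/about">About</a> <a href="https://other.example/news">News</a>'
bounded = extract_links(page, "https://site.example/")
unbounded = extract_links(page, "https://site.example/", same_domain_only=False)
print(bounded)
print(unbounded)
```

An unbounded crawler can grow without limit, so in practice a depth or page-count cap is usually worth adding alongside a switch like this.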
