This project scrapes and analyzes job listings from the Djinni job board. The scraping script collects job details, while the analysis script turns the collected data into visualizations.
- data_analysis/: Holds the scripts for data analysis.
- output_plots/: Stores all output plots.
- scrape/: Holds the scraping scripts.
- scrapped_data/: Sanctuary for the scraped job listings data.
- Selenium for web scraping, a modern-day wand.
- BeautifulSoup for the mystical art of HTML parsing.
- Pandas, the almighty deity of data manipulation.
- Logging, the ancient scrolls for recording project history.
- Clone the repository:
  git clone https://github.com/kostomeister/py-scrape-djinni.git
- Install the required potions:
  pip install -r requirements.txt
- Execute the sacred scraping script:
  python scrape/scrape.py
- Summon the main analysis script:
  python data_analysis/analysis.py
- The scraping script stores job listings data in the scrapped_data sanctuary.
- The analysis script conjures various plots and saves them in the output_plots spellbook.
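One way to check what each script has produced is to list the files in the two output folders. The sketch below is only an illustration: the helper name `list_outputs` is hypothetical, and it assumes both folders sit directly under the directory you pass in, which the project may or may not do.

```python
from pathlib import Path


def list_outputs(root: Path) -> dict[str, list[str]]:
    """Return the file names found in each output folder under `root`."""
    outputs = {}
    # Assumption: both folders live directly under `root`.
    for folder in ("scrapped_data", "output_plots"):
        directory = root / folder
        # A missing folder simply means that script has not run yet.
        outputs[folder] = (
            sorted(p.name for p in directory.iterdir() if p.is_file())
            if directory.is_dir()
            else []
        )
    return outputs


if __name__ == "__main__":
    for folder, files in list_outputs(Path(".")).items():
        print(f"{folder}: {files or 'nothing yet'}")
```

Run it from whichever directory contains the two folders; an empty list for a folder usually means the corresponding script has not been run yet.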
- Tweak the constants in config.py to alter the magic within the scraping ritual.
  - Specializations and URLs
    Define specializations and corresponding URLs for scraping:
    SPECIALIZATION_URLS = {
        "python": BASE_URL + "?primary_keyword=Python",
        # Add other specializations as needed
    }
- Specify the path from which the scraped data is read for analysis:
  PATH_TO_DATA = "scrapped_data/python_all.csv"
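The configuration constants above could be laid out roughly as follows. This is a sketch rather than the project's actual config.py: the value of `BASE_URL` is an assumption (the README does not show it), the helper `specialization_url` is hypothetical, and the `"java"` entry is there only to illustrate adding another specialization.

```python
from urllib.parse import urlencode

# Assumed value: the README does not show how BASE_URL is defined.
BASE_URL = "https://djinni.co/jobs/"


def specialization_url(keyword: str) -> str:
    """Build a listing URL filtered by the primary_keyword query parameter."""
    return BASE_URL + "?" + urlencode({"primary_keyword": keyword})


SPECIALIZATION_URLS = {
    "python": specialization_url("Python"),
    # Hypothetical extra entry, shown only to illustrate the pattern.
    "java": specialization_url("Java"),
}

# Path the analysis script reads from (value taken from the README).
PATH_TO_DATA = "scrapped_data/python_all.csv"
```

Building the query string with `urlencode` instead of string concatenation keeps keywords that contain spaces or symbols correctly escaped.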
- Embark on quests for additional analyses or visualizations based on your mystical inclinations.
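As a starting point for an extra analysis, here is a minimal sketch that counts listings per value of one column. It uses only the standard library `csv` module as a stand-in for Pandas, and the column names and sample rows are invented for illustration; the real columns in the scraped CSV may differ.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column: str) -> Counter:
    """Count how many rows share each value of `column`, skipping blanks."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column].strip() for row in reader if row.get(column))


# Tiny in-memory stand-in for scrapped_data/python_all.csv; rows are invented.
sample = io.StringIO(
    "title,experience\n"
    "Python Developer,2 years\n"
    "Backend Engineer,2 years\n"
    "Data Engineer,5 years\n"
)
counts = count_by_column(sample, "experience")
print(counts.most_common())
```

To run it against real data, pass an open file handle for the CSV instead of the sample. With Pandas, the rough equivalent would be calling `value_counts()` on the chosen column.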
- kostomeister
Special thanks to the Djinni job board for providing the enchanted job listings data.
Feel free to contribute, suggest improvements, or report issues! May your code be ever magical! ✨