Job Scraping Platform

A Django-based platform for scraping and managing job listings from various sources.

Features

Multi-source job scraping (cv.ee, LinkedIn, etc.)
Job listing management and filtering
Export functionality (CSV, Excel)
Scraper management interface
Real-time job status updates
Email notifications
Telegram bot integration

Requirements

Python 3.8+
Django 3.2+
Redis
Celery
PostgreSQL (recommended) or SQLite
Chrome/Chromium (for Selenium scrapers)

Installation

Clone the repository:

git clone https://github.com/yourusername/jobs_scraping.git
cd jobs_scraping

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # Linux/macOS
venv\Scripts\activate     # Windows (use instead of the line above)

Install dependencies:

pip install -r requirements.txt

Set up environment variables:

cp .env.example .env
# Edit .env with your configuration

Run migrations:

python manage.py migrate

Create a superuser:

python manage.py createsuperuser

Start the Redis server (it runs in the foreground, so keep this terminal open):

redis-server

Start the Celery worker in a separate terminal:

celery -A config worker -l info

Start Celery beat in another terminal:

celery -A config beat -l info

Run the development server:

python manage.py runserver

Configuration

Environment Variables

Create a .env file with the following variables:

DEBUG=True
SECRET_KEY=your-secret-key
DATABASE_URL=postgresql://user:password@localhost:5432/dbname
EMAIL_HOST=smtp.gmail.com
EMAIL_PORT=587
EMAIL_HOST_USER=your-email@example.com
EMAIL_HOST_PASSWORD=your-app-password
TELEGRAM_BOT_TOKEN=your-bot-token
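
As a rough illustration of how these variables might be consumed, the sketch below reads them with the standard library only. The helper names `env_bool` and `parse_database_url` are hypothetical, and the project's actual settings module may load the values differently.

```python
import os
from urllib.parse import urlparse


def env_bool(name, default=False):
    """Interpret an environment variable such as DEBUG=True as a boolean."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def parse_database_url(url):
    """Split a DATABASE_URL into the keys Django's DATABASES setting expects."""
    parts = urlparse(url)
    return {
        # Assumption: the URL uses the postgresql:// scheme shown above.
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),
        "USER": parts.username or "",
        "PASSWORD": parts.password or "",
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


DEBUG = env_bool("DEBUG")
EMAIL_PORT = int(os.environ.get("EMAIL_PORT", "587"))
DATABASES = {
    "default": parse_database_url(
        os.environ.get(
            "DATABASE_URL",
            "postgresql://user:password@localhost:5432/dbname",
        )
    )
}
```

Parsing the URL once at startup keeps the connection details in a single variable, which is convenient when the same `.env` file is shared between the web process and the Celery worker.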

Scraper Configuration

Scraper settings can be configured in config/settings.py:

SCRAPER_CONFIG = {
    'cv_ee': {
        'base_url': 'https://www.cv.ee',
        'search_url': 'https://www.cv.ee/toopakkumised',
        'max_pages': 10,
    },
    'linkedin': {
        'base_url': 'https://www.linkedin.com',
        'search_url': 'https://www.linkedin.com/jobs/search',
        'max_pages': 5,
    },
}
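
One way a scraper could consume this configuration is to look up its entry and derive the list of search pages to visit. The sketch below is illustrative only: the `build_page_urls` helper and the `page` query parameter are assumptions, and each site's real pagination scheme may differ.

```python
from urllib.parse import urlencode

SCRAPER_CONFIG = {
    "cv_ee": {
        "base_url": "https://www.cv.ee",
        "search_url": "https://www.cv.ee/toopakkumised",
        "max_pages": 10,
    },
    "linkedin": {
        "base_url": "https://www.linkedin.com",
        "search_url": "https://www.linkedin.com/jobs/search",
        "max_pages": 5,
    },
}


def build_page_urls(name, config=SCRAPER_CONFIG):
    """Return one search URL per page, up to the scraper's max_pages."""
    try:
        settings = config[name]
    except KeyError:
        raise ValueError(f"No scraper configured under {name!r}") from None
    # Assumption: pages are selected with a 1-based ?page=N parameter.
    return [
        f"{settings['search_url']}?{urlencode({'page': page})}"
        for page in range(1, settings["max_pages"] + 1)
    ]
```

Keeping `max_pages` in the configuration rather than in the scraper code makes it possible to limit the crawl depth per source without touching the scraper itself.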

Usage

Running Scrapers

Access the scraper management interface at /scrapers/
Click "Run Now" to start a specific scraper
Use "Run All Scrapers" to start all configured scrapers

Viewing Jobs

Access the job list at /
Use filters to find specific jobs
Click on a job to view details

Exporting Data

Access the export options at /export/csv/ or /export/excel/
Download the file in your preferred format
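
For readers curious what the CSV export might produce, here is a minimal sketch using only the standard `csv` module. The field names (`title`, `company`, `location`, `url`) and the `jobs_to_csv` helper are hypothetical; the project's actual export view and job model are not shown in this README.

```python
import csv
import io

# Hypothetical column set; the real job model may expose different fields.
FIELDS = ["title", "company", "location", "url"]


def jobs_to_csv(jobs):
    """Serialize an iterable of job dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for job in jobs:
        # Missing keys become empty cells instead of raising an error.
        writer.writerow({field: job.get(field, "") for field in FIELDS})
    return buffer.getvalue()
```

In a Django view, text like this could be returned in an `HttpResponse` with `content_type="text/csv"` so that the browser downloads it as a file.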

Development

Project Structure

jobs_scraping/
├── apps/
│   ├── accounts/
│   └── scraping/
│       ├── management/
│       ├── scrapers/
│       ├── templates/
│       └── tests/
├── config/
├── logs/
├── media/
├── static/
└── templates/

Adding New Scrapers

Create a new scraper class in apps/scraping/scrapers/
Implement the required methods:
- __init__
- run
- parse_job
Add the scraper to SCRAPER_CONFIG in settings
Register the scraper in apps/scraping/tasks.py
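
The steps above can be sketched as follows. This is a minimal illustration under assumptions, not the project's actual code: `BaseScraper`, `SCRAPER_REGISTRY`, `register`, and `ExampleScraper` are hypothetical names, and `fetch_raw_jobs` stands in for whatever HTTP or Selenium logic a real scraper would use.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a SCRAPER_CONFIG key to its scraper class.
SCRAPER_REGISTRY = {}


def register(name):
    """Class decorator that records a scraper under its config key."""
    def decorator(cls):
        SCRAPER_REGISTRY[name] = cls
        return cls
    return decorator


class BaseScraper(ABC):
    def __init__(self, config):
        # config is one entry of SCRAPER_CONFIG, e.g. SCRAPER_CONFIG["cv_ee"].
        self.base_url = config["base_url"]
        self.search_url = config["search_url"]
        self.max_pages = config["max_pages"]

    @abstractmethod
    def fetch_raw_jobs(self):
        """Yield raw listings (placeholder for real HTTP/Selenium code)."""

    @abstractmethod
    def parse_job(self, raw):
        """Convert one raw listing into a normalized dict."""

    def run(self):
        """Fetch every raw listing and return the parsed jobs."""
        return [self.parse_job(raw) for raw in self.fetch_raw_jobs()]


@register("example")
class ExampleScraper(BaseScraper):
    def fetch_raw_jobs(self):
        # Static data keeps the sketch runnable without network access.
        yield {"heading": " Python Developer ", "path": "/job/1"}

    def parse_job(self, raw):
        return {
            "title": raw["heading"].strip(),
            "url": self.base_url + raw["path"],
        }
```

With a registry like this, a task in `apps/scraping/tasks.py` could look up a scraper class by its configuration key and call `run()` on it, which is one possible reading of the "register" step.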

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

