Job Scout is yet another homegrown application that monitors job sites for new postings, filters them, and sends alerts. What makes this one different is that it is built on distributed microservices. Why? Because I already had a Python script that monitored for job changes, and I wanted to experiment with and learn Temporal orchestration.
This may seem like overkill, but my hope is that this architecture will make it easier for contributors to add features in the future.
Future Ideas
- A UI
- An LLM-customized resume, tailored to each job
- An LLM filtering step (not everything can be handled by simple rules, though simple rules are fast)
Inspiration:
The system consists of several key components:
- Job-Scout: The main API and workflow orchestrator, in `job-scout/app`
- Temporal: Workflow orchestration and state management
- PostgreSQL: Primary data store
- Scraper: Scrapes sites for job postings, capturing mainly high-level information, e.g. Company, Title, Location, URL
- Duplicate Remover: Dedupes jobs that may be cross-posted or appear as repeated results
- Basic Filter: Eliminates jobs using obvious filters, such as keyword matching on Title or Company
- Job Detailer: Scrapes more detail about each job, fetching the entire description
- Advanced Filter: Further eliminates jobs based on keyword matching in the description itself
- Smart Filter: Uses AI to further eliminate jobs that don't match what the user wants
- Notifier: Sends notifications to a webhook.
```
.
├── data/                      # Data folder containing the local database (Postgres)
├── job-scout/                 # Main API and workflow orchestrator
│   ├── app/                   # Main Job-Scout app
│   │   ├── services/          # Modular services that run via workflows/activities with Temporal
│   │   │   ├── scrapers/          # Custom scrapers for each job source (LINKEDIN, INDEED, etc)
│   │   │   ├── duplicate_remover/ # Dedupes jobs (Duplicate Remover service)
│   │   │   ├── basic_filter/      # Filters jobs based on basic criteria (Basic Filter service)
│   │   │   ├── job_detailer/      # Scrapes detailed job descriptions (Job Detailer service)
│   │   │   ├── advanced_filter/   # Filters jobs based on description (Advanced Filter service)
│   │   │   ├── smart_filter/      # AI-powered filtering (Smart Filter service)
│   │   │   ├── notifier/          # Sends notifications to webhooks (Notifier service)
│   │   ├── db/                # Database layer - schemas, models, etc.
│   │   ├── workflow.py        # Main workflow logic (Temporal)
│   │   ├── activities.py      # Main activities logic (Temporal)
│   │   └── api.py             # FastAPI server for routing requests
├── config/                    # Configuration files
└── docker-compose.yml         # Service orchestration
```

Prerequisites:
- Docker and Docker Compose
- Python 3.12+
- Make (for using Makefile commands)
- Clone the repository
- Configure the search settings manually (would love to develop a UI to manage the database):
  a. Universal search settings: `app/db/search_settings.json`
  b. Scraper-specific search settings: the `get_default_settings()` class in each `scraper.py` file
- Configure the `.env` file with personal settings, like the notifier webhook, etc.
- Run the app with `make up`
- Externally hit `localhost:8001/api/v0/run` whenever you want the search to happen (e.g. every 5 mins)
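One simple way to hit the run endpoint on a schedule is a small Python loop. This is a hedged sketch, not part of the project: it assumes the endpoint accepts a plain GET on the default port shown above, and the 300-second interval is just an example:

```python
import time
import urllib.request

API_URL = "http://localhost:8001/api/v0/run"  # endpoint from the step above
INTERVAL_SECONDS = 300  # e.g. every 5 minutes

def trigger_run(url: str = API_URL) -> str:
    """Hit the run endpoint once and return the response body as text."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()

def run_forever() -> None:
    """Trigger a search every INTERVAL_SECONDS; a cron job works equally well."""
    while True:
        try:
            trigger_run()
        except OSError as exc:  # network errors shouldn't kill the loop
            print(f"run failed: {exc}")
        time.sleep(INTERVAL_SECONDS)

if __name__ == "__main__":
    run_forever()
```

A cron entry calling `curl` on the same URL is an equally simple alternative.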
- Clone the repository
- Set up environment variables:

  ```
  cp controller/.env.example controller/.env
  # Edit .env with your configuration
  ```

- Start the services:

  ```
  make up
  ```

- Run database migrations:

  ```
  make migrate
  ```

The project includes several useful make commands for development and operations:
```
make up           # Build and start all services
make down         # Stop all services
make restart      # Restart all services
make build        # Build service images
make logs         # Follow service logs
make clean        # Stop services and clean up volumes/images

make install-<svc>           # Install dependencies for a specific service
make shell-<svc>             # Open a shell for a specific service
make install-all             # Install dependencies for all services
make pkg-install-<svc>-<pkg> # Install a package in a specific service
make pkg-install-all-<pkg>   # Install a package in all services

make ruff         # Run ruff formatter and linter for all services

make connect-db   # Connect to the PostgreSQL database
```

The project uses Alembic for database migrations. Here are the available migration commands:
```
# Generate a new migration
make migrate-new
# You'll be prompted to enter a migration message
# Example: "add clip metadata table"

# Apply all pending migrations
make migrate-up

# Rollback the last migration
make migrate-down

# Show current migration status and history
make migrate-status

# Reset all migrations (WARNING: This will delete all data!)
make migrate-reset
# You'll be prompted to confirm the action

# Stamp the database with a specific revision
make migrate-stamp
# You'll be prompted to enter the revision ID
```

- Creating a new migration:
  ```
  # 1. Make your model changes in the code
  # 2. Generate a new migration
  make migrate-new
  # Enter message: "add user preferences table"
  # 3. Review the generated migration file in controller/alembic/versions/
  # 4. Apply the migration
  make migrate-up
  ```

- Rolling back changes:
  ```
  # If you need to undo the last migration
  make migrate-down
  # To check the current state
  make migrate-status
  ```

- Development workflow:
  ```
  # 1. Start fresh (WARNING: deletes all data)
  make migrate-reset
  # 2. Create new migration
  make migrate-new
  # Enter message: "add clip processing status"
  # 3. Apply migration
  make migrate-up
  # 4. Verify status
  make migrate-status
  ```

- Stamping a specific version:
  ```
  # Useful when setting up a new environment
  # or syncing with a specific database state
  make migrate-stamp
  # Enter revision: "a1b2c3d4e5f6"
  ```

Note: Always back up your database before running migration commands, especially `migrate-reset`, which will delete all data.
tbd
You can customize the API endpoint by setting the API_HOST variable:
```
make API_HOST=other-host:8001 <workflow>
```

- Create a new service directory in `services/`
- Copy the service template structure
- Add the service to `docker-compose.yml`
- Update the controller workflow and activities as needed
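A new service module can follow a uniform process-a-batch convention so the Temporal activity layer can call every stage the same way. The sketch below is hypothetical (the `run(jobs)` interface, module name, and field names are assumptions, not the actual service template):

```python
# Hypothetical skeleton for a new service module under services/.
# Assumes each service exposes a run() function that takes a list of
# job dicts and returns the (possibly reduced or enriched) list.

from typing import Any

def run(jobs: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Example stage: tag every job with this service's name."""
    for job in jobs:
        job.setdefault("processed_by", []).append("my_new_service")
    return jobs

processed = run([{"title": "Software Engineer"}])
```

Wiring the module into the controller's workflow and activities, and adding it to `docker-compose.yml`, then follows the steps listed above.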
Once the services are running, access the API documentation at:
- Swagger UI: http://localhost:8001/controller/docs
- ReDoc: http://localhost:8001/controller/redoc
- Temporal UI: http://localhost:8082
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
tbd