🦊 Flyfox Job Matching Engine

A machine learning-driven job matching engine that connects applicants with the most suitable job opportunities using natural language processing, text embeddings, and structured data features.

🚀 Key Features

🔄 Data Ingestion: Load applicant profiles, job descriptions, and labeled pairs
🧠 Feature Engineering: Combine text-based embeddings and structured metadata (location, experience, skills)
🎯 Model Training: Train predictive models using logistic regression, XGBoost, and LightGBM
📈 Prediction: Rank jobs for applicants or find best-fit candidates for positions
🌐 API Integration: Serve predictions via FastAPI (optional)
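
As a rough illustration of how the feature engineering and prediction steps above fit together, the sketch below builds one feature vector per applicant-job pair from a text-embedding similarity plus two structured signals, then ranks jobs by score. Everything in it is an assumption made for illustration: the function names (`pair_features`, `rank_jobs`), the field names (`location`, `years_experience`, `min_years`), the toy data, and the hand-set weights that stand in for a trained model. The repository's actual features and models may differ.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def pair_features(applicant: dict, job: dict) -> np.ndarray:
    """Hypothetical feature vector for one applicant-job pair."""
    return np.array([
        cosine_similarity(applicant["embedding"], job["embedding"]),
        1.0 if applicant["location"] == job["location"] else 0.0,
        1.0 if applicant["years_experience"] >= job["min_years"] else 0.0,
    ])


def rank_jobs(applicant: dict, jobs: dict, weights: np.ndarray) -> list:
    """Return job IDs sorted from best to worst score.

    The weighted sum is a stand-in for a trained model's predicted score.
    """
    scores = {job_id: float(pair_features(applicant, job) @ weights)
              for job_id, job in jobs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data, made up for this example.
applicant = {"embedding": np.array([1.0, 0.0]), "location": "Berlin", "years_experience": 3}
jobs = {
    "job_a": {"embedding": np.array([1.0, 0.0]), "location": "Berlin", "min_years": 2},
    "job_b": {"embedding": np.array([0.0, 1.0]), "location": "Paris", "min_years": 5},
}
weights = np.array([0.6, 0.2, 0.2])  # illustrative values, not learned
ranking = rank_jobs(applicant, jobs, weights)
print(ranking)
```

Swapping the applicant and job roles in `rank_jobs` would give the reverse lookup, ranking candidates for a position.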

🛠️ Installation

1. Clone the Repository

git clone https://github.com/theflyfoxX/flyfox-job-matching.git
cd flyfox-job-matching

2. Create Virtual Environment

# Create virtual environment
python -m venv wrangler-env

# Activate on Windows
wrangler-env\Scripts\activate

# Activate on macOS/Linux
source wrangler-env/bin/activate

3. Install Dependencies

pip install -r requirements.txt

📁 Project Structure

flyfox/
├── config.yaml                 # Central configuration
├── predict.py                  # Main prediction script
├── test.py                     # Test runner
├── requirements.txt            # Python dependencies
├── pyproject.toml             # Project metadata
│
├── data/
│   ├── raw/                   # Raw CSV files
│   │   ├── Combined_Jobs_Final.csv
│   │   ├── Experience.csv
│   │   ├── Positions_Of_Interest.csv
│   │   └── labeled_applicant_job_pairs.csv
│   ├── interim/               # Processed intermediate data
│   └── features/              # Final feature matrices
│
├── embeddings/
│   ├── jobs/                  # Job embeddings (.npy)
│   └── applicants/            # Applicant embeddings (.npy)
│
├── features/
│   └── build_features.py      # Feature engineering scripts
│
├── src/
│   ├── features/              # Feature builders
│   ├── io/                    # File I/O utilities
│   ├── models/                # Model training & evaluation
│   ├── prep/                  # Data preparation helpers
│   ├── preprocessing/         # Text/vector preprocessing
│   ├── utils/                 # Shared utilities
│   └── api/                   # FastAPI application
│
└── docker/                    # Docker configurations

🚀 Usage

Generate Predictions

Run the main prediction script:

python predict.py

Run Tests

Execute the test suite:

python test.py

📊 Data Requirements

Required Files

Place the following files in data/raw/:

Combined_Jobs_Final.csv - Job postings with descriptions and metadata
Experience.csv - Applicant work experience records
Positions_Of_Interest.csv - Applicant job preferences
labeled_applicant_job_pairs.csv - Training data with applicant-job matches
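
A quick way to confirm the files are in place before running anything is a small check like the one below. It is a minimal sketch, not part of the repository: the helper name `missing_raw_files` is hypothetical, and the default path assumes the script is run from the project root.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_RAW_FILES = (
    "Combined_Jobs_Final.csv",
    "Experience.csv",
    "Positions_Of_Interest.csv",
    "labeled_applicant_job_pairs.csv",
)


def missing_raw_files(raw_dir="data/raw"):
    """Return the required file names that are not present in raw_dir."""
    raw_path = Path(raw_dir)
    return [name for name in REQUIRED_RAW_FILES if not (raw_path / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing from data/raw/:", ", ".join(missing))
    else:
        print("All required raw files are present.")
```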

Required Embeddings

Predictions require pre-generated embeddings, each set stored as a dictionary saved in a .npy file:

embeddings/jobs/embeddings_dict.npy - Job description embeddings
embeddings/applicants/embeddings_dict.npy - Applicant profile embeddings
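
For readers unfamiliar with dictionaries stored in .npy files, the sketch below shows one way such a file can be written and read back with NumPy. It assumes each file holds a pickled Python dict mapping an ID to an embedding vector, which is an assumption about the layout rather than something confirmed here. The helper name `load_embeddings`, the keys, and the vector size are made up for the example.

```python
import os
import tempfile

import numpy as np


def load_embeddings(path):
    """Load a dict of embeddings saved with np.save.

    np.save wraps a dict in a 0-d object array, so loading needs
    allow_pickle=True and .item() to get the dict back.
    """
    return np.load(path, allow_pickle=True).item()


# Round trip with toy data; the keys and vector size here are made up.
toy = {"job_1": np.array([0.1, 0.2, 0.3]), "job_2": np.array([0.4, 0.5, 0.6])}
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "embeddings_dict.npy")
    np.save(path, toy)
    loaded = load_embeddings(path)

print(sorted(loaded))         # IDs stored in the file
print(loaded["job_1"].shape)  # dimensionality of one embedding
```

Because `allow_pickle=True` can execute arbitrary code while unpickling, only load embedding files from a source you trust.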

📦 Dependencies

Core Libraries

Data Processing: pandas, numpy, pyarrow, fastparquet
Machine Learning: scikit-learn, lightgbm, xgboost
NLP & Embeddings: sentence-transformers, transformers, torch, gensim
API: fastapi, uvicorn
Database: psycopg2-binary (PostgreSQL support)

See requirements.txt for complete list with versions.

🧪 Testing

Run the full suite through test.py, or run individual test modules with pytest:

# Run all tests
python test.py

# Run specific test modules
pytest tests/test_features.py
pytest tests/test_models.py

🔧 Configuration

Edit config.yaml to customize:

Model parameters
Feature engineering settings
API configuration
File paths and data sources

📝 Notes

Embeddings must be generated before running predictions
Ensure all required data files are present in data/raw/
The virtual environment (wrangler-env/) is excluded from version control
GPU acceleration recommended for embedding generation and model training

👤 Author

Ali Rassas

🔗 GitHub: @theflyfoxX
