An intelligent recommendation system for SHL assessments using semantic search and LLM re-ranking.
| Requirement | Status | Details |
|---|---|---|
| Crawl SHL Catalog | ✅ Done | 377 Individual Test Solutions crawled |
| Build Recommendation Engine | ✅ Done | Semantic search + LLM re-ranking |
| Natural Language Query Support | ✅ Done | Text/JD input supported |
| Return 1-10 Assessments | ✅ Done | Returns top 10 relevant assessments |
| Assessment Name + URL | ✅ Done | Full metadata included |
| API Endpoint (JSON) | ✅ Done | FastAPI at /recommend |
| Web Frontend | ✅ Done | Streamlit app |
| GitHub Repository | ✅ Done | Ready for submission |
| Submission CSV | ✅ Done | submission.csv generated |
| Mean Recall@10 Evaluation | ✅ Done | 22.11% with LLM reranking |
| LLM Integration | ✅ Done | Groq Llama-3.3-70B (Free) |
| Balanced Recommendations | ✅ Done | Hard + Soft skills mix |
- Semantic Search: sentence-transformers embeddings (all-MiniLM-L6-v2)
- LLM Re-ranking: Groq Llama-3.3-70B for intelligent re-ordering
- 377 Assessments: Complete SHL Individual Test Solutions catalog
- REST API: FastAPI backend with OpenAPI docs
- Web Interface: Streamlit frontend for testing
```bash
# Clone the repository
git clone https://github.com/yourusername/shl-recommendation-engine.git
cd shl-recommendation-engine

# Create virtual environment
python -m venv .venv
.venv\Scripts\activate       # Windows
# source .venv/bin/activate  # Linux/Mac

# Install dependencies
pip install -r requirements.txt
```

For LLM re-ranking, get a free Groq API key:

- Go to https://console.groq.com and create a free account
- Open the API Keys page and click "Create API Key"
- Copy the key (it starts with `gsk_...`)

```bash
# Create .env file
echo "GROQ_API_KEY=gsk_your_key_here" > .env
```

Option A: Streamlit Frontend

```bash
streamlit run app.py
```

Opens at http://localhost:8501

Option B: FastAPI Backend

```bash
uvicorn api:app --reload
```

- API: http://localhost:8000
- Docs: http://localhost:8000/docs
```python
import requests

response = requests.post("http://localhost:8000/recommend", json={
    "query": "Looking for Java developers with team collaboration skills",
    "top_k": 10,
    "use_llm": True
})
print(response.json())
```

Example response:

```json
{
  "query": "Looking for Java developers...",
  "method": "semantic+llm_rerank",
  "recommendations": [
    {
      "rank": 1,
      "assessment_name": "Core Java (Entry Level) - New",
      "url": "https://www.shl.com/products/...",
      "test_types": ["Knowledge & Skills"],
      "reason": "Directly assesses Java programming skills"
    }
  ],
  "processing_time_ms": 250.5
}
```

```
┌─────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│   User Query    │───▶│ Semantic Search  │───▶│  LLM Re-ranking  │
│ (Job Desc/URL)  │    │ (FAISS + MiniLM) │    │ (Groq Llama-3.3) │
└─────────────────┘    └──────────────────┘    └──────────────────┘
                                │                       │
                                ▼                       ▼
                      ┌──────────────────┐    ┌──────────────────┐
                      │ Top-20 Candidates│───▶│ Top-10 Balanced  │
                      │  (by similarity) │    │ (hard+soft mix)  │
                      └──────────────────┘    └──────────────────┘
```
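The re-ranking stage (`llm/rerank.py`) presumably follows this shape; the prompt wording and function names are illustrative, and the Groq model id `llama-3.3-70b-versatile` is an assumption based on the Llama-3.3-70B mention above:

```python
# Illustrative sketch of LLM re-ranking: ask Groq's Llama-3.3-70B to
# reorder the top-20 semantic candidates into a balanced top-10.
import os


def build_rerank_prompt(query: str, candidates: list[str], top_k: int = 10) -> str:
    """Number the candidates and ask the model for a balanced top-k."""
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    return (
        f"Job requirement: {query}\n\n"
        f"Candidate assessments:\n{numbered}\n\n"
        f"Pick the {top_k} most relevant assessments, balancing technical "
        "(hard-skill) and personality/behavioral (soft-skill) tests. "
        "Answer with a comma-separated list of numbers."
    )


def rerank(query: str, candidates: list[str], top_k: int = 10) -> str:
    from groq import Groq  # pip install groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    resp = client.chat.completions.create(
        model="llama-3.3-70b-versatile",  # assumed Groq model id
        messages=[{"role": "user",
                   "content": build_rerank_prompt(query, candidates, top_k)}],
        temperature=0,
    )
    return resp.choices[0].message.content
```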
```
shl-recommendation-engine/
├── api.py                   # FastAPI backend (POST /recommend, GET /health)
├── app.py                   # Streamlit frontend
├── pipeline.py              # End-to-end pipeline
├── evaluate.py              # Evaluation script (Mean Recall@10)
├── generate_submission.py   # Generate submission CSV
├── submission.csv           # Predictions for test set
├── requirements.txt         # Dependencies
├── README.md                # Documentation
├── .env.example             # Environment template
├── data/
│   ├── catalog_enriched.json  # 377 Individual Test Solutions
│   ├── vector_store.faiss     # FAISS vector index
│   └── metadata.pkl           # Test metadata
├── src/
│   ├── crawler.py             # Web crawler for SHL catalog
│   └── vector_store.py        # Vector store implementation
└── llm/
    ├── prompt.txt             # LLM re-ranking prompt
    └── rerank.py              # Groq LLM re-ranking module
```
Mean Recall@10 on the training dataset:
| Method | Mean Recall@10 |
|---|---|
| Semantic Only | 20.56% |
| Semantic + LLM | 22.11% |
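Recall@10 is computed per query as the fraction of that query's relevant assessments appearing in the top-10 predictions, then averaged across queries. A minimal sketch (function names are mine, not necessarily those in `evaluate.py`):

```python
def recall_at_k(predicted: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of relevant assessments found in the top-k predictions."""
    if not relevant:
        return 0.0
    hits = sum(1 for p in predicted[:k] if p in relevant)
    return hits / len(relevant)


def mean_recall_at_k(runs, k: int = 10) -> float:
    """Average recall@k over (predicted_list, relevant_set) pairs."""
    return sum(recall_at_k(p, r, k) for p, r in runs) / len(runs)


# Example: 2 of 4 relevant tests retrieved -> recall 0.5
print(recall_at_k(["A", "B", "C"], {"A", "C", "X", "Y"}))  # 0.5
```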
Run evaluation:
```bash
python evaluate.py --dataset "Gen_AI Dataset.xlsx"
```

Generate submission:

```bash
python generate_submission.py --dataset "Gen_AI Dataset.xlsx" --output submission.csv
```

| Variable | Description | Required |
|---|---|---|
| `GROQ_API_KEY` | Groq API key (free at console.groq.com) | For LLM re-ranking |
- Model: `all-MiniLM-L6-v2` (384 dimensions)
- Index: FAISS `IndexFlatIP` (cosine similarity)
- Candidate pool: 20 for re-ranking, configurable
| Code | Type | Description |
|---|---|---|
| K | Knowledge & Skills | Technical/hard skills (programming, etc.) |
| P | Personality | Personality assessments (OPQ, etc.) |
| B | Behavioral | Behavioral competencies |
| A | Ability | Cognitive abilities |
| S | Simulation | Job simulations |
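The "balanced recommendations" feature suggests mixing hard-skill and soft-skill tests in the final list. One simple interleaving heuristic, using the type codes from the table above, might look like this (the function and field names are illustrative, and treating A/S as hard-side is an assumption of this sketch):

```python
# Illustrative heuristic: interleave hard-skill (K/A/S) and soft-skill
# (P/B) assessments so the final top-k covers both categories.
def balance(candidates: list[dict], top_k: int = 10) -> list[dict]:
    hard = [c for c in candidates if c["type"] in {"K", "A", "S"}]
    soft = [c for c in candidates if c["type"] in {"P", "B"}]
    out = []
    while len(out) < top_k and (hard or soft):
        if hard:
            out.append(hard.pop(0))
        if len(out) < top_k and soft:
            out.append(soft.pop(0))
    return out


cands = [{"name": f"t{i}", "type": t} for i, t in enumerate("KKKKPPBKKA")]
print([c["type"] for c in balance(cands, 5)])  # ['K', 'P', 'K', 'P', 'K']
```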
Rebuild the vector index:

```bash
python -c "from src.vector_store import SHLVectorStore; s = SHLVectorStore(); s.build_index(); s.save()"
```

Run tests:

```bash
python -m pytest tests/
```

MIT License
- SHL for the comprehensive assessment catalog
- Sentence Transformers for the embedding models
- Groq for providing free LLM API access