Try the LinkedIn Sourcing Agent App on Streamlit
A professional-grade, enterprise-ready LinkedIn candidate sourcing and outreach automation system built for the Synapse AI Hackathon.
Transform your recruitment workflow with intelligent candidate discovery, automated scoring, personalized outreach generation, and seamless Excel/Google Sheets integration.
- Multi-source candidate discovery (LinkedIn, GitHub, Stack Overflow)
- Advanced search with location, skills, and experience filters
- Real-time candidate profile enrichment
- Duplicate detection and data deduplication
- AI-powered candidate fit scoring (0-100 scale)
- Customizable scoring rubrics for different roles
- Technical skills assessment
- Experience relevance analysis
- Confidence level indicators
- GPT-4 powered personalized message generation
- Template-based fallback system
- Role-specific messaging strategies
- Multi-channel outreach support (LinkedIn, Email)
- Excel Export: Multi-sheet workbooks with formatted data
- Google Sheets: Real-time collaborative spreadsheets
- JSON/CSV: Raw data for custom integrations
- Clean company names (no technical clutter)
- Complete LinkedIn URLs for easy access
- Rate limiting and API quota management
- Intelligent caching system
- Comprehensive logging and monitoring
- Error handling and recovery
- Scalable architecture
# Clone the repository
git clone <your-repo-url>
cd AL
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your API keys

Create your .env file:
# AI Configuration
GEMINI_API_KEY=your_gemini_api_key_here
OPENAI_API_KEY=your_openai_api_key_here # Optional
# LinkedIn APIs (Optional - will use demo data if not provided)
RAPIDAPI_KEY=your_rapidapi_key_here
# Environment
ENVIRONMENT=development
# Export Configuration
GOOGLE_SHEETS_SERVICE_ACCOUNT=service_account.json # Optional

# Search for candidates and export to Excel
python -m linkedin_sourcing_agent.cli.main search \
--query "Python Developer" \
--location "San Francisco" \
--limit 10 \
--format excel
# Export to Google Sheets
python -m linkedin_sourcing_agent.cli.main search \
--query "Machine Learning Engineer" \
--location "New York" \
--limit 15 \
--format sheets \
--sheets-name "ML_Engineers_2025"

# Start the web server
python api_server.py
# Server runs on http://localhost:8000
# API documentation: http://localhost:8000/docs

from linkedin_sourcing_agent import LinkedInSourcingAgent
# Initialize the agent
agent = LinkedInSourcingAgent()
# Search for candidates
results = agent.search_candidates(
    query="Senior Frontend Developer",
    location="San Francisco, CA",
    limit=20
)
# Export to Excel
agent.export_manager.export_to_excel(
    results,
    "frontend_developers.xlsx"
)

| Endpoint | Method | Description |
|---|---|---|
| / | GET | Health check and system info |
| /health | GET | System health status |
| /demo | GET | Demo data for testing |
| /source-candidates | POST | Search and score candidates |
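
For callers who prefer Python over curl, the POST request to /source-candidates could be built with the standard library alone, roughly as sketched here. The payload fields mirror the curl example in this README. The helper names `build_source_request` and `source_candidates` are hypothetical, and the assumption that the server replies with a JSON body is not confirmed by the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default server address from this README


def build_source_request(query, location, limit=10, job_description=None,
                         export_excel=False, base_url=BASE_URL):
    """Build (but do not send) a POST request for /source-candidates."""
    payload = {
        "query": query,
        "location": location,
        "limit": limit,
        "export_excel": export_excel,
    }
    # Only include the optional field when the caller supplies it.
    if job_description is not None:
        payload["job_description"] = job_description
    return urllib.request.Request(
        url=f"{base_url}/source-candidates",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def source_candidates(**kwargs):
    """Send the request and decode the response (assumed to be JSON)."""
    request = build_source_request(**kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a running server.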
# Search for candidates via API
curl -X POST "http://localhost:8000/source-candidates" \
-H "Content-Type: application/json" \
-d '{
  "query": "DevOps Engineer",
  "location": "Seattle",
  "limit": 10,
  "job_description": "Looking for DevOps engineers with Kubernetes experience",
  "export_excel": true
}'

All outputs are organized in the outputs/ directory:
outputs/
├── search_results/ # Raw search results
├── processed_candidates/ # Scored and processed data
├── excel_exports/ # Excel workbooks (.xlsx)
├── json_data/ # JSON format exports
└── README.md # Output documentation
Each Excel file contains multiple sheets:
- Candidates: Main candidate data with clean company names
- Contact_Info: Contact details and LinkedIn URLs
- Experience_Education: Professional background
- Skills_Scoring: Technical skills and fit scores
- Multi_Source_Data: GitHub, Twitter, additional profiles
- Generated_Messages: Personalized outreach messages
- Analytics: Search and scoring analytics
- Summary: Executive summary and statistics
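
Since exports accumulate under outputs/excel_exports/, a small helper for locating the newest workbook can be handy. The following is a minimal sketch that relies only on the directory layout described above; the function name `latest_excel_export` is hypothetical and not part of the package, and it picks the file by modification time rather than by parsing the file name.

```python
from pathlib import Path


def latest_excel_export(outputs_dir="outputs"):
    """Return the most recently modified .xlsx under excel_exports/, or None."""
    export_dir = Path(outputs_dir) / "excel_exports"
    if not export_dir.is_dir():
        return None
    workbooks = [p for p in export_dir.glob("*.xlsx") if p.is_file()]
    if not workbooks:
        return None
    # Modification time is used instead of the file name, so the helper
    # does not depend on any particular naming scheme.
    return max(workbooks, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    newest = latest_excel_export()
    print(newest if newest else "No Excel exports found yet")
```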
# Create custom scoring configuration
scoring_config = {
    "technical_skills": {
        "weight": 0.4,
        "required_skills": ["Python", "React", "AWS"],
        "bonus_skills": ["Docker", "Kubernetes"]
    },
    "experience": {
        "weight": 0.3,
        "min_years": 3,
        "relevant_industries": ["Tech", "Fintech"]
    },
    "education": {
        "weight": 0.2,
        "preferred_degrees": ["Computer Science", "Engineering"]
    },
    "location": {
        "weight": 0.1,
        "preferred_locations": ["San Francisco", "New York"]
    }
}

- Create a Google Cloud Service Account
- Download the JSON credentials file
- Save as service_account.json in the project root
- Share your target Google Sheet with the service account email

See GOOGLE_SHEETS_SETUP.md for detailed instructions.
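
Returning to the custom scoring configuration shown earlier: this README does not say how the weights are combined, so the following is only one plausible reading, a weighted average of per-category subscores mapped onto the 0-100 fit scale. The functions `skills_subscore` and `combine_scores`, and the rule that each bonus skill adds 0.1, are assumptions made for illustration and are not the package's actual scorer.

```python
def skills_subscore(candidate_skills, required_skills, bonus_skills):
    """Fraction of required skills matched, plus a small capped bonus."""
    have = {s.lower() for s in candidate_skills}
    required_hits = sum(1 for s in required_skills if s.lower() in have)
    bonus_hits = sum(1 for s in bonus_skills if s.lower() in have)
    base = required_hits / len(required_skills) if required_skills else 0.0
    # Hypothetical rule: each bonus skill adds 0.1, total capped at 1.0.
    return min(1.0, base + 0.1 * bonus_hits)


def combine_scores(scoring_config, subscores):
    """Weighted average of subscores (each in [0, 1]) on a 0-100 scale."""
    total_weight = sum(section["weight"] for section in scoring_config.values())
    if total_weight <= 0:
        raise ValueError("scoring weights must sum to a positive number")
    weighted = sum(
        section["weight"] * subscores.get(name, 0.0)
        for name, section in scoring_config.items()
    )
    # Dividing by the total keeps the result in range even if the
    # weights do not sum exactly to 1.
    return round(100 * weighted / total_weight)


if __name__ == "__main__":
    config = {
        "technical_skills": {"weight": 0.4},
        "experience": {"weight": 0.3},
        "education": {"weight": 0.2},
        "location": {"weight": 0.1},
    }
    subscores = {
        "technical_skills": skills_subscore(
            ["Python", "AWS", "Docker"],
            ["Python", "React", "AWS"],
            ["Docker", "Kubernetes"],
        ),
        "experience": 0.5,
        "education": 1.0,
        "location": 0.0,
    }
    print(combine_scores(config, subscores))
```

Normalizing by the total weight means a rubric whose weights were edited by hand still yields a score between 0 and 100.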
| Name | Company | LinkedIn_URL | Fit_Score | Title |
|---|---|---|---|---|
| Sarah Chen | Google | https://linkedin.com/in/sarah-chen-ml | 92 | Senior ML Engineer |
| Marcus Rodriguez | Meta | https://linkedin.com/in/marcus-rodriguez | 88 | Staff Software Engineer |
| Emma Thompson | Figma | https://linkedin.com/in/emma-thompson-frontend | 85 | Frontend Architect |
{
  "job_id": "job_1751295835",
  "candidates_found": 15,
  "processing_time_seconds": 2.34,
  "top_candidates": [
    {
      "name": "Sarah Chen",
      "linkedin_url": "https://linkedin.com/in/sarah-chen-ml",
      "headline": "Senior Machine Learning Engineer at Google",
      "location": "Mountain View, CA",
      "fit_score": 92,
      "confidence": "high",
      "key_characteristics": ["Python", "TensorFlow", "MLOps"],
      "outreach_message": "Hi Sarah, I was impressed by your ML work at Google..."
    }
  ],
  "excel_file": "outputs/excel_exports/search_ML_Engineer_20250630_160038.xlsx"
}

linkedin_sourcing_agent/
├── core/ # Core agent logic
├── scrapers/ # Data collection modules
├── scoring/ # Candidate scoring system
├── generators/ # Outreach message generation
├── utils/ # Utilities and helpers
├── cli/ # Command-line interface
├── config/ # Configuration management
├── examples/ # Usage examples
├── tests/ # Test suite
└── docs/ # Documentation
# Run all tests
python -m pytest
# Run specific test files
python -m pytest tests/test_linkedin_agent.py
# Run with coverage
python -m pytest --cov=linkedin_sourcing_agent

# Test API endpoints
python test_api.py
# Test Excel export functionality
python test_excel_api.py

| Command | Description | Example |
|---|---|---|
| search | Search for candidates | search --query "Python Dev" --limit 20 |
| process | Process existing candidate data | process --input candidates.json |
| export | Export data to various formats | export --format excel --input data.json |
| setup | Initial setup and configuration | setup --create-config |
| validate | Validate configuration and APIs | validate --check-apis |
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Make your changes
- Run tests (python -m pytest)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Documentation: Check the docs/ directory
- FAQ: See TECHNICAL_DOCS.md
- Troubleshooting: Check WHERE_ARE_MY_OUTPUTS.md
- Google Sheets: See GOOGLE_SHEETS_SETUP.md
- Setup Guide: See FREE_SETUP_GUIDE.md
This project is designed specifically for the Synapse AI Hackathon, with:
- Professional package structure
- Complete API documentation
- Easy demo and testing
- Production-ready features
- Comprehensive export options
- Clean, judge-worthy codebase
# Start the demo in 30 seconds
python api_server.py &
curl http://localhost:8000/demo

- Rate Limiting: 20 requests/minute for LinkedIn APIs
- Caching: Intelligent candidate data caching
- Batch Processing: Handle 100+ candidates efficiently
- Memory Management: Optimized for large datasets
- Error Recovery: Automatic retry with exponential backoff
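
To make the rate-limiting and retry items above concrete, here is a minimal sketch of a sliding-window limiter (defaulting to the 20 requests per minute quoted above) and a retry helper with exponential backoff. This is not the project's actual implementation, which this README does not show; the names `SlidingWindowLimiter` and `retry_with_backoff`, and the default delays, are assumptions chosen for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls within any rolling window of `period` seconds."""

    def __init__(self, max_calls=20, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, factor=2.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func(); after failure n, wait base_delay * factor**n and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * factor ** attempt)
```

A caller would typically invoke `limiter.acquire()` before each outbound request and wrap the request itself in `retry_with_backoff`. The injectable `clock` and `sleep` parameters exist so that both helpers can be tested without real waiting.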
Built with ❤️ for the Synapse AI Hackathon
Transform your recruitment process with AI-powered candidate sourcing and outreach automation.