A real-time intelligence platform for recruiters tracking funding, acquisitions, layoffs, and executive movements in tech.

    ┌────────────────────────────────────────────────────────────────────────────┐
    │                                DATA SOURCES                                │
    ├────────────────────────────────────────────────────────────────────────────┤
    │  RSS Feeds (32)  │    SEC Form D    │   Layoffs.fyi    │   YC Directory    │
    └────────┬─────────┴────────┬─────────┴────────┬─────────┴─────────┬─────────┘
             │                  │                  │                   │
             ▼                  ▼                  ▼                   ▼
    ┌────────────────────────────────────────────────────────────────────────────┐
    │                                  PIPELINE                                  │
    │  ┌──────────┐     ┌─────────────┐     ┌────────────┐     ┌───────────┐     │
    │  │ 1. FETCH │────▶│ 2. CLASSIFY │────▶│ 3. EXTRACT │────▶│ 4. ENRICH │     │
    │  └──────────┘     └─────────────┘     └────────────┘     └───────────┘     │
    │       │                  │                  │                  │           │
    │       ▼                  ▼                  ▼                  ▼           │
    │  raw_articles     is_high_signal      kg_entities        kg_enrichment     │
    │  (2000+ articles) event_type          kg_relationships                     │
    │                   processed=1         extracted=1                          │
    └────────────────────────────────────────────────────────────────────────────┘
                                          │
                                          ▼
    ┌────────────────────────────────────────────────────────────────────────────┐
    │                              KNOWLEDGE GRAPH                               │
    │  ENTITIES: company, person, investor, group                                │
    │  RELATIONSHIPS: ACQUIRED, FUNDED_BY, HIRED_BY, DEPARTED_FROM, LAID_OFF     │
    └────────────────────────────────────────────────────────────────────────────┘
                                          │
                                          ▼
    ┌────────────────────────────────────────────────────────────────────────────┐
    │                                 DASHBOARD                                  │
    │ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
    │ │   Stats    │ │Acquisitions│ │  Funding   │ │   Hires    │ │ Departures │ │
    │ └────────────┘ └────────────┘ └────────────┘ └────────────┘ └────────────┘ │
    │ ┌──────────────────────┐ ┌──────────────────────┐ ┌──────────────────────┐ │
    │ │    Companies Page    │ │   Candidates Page    │ │      Newsletter      │ │
    │ └──────────────────────┘ └──────────────────────┘ └──────────────────────┘ │
    └────────────────────────────────────────────────────────────────────────────┘


Quick start:

    # Activate environment
    source venv/bin/activate
    # Run the daily pipeline (fetches news, extracts entities)
    python scripts/run_daily.py
    # Start the dashboard
    python scripts/kg_viewer.py
    # Open http://localhost:8000

Step 1: Fetch

- Fetches from 32 RSS feeds (`config/feeds.json`)
- Also fetches SEC Form D, Layoffs.fyi, and the YC Directory
- Stores results in the `raw_articles` table
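
As a rough illustration of the fetch step, the sketch below parses an RSS 2.0 document with the standard library and stores each item in a reduced `raw_articles` table. It is not the project's ingestion code: `parse_items`, `SAMPLE_RSS`, the column types, and the `UNIQUE` constraint on `url` are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample feed; real feeds are listed in config/feeds.json.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0"><channel><title>Example Feed</title>
<item><title>Acme raises Series A</title><link>https://example.com/a</link>
<description>Acme closed a new round.</description></item>
<item><title>BigCo acquires SmallCo</title><link>https://example.com/b</link>
<description>Deal announced today.</description></item>
</channel></rss>"""


def parse_items(rss_xml: str, source: str) -> list:
    """Extract (title, content, url, source) tuples from an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    rows = []
    for item in root.iter("item"):
        rows.append((
            item.findtext("title", default=""),
            item.findtext("description", default=""),
            item.findtext("link", default=""),
            source,
        ))
    return rows


conn = sqlite3.connect(":memory:")
# Reduced table: column names follow the README, types and UNIQUE are assumptions.
conn.execute(
    "CREATE TABLE raw_articles (id INTEGER PRIMARY KEY, title TEXT, content TEXT, "
    "url TEXT UNIQUE, source TEXT, processed INTEGER DEFAULT 0)"
)
rows = parse_items(SAMPLE_RSS, "Example Feed")
# INSERT OR IGNORE skips URLs that are already stored.
conn.executemany(
    "INSERT OR IGNORE INTO raw_articles (title, content, url, source) VALUES (?, ?, ?, ?)",
    rows,
)
conn.commit()
```

New rows start with `processed = 0`, so the classification step can pick them up afterwards.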

Step 2: Classify

- Keywords match → `event_type`: funding, acquisition, layoff, executive_move
- Sets `is_high_signal=1` for relevant articles
- Sets `processed=1` after classification
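
The keyword classification described above could look roughly like this. It is a minimal sketch only: the `KEYWORDS` lists and the `classify` function are hypothetical, since the project's actual keyword lists are not shown in this README.

```python
from typing import Optional

# Hypothetical keyword lists; the project's real keywords are not documented here.
KEYWORDS = {
    "funding": ["raises", "series a", "series b", "seed round", "funding"],
    "acquisition": ["acquires", "acquired", "acquisition", "to buy"],
    "layoff": ["layoff", "layoffs", "lays off", "job cuts"],
    "executive_move": ["appoints", "names new", "steps down", "joins as"],
}


def classify(title: str, content: str = "") -> dict:
    """Return the classification fields for one article."""
    text = f"{title} {content}".lower()
    event_type: Optional[str] = None
    # The first event type with a matching keyword wins.
    for candidate, words in KEYWORDS.items():
        if any(word in text for word in words):
            event_type = candidate
            break
    return {
        "event_type": event_type,
        "is_high_signal": 1 if event_type else 0,
        "processed": 1,  # set after classification, whether or not a keyword matched
    }
```

For example, `classify("BigCo acquires SmallCo")` yields `event_type` set to `acquisition` and `is_high_signal` set to 1, while an unrelated headline is still marked `processed=1` but keeps `is_high_signal=0`.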

Step 3: Extract

- Only processes articles WHERE `is_high_signal=1` AND `extracted=0`
- Uses an LLM (Claude) to extract entities and relationships
- Stores results in `kg_entities` and `kg_relationships`
- Sets `extracted=1` AFTER successful extraction
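
The selection and "mark only after success" behaviour above can be sketched on an in-memory database. The helper names (`pending_articles`, `run_extraction`, `flaky_extractor`) and the reduced table are assumptions for illustration; the stand-in extractor replaces the real LLM call.

```python
import sqlite3


def pending_articles(conn: sqlite3.Connection) -> list:
    """Return (id, title) rows that are high signal and not yet extracted."""
    return conn.execute(
        "SELECT id, title FROM raw_articles "
        "WHERE is_high_signal = 1 AND extracted = 0 ORDER BY id"
    ).fetchall()


def run_extraction(conn: sqlite3.Connection, extract_fn) -> int:
    """Call extract_fn per pending article; set extracted=1 only on success."""
    done = 0
    for article_id, title in pending_articles(conn):
        try:
            extract_fn(title)  # stand-in for the LLM call
        except Exception:
            continue  # leave extracted=0 so the article is retried next run
        conn.execute("UPDATE raw_articles SET extracted = 1 WHERE id = ?", (article_id,))
        done += 1
    conn.commit()
    return done


# Demo with a reduced raw_articles table (column types are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_articles (id INTEGER PRIMARY KEY, title TEXT, "
    "is_high_signal INTEGER DEFAULT 0, extracted INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO raw_articles (title, is_high_signal) VALUES (?, ?)",
    [("Acme raises Series A", 1), ("Weather report", 0), ("BigCo acquires SmallCo", 1)],
)


def flaky_extractor(title: str) -> None:
    # Hypothetical extractor that fails on one article to show the retry behaviour.
    if "BigCo" in title:
        raise RuntimeError("LLM call failed")


marked = run_extraction(conn, flaky_extractor)
```

Because the flag is written only after the extractor returns, a failed article stays in the pending set and is picked up again on the next run.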

Step 4: Enrich

- Web search adds context to entities
- Stores results in `kg_enrichment`

Database schema:

    raw_articles (
      id, title, content, url, source,
      published_at, fetched_at,
      processed,       -- 1 after classification
      is_high_signal,  -- 1 if relevant to recruiting
      event_type,      -- funding, acquisition, layoff, executive_move
      extracted        -- 1 after LLM extraction (CRITICAL!)
    )

    kg_entities (id, name, normalized_name, entity_type, attributes_json)
    kg_relationships (subject_id, predicate, object_id, confidence, context, source_url)
    kg_enrichment (entity_id, source, enrichment_json)

| Predicate | Subject | Object | Dashboard Section |
|---|---|---|---|
| `ACQUIRED` | Acquirer | Target | Recent Acquisitions |
| `FUNDED_BY` | Company | Investor | Recent Funding |
| `HIRED_BY` | Person | Company | Recent Hires |
| `DEPARTED_FROM` | Person | Company | Recent Departures |
| `LAID_OFF` | Company | employees | Layoffs section |
| `FOUNDED` | Person | Company | Available Talent |
| `CEO_OF` | Person | Company | Executive info |
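
To make the predicate table concrete, here is a small sketch that builds `kg_entities` and `kg_relationships` in memory and looks up the rows behind one dashboard section. The column names come from the schema above; the column types, the sample rows, and the `rows_for_section` helper are assumptions, and the real dashboard queries may differ.

```python
import sqlite3

# Predicate -> dashboard section, copied from the table above.
SECTION_FOR_PREDICATE = {
    "ACQUIRED": "Recent Acquisitions",
    "FUNDED_BY": "Recent Funding",
    "HIRED_BY": "Recent Hires",
    "DEPARTED_FROM": "Recent Departures",
    "LAID_OFF": "Layoffs section",
    "FOUNDED": "Available Talent",
    "CEO_OF": "Executive info",
}

conn = sqlite3.connect(":memory:")
# Column types are assumptions; only the column names come from the README.
conn.executescript("""
CREATE TABLE kg_entities (
    id INTEGER PRIMARY KEY, name TEXT, normalized_name TEXT,
    entity_type TEXT, attributes_json TEXT
);
CREATE TABLE kg_relationships (
    subject_id INTEGER, predicate TEXT, object_id INTEGER,
    confidence REAL, context TEXT, source_url TEXT
);
""")
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO kg_entities (id, name, normalized_name, entity_type) VALUES (?, ?, ?, ?)",
    [(1, "BigCo", "bigco", "company"), (2, "SmallCo", "smallco", "company")],
)
conn.execute(
    "INSERT INTO kg_relationships (subject_id, predicate, object_id, confidence) "
    "VALUES (1, 'ACQUIRED', 2, 0.9)"
)


def rows_for_section(section: str) -> list:
    """Return (subject, predicate, object) name triples for one dashboard section."""
    predicates = [p for p, s in SECTION_FOR_PREDICATE.items() if s == section]
    placeholders = ",".join("?" for _ in predicates)
    return conn.execute(
        "SELECT s.name, r.predicate, o.name FROM kg_relationships r "
        "JOIN kg_entities s ON s.id = r.subject_id "
        "JOIN kg_entities o ON o.id = r.object_id "
        f"WHERE r.predicate IN ({placeholders})",
        predicates,
    ).fetchall()
```

Each relationship row stores only entity ids, so the two joins against `kg_entities` are what turn a triple into readable subject and object names.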

| Page | URL | Data Source |
|---|---|---|
| Dashboard | `/` | All relationship types |
| Companies | `/companies` | `kg_entities WHERE entity_type='company'` |
| Candidates | `/candidates` | `kg_entities WHERE entity_type='person'` |
| Newsletter | `/newsletter` | Generated from `kg_relationships` |

Manual extraction:

    source venv/bin/activate && PYTHONPATH=. python3 << 'EOF'
    import asyncio, sqlite3
    from src.extraction.llm_extractor import LLMExtractor
    from src.knowledge_graph.graph import KnowledgeGraph

    extractor = LLMExtractor()
    kg = KnowledgeGraph()
    conn = sqlite3.connect('data/recruiter_intel.db')
    cursor = conn.cursor()
    # Select high-signal articles that have not been extracted yet
    cursor.execute("""
        SELECT id, title, content, summary FROM raw_articles
        WHERE is_high_signal = 1 AND extracted = 0
        ORDER BY published_at DESC LIMIT 50
    """)

    async def extract():
        for article_id, title, content, summary in cursor.fetchall():
            result = await extractor.extract(title, content or summary or "")
            if result.relationships:
                kg.add_extraction_result(result)
            # Mark the article as extracted once the extractor has returned
            cursor.execute("UPDATE raw_articles SET extracted = 1 WHERE id = ?", (article_id,))
        conn.commit()

    asyncio.run(extract())
    print("Done!")
    EOF

| Issue | Symptom | Fix |
|---|---|---|
| Extraction not tracked | Dashboard shows old data | Added extracted column |
| Invalid entities | HTML artifacts in entity names | _sanitize_name() validation |
| Stale RSS news | Same news repeating | Removed when:30d from feeds |
| Layoffs.fyi 404 | No layoff data | Fallback data in scraper |
| YC API 403 | No YC companies | Fallback data in scraper |
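
The README does not show what `_sanitize_name()` does, so the following is only a guess at the kind of validation that would remove HTML artifacts from entity names. The function name `sanitize_name` and its rules are hypothetical and are not the project's implementation.

```python
import html
import re
from typing import Optional

TAG_RE = re.compile(r"<[^>]+>")


def sanitize_name(raw: str) -> Optional[str]:
    """Return a cleaned entity name, or None if nothing usable remains."""
    text = html.unescape(raw)                 # decode entities such as &amp; and &nbsp;
    text = TAG_RE.sub(" ", text)              # drop HTML tags
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    # Reject empty names and names with no letters or digits.
    if not text or not re.search(r"[A-Za-z0-9]", text):
        return None
    return text
```

Under these rules `sanitize_name("<b>Acme&nbsp;Corp</b>")` returns `"Acme Corp"`, and a string made only of markup, such as `"<br/>"`, is rejected with `None`.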

    # Check unextracted articles
    sqlite3 data/recruiter_intel.db "SELECT COUNT(*) FROM raw_articles WHERE is_high_signal=1 AND extracted=0"
    # If > 0, run manual extraction above

    # Count relationships by predicate
    sqlite3 data/knowledge_graph.db "SELECT predicate, COUNT(*) FROM kg_relationships GROUP BY predicate"

    # Show per-feed article counts and last errors
    sqlite3 data/recruiter_intel.db "SELECT feed_name, total_articles, last_error FROM feed_stats ORDER BY total_articles DESC"

Project structure:

    recruiter-intelligence/
    ├── config/feeds.json           # RSS feed URLs
    ├── data/
    │   ├── recruiter_intel.db      # Articles, feed stats
    │   ├── knowledge_graph.db      # Entities, relationships
    │   └── newsletter.html         # Generated newsletter
    ├── scripts/
    │   ├── run_daily.py            # Pipeline runner
    │   └── kg_viewer.py            # Dashboard server
    ├── src/
    │   ├── ingestion/              # RSS, SEC, layoffs, YC scrapers
    │   ├── extraction/             # LLM extractor
    │   ├── knowledge_graph/        # Graph operations
    │   ├── newsletter/             # Newsletter generator
    │   └── pipeline/daily.py       # Pipeline orchestration
    └── .claude/                    # Claude context files

Example pipeline output:

    RECRUITER INTELLIGENCE PIPELINE
    ==================================================
    Articles: 300+ fetched, 50+ new
    High signal: 50+
    Relationships extracted: 50+
    KNOWLEDGE GRAPH: 1500+ entities, 300+ relationships

License: MIT