An AI-powered tool for compliance analysts to screen individuals against news articles for adverse media mentions. The system uses large language models to assess an article's credibility, extract all person entities and person-person relationships, match the analyst's person query to article entities, and assess whether the coverage is adverse.
Project Context: This was developed as a technical assessment for a compliance software role. The repository started from a Next.js + tRPC template provided by the hiring team (commit d970003). All subsequent development is original work.
- Credibility Assessment: Evaluates article reliability before performing expensive analysis
- Entity Extraction: Extracts all person entities from the article and the relationships between them (for network understanding), resolving coreferences where possible.
- Person Matching: LLM-based analysis matches individuals against article mentions with explainable confidence scores and detailed signal breakdowns.
- Adverse Media Sentiment Analysis: Identifies negative mentions with categorized risk levels (fraud, corruption, sanctions, etc.)
- Results Persistence: Saves screening results to disk (seemed fine for an MVP)
- Explainable Outputs: Every decision includes structured reasoning, evidence spans, and confidence scores
- Multi-Provider Support: Works with OpenAI, and in principle Anthropic (developed and tested against OpenAI only); the model can be configured via environment settings.
The system runs a straightforward pipeline: scrape article → assess credibility → extract entities → match person → analyse sentiment → save result.
┌────────────────────────────────────────────────────────────────────┐
│                         Browser (React UI)                         │
└──────────────────────────────┬─────────────────────────────────────┘
                               │
                               ▼
┌────────────────────────────────────────────────────────────────────┐
│                    Next.js + tRPC (Web Service)                    │
└──────────────────────────────┬─────────────────────────────────────┘
                               │
                               ▼
┌────────────────────────────────────────────────────────────────────┐
│                   FastAPI (AI Service Pipeline)                    │
│                                                                    │
│  1. Scrape Article (newspaper3k)                                   │
│  2. Check Credibility (LLM) ─────────┐                             │
│  3. Extract Entities (LLM) ──────────┤                             │
│  4. Match Person (LLM) ──────────────┼──▶ OpenAI/Anthropic         │
│  5. Analyse Sentiment (LLM) ─────────┤                             │
│  6. Save Result (file storage) ──────┘                             │
└────────────────────────────────────────────────────────────────────┘
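The pipeline above can be sketched as a simple sequential orchestration with a conservative early exit. This is an illustrative sketch only: the stage functions, signatures, and return shapes below are assumptions, not the actual service code, and each stub stands in for an LLM or scraping call.

```python
# Illustrative sketch of the screening pipeline; every stage below is a stub
# standing in for a real LLM/scraping call. Names and shapes are assumptions.

def scrape_article(url: str) -> dict:
    return {"url": url, "text": "stub article text"}

def check_credibility(article: dict) -> dict:
    return {"credible": True, "reasoning": "stub"}

def extract_entities(article: dict) -> list:
    return [{"name": "Jane Doe", "relationships": []}]

def match_person(person: dict, entities: list) -> dict:
    query = f"{person['first_name']} {person['last_name']}".lower()
    matched = any(e["name"].lower() == query for e in entities)
    return {"matched": matched, "confidence": 0.9 if matched else 0.0}

def analyse_sentiment(article: dict, match: dict) -> dict:
    return {"adverse": False, "categories": []}

def save_result(result: dict) -> dict:
    return result  # the real service persists to services/ai/results/

def run_pipeline(article_url: str, person: dict) -> dict:
    article = scrape_article(article_url)          # 1. scrape (newspaper3k)
    credibility = check_credibility(article)       # 2. LLM credibility check
    if not credibility["credible"]:
        # Conservative early exit: flag rather than analyse unreliable sources
        return save_result({"status": "flagged",
                            "reason": credibility["reasoning"]})
    entities = extract_entities(article)           # 3. entities + relationships
    match = match_person(person, entities)         # 4. match analyst query
    sentiment = (analyse_sentiment(article, match)  # 5. only if matched
                 if match["matched"] else None)
    return save_result({"credibility": credibility, "entities": entities,
                        "match": match, "sentiment": sentiment})  # 6. persist

result = run_pipeline("https://example.com/article",
                      {"first_name": "Jane", "last_name": "Doe"})
```

Note how sentiment analysis is skipped entirely when no match is found, mirroring the "(if matched)" behaviour described in the usage section.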
Key points: Everything uses LLMs with temperature=0.0 for consistency. All outputs are structured (Pydantic models) with reasoning attached for explainability. The system is conservative - when uncertain, it flags for human review rather than making assumptions. Results are saved automatically so there's an audit trail.
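The "structured outputs with reasoning" idea can be sketched with a stdlib dataclass; the real service uses Pydantic models, and the field names and the 0.7 review threshold below are illustrative assumptions.

```python
from dataclasses import dataclass, field

# Stdlib sketch of a structured, explainable match result. The actual service
# uses Pydantic; field names and the review threshold here are assumptions.

@dataclass
class MatchResult:
    matched: bool
    confidence: float                 # 0.0 - 1.0
    reasoning: str                    # explanation attached to every decision
    evidence_spans: list = field(default_factory=list)

    def needs_review(self, threshold: float = 0.7) -> bool:
        # Conservative policy: uncertain matches go to a human analyst
        return self.matched and self.confidence < threshold

r = MatchResult(matched=True, confidence=0.55,
                reasoning="Name matches; DOB unavailable in article",
                evidence_spans=["'J. Doe, 45, of London' (para 3)"])
```

A result like `r` would be flagged for human review rather than auto-accepted, since its confidence sits below the threshold.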
- Docker with Compose v2 - Install Docker Desktop
- OpenAI API Key - create one from your OpenAI account
- Make (optional) - provides shortcut commands
Docker is required: communication between the Next.js and FastAPI servers happens over the Docker network. Let me know if you encounter any issues and I can address them quickly.
Verify Docker is installed:
docker compose version # Should show v2.x.x
Run the setup script:
make setup
Or without Make:
bash scripts/setup.sh
This will:
- Check that Docker and Docker Compose v2 are installed and running
- Create `.env.secrets` template files for you to add real API keys

Add your real OpenAI API key to `services/ai/.env.secrets`. The `.env.defaults` file contains placeholder API keys; you need to override them with your real key:
# Open services/ai/.env.secrets and add your real key:
OPENAI__API_KEY=sk-your-actual-openai-api-key-here
Note: The placeholder keys in `.env.defaults` won't work - you must add your real API key to `.env.secrets`.
Build and start:
make start
Or without Make:
cd docker && docker compose build && docker compose up -d
That's it! Access at:
- Web UI: http://localhost:3000
- API Docs: http://localhost:5001/docs
- Navigate to http://localhost:3000
- Enter the article URL
- Provide person details:
- First name, last name (required)
- Middle name(s) (optional)
- Date of birth (optional, improves matching accuracy)
- Click "Screen Article"
- Review the detailed results with:
- Article credibility assessment
- Person matching analysis with confidence scores
- Adverse media sentiment (if matched)
- Click "View Results" in the navbar
- Browse all past screenings
- Click any result card to see full details
The Makefile supports flexible targeting with SERVICE and ENV variables:
# Build services
make build # Build all services
make build SERVICE=ai # Build AI service only
make build SERVICE=web # Build web service only
# Start services
make start # Start all services in production mode
make start ENV=dev # Start all services in development mode
make start SERVICE=ai # Start AI service only
make start ENV=dev SERVICE=web # Start web in development mode
# Other commands
make stop [SERVICE=all] # Stop services
make restart [SERVICE=all] # Restart services
make logs [SERVICE=all] # View logs (follow mode)
make ps # Show service status
make shell SERVICE=ai # Open shell in AI service
make shell SERVICE=web # Open shell in web service
Variables:
- `SERVICE=ai|web|all` - Target specific service (default: all)
- `ENV=prod|dev` - Target environment (default: prod)
cd docker
docker compose build # Build services
docker compose up -d # Start (detached)
docker compose down # Stop and remove
docker compose logs -f # View logs
docker compose ps # Service status
docker compose exec ai sh # Shell in AI service
docker compose exec web sh # Shell in web service
The application uses a two-file environment configuration system:
- `.env.defaults` (committed to git): Contains all configuration with sensible defaults and placeholder API keys
- `.env.secrets` (gitignored): Contains your real API keys and any overrides
This separation keeps secrets out of version control while making configuration transparent.
AI Service (services/ai/):
- `.env.defaults` - Default LLM models, temperature, log level, and placeholder API keys
- `.env.secrets` - Your real OpenAI and Anthropic API keys (override placeholders here)
Web Service (services/web/):
- `.env.defaults` - AI service URL and Node environment
- `.env.secrets` - Any environment-specific overrides (optional)
Both `config.py` (Python) and `docker-compose.yml` load `.env.defaults` first, then `.env.secrets`. Any values in `.env.secrets` override those in `.env.defaults`. This means:
- The placeholder keys in `.env.defaults` are not functional - they're just examples
- You must add your real API keys to `.env.secrets` to use the application
- You can override any other setting (model, log level, etc.) in `.env.secrets` without modifying the committed defaults
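The layering described above can be illustrated with a small stdlib sketch. This is not the project's actual loading code (which lives in `config.py` and `docker-compose.yml`); it just demonstrates the merge order, with hypothetical file contents inlined as strings.

```python
# Minimal sketch of the defaults-then-secrets layering. The parser and the
# inlined file contents are illustrative; only the merge order matters.

def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env

defaults = parse_env("OPENAI__API_KEY=sk-placeholder\nLOG_LEVEL=info")
secrets = parse_env("OPENAI__API_KEY=sk-real-key")

# Later files win: .env.secrets overrides .env.defaults, while
# anything not overridden keeps its default.
config = {**defaults, **secrets}
```

After the merge, the real key from `.env.secrets` replaces the placeholder, while `LOG_LEVEL` keeps its committed default.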
Screening results are automatically saved to services/ai/results/. This directory is gitignored and created automatically by Docker on first run. You'll start with an empty results list and build your screening history as you use the tool.
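A sketch of what that file-backed persistence might look like, consistent with the `data/` + `index.json` layout mentioned in the troubleshooting section. The helper name and index format are assumptions, and a temporary directory stands in for `services/ai/results/`.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Hypothetical file-backed result store: one JSON file per screening under
# data/, plus an index.json listing. Names and formats are assumptions.

def save_result(results_dir: Path, result: dict) -> str:
    result_id = str(uuid.uuid4())
    data_dir = results_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Write the full result as its own file...
    (data_dir / f"{result_id}.json").write_text(json.dumps(result))

    # ...and append a summary entry to the index for fast listing.
    index_path = results_dir / "index.json"
    index = json.loads(index_path.read_text()) if index_path.exists() else []
    index.append({"id": result_id})
    index_path.write_text(json.dumps(index))
    return result_id

results_dir = Path(tempfile.mkdtemp())  # stand-in for services/ai/results/
rid = save_result(results_dir, {"status": "complete"})
```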
Production Mode (default - faster UX, optimized builds):
make start
# or: docker compose up ai web
Development Mode (hot reload for code changes):
make start ENV=dev
# or: docker compose up ai web-dev
├── docker/                      # Docker configuration
│   ├── docker-compose.yml       # Service orchestration
│   ├── ai.Dockerfile            # AI service image
│   └── web.Dockerfile           # Web service image
├── services/
│   ├── ai/                      # FastAPI backend
│   │   ├── app/                 # Application code
│   │   │   ├── config.py        # Settings management
│   │   │   ├── dependencies.py  # Dependency injection
│   │   │   ├── routes/          # API endpoints
│   │   │   ├── services/        # Core pipeline stages
│   │   │   └── utils/           # Utilities
│   │   ├── results/             # Saved screening results (gitignored)
│   │   ├── .env.defaults        # Default configuration
│   │   └── pyproject.toml       # Python dependencies
│   └── web/                     # Next.js frontend
│       ├── src/
│       │   ├── app/             # Next.js App Router
│       │   ├── lib/             # Utilities
│       │   ├── server/          # tRPC server
│       │   └── types/           # TypeScript definitions
│       ├── .env.defaults        # Default configuration
│       └── package.json         # Node dependencies
├── Makefile                     # Convenience commands
└── README.md                    # This file
Once running, visit http://localhost:5001/docs for interactive API documentation (Swagger UI).
- `POST /screening/screen` - Perform a new screening
- `GET /screening/results` - List all saved results
- `GET /screening/results/{id}` - Get specific result
- `GET /health` - Health check
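A screening request from a script might look like the sketch below. The field names are assumptions based on the UI form (article URL, required first/last name, optional middle names and date of birth), not the actual schema; check http://localhost:5001/docs for the real request body.

```python
import json

# Hypothetical request body for POST /screening/screen. Field names are
# guesses from the UI form, not the service's actual schema.
payload = {
    "article_url": "https://example.com/news/fraud-case",
    "first_name": "Jane",          # required
    "last_name": "Doe",            # required
    "middle_names": None,          # optional
    "date_of_birth": "1980-01-15", # optional, improves matching accuracy
}
body = json.dumps(payload)

# You would then POST it, e.g.:
# requests.post("http://localhost:5001/screening/screen", data=body,
#               headers={"Content-Type": "application/json"})
```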
make format # Format Python code (black + isort)
make lint # Lint Python code (flake8)
make test # Run tests (pytest)
# All services
make logs
# Specific service only
make logs SERVICE=ai
make logs SERVICE=web
# Development mode logs
make logs ENV=dev
# Rebuild specific service
make rebuild SERVICE=ai
make rebuild SERVICE=web
# Rebuild all services
make rebuild
Check Docker is running:
docker ps
Check ports are available:
# On Mac/Linux
lsof -i :3000
lsof -i :5001
# On Windows
netstat -ano | findstr :3000
netstat -ano | findstr :5001
View service logs:
make logs
"Invalid API key":
- Verify your API key in `services/ai/.env.secrets`
- Check you're using the correct provider (OpenAI vs Anthropic)
"Rate limit exceeded":
- Your API account has hit rate limits
- Wait and retry, or upgrade your API plan
"Model not found":
- Verify the model name in `.env.defaults` (or override in `.env.secrets`) matches your provider
- Check your API account has access to the model
Check volume mount:
ls -la services/ai/results/
# Should show: data/ and index.json
Check permissions:
# Results directory should be writable
chmod -R 755 services/ai/results/
View storage logs:
cd docker && docker compose logs -f ai | grep -i result
Clear Next.js cache:
cd services/web
rm -rf .next
npm run build
Reinstall dependencies:
cd services/web
rm -rf node_modules package-lock.json
npm install
- Check the logs: `make logs` or `make logs SERVICE=ai`
- Verify environment variables: `cat services/ai/.env.defaults services/ai/.env.secrets`
- Restart services: `make restart`
- Rebuild from scratch: `make clean && make build && make start`
- Check specific service: `make logs SERVICE=web` or `make shell SERVICE=ai`