A production-style backend service built with FastAPI that accepts raw text, performs sentiment analysis using a pretrained NLP model, and stores all requests and results in PostgreSQL.
This project is designed to demonstrate backend engineering skills with AI as a component, not as a research experiment.
Many AI demos focus on model accuracy or live entirely in notebooks, ignoring system concerns such as persistence, observability, and API design.
This service focuses on the end-to-end backend workflow:
- Accept input via an HTTP API
- Perform AI inference
- Return structured output
- Persist results reliably
- Provide basic observability for debugging
```
Client
  |
  |  POST /analyze
  v
FastAPI API Layer
  |
  |-- Input validation (Pydantic)
  |-- Request ID middleware
  |
  v
Inference Service
  |
  |-- Pretrained sentiment model
  |
  v
PostgreSQL Database
  |
  |-- Stores input, output, metadata
```
Key design principle: AI is treated as a replaceable component, not the core system.
- Language: Python
- API Framework: FastAPI
- Database: PostgreSQL
- ORM: SQLAlchemy
- AI Model: Pretrained Hugging Face sentiment model
- Environment Config: dotenv
- Observability: Structured logging + request IDs
`POST /analyze`

Request body:

```json
{
  "text": "The app crashes and support never replies"
}
```

Response:

```json
{
  "sentiment": "negative",
  "confidence": 0.98
}
```

Each response includes an `X-Request-ID` header for tracing.
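The input validation performed at the API layer can be expressed as Pydantic field constraints. The exact limits below are illustrative, not the project's actual values:

```python
from pydantic import BaseModel, Field, ValidationError

class AnalyzeRequest(BaseModel):
    # Reject empty or oversized inputs before they ever reach the model.
    text: str = Field(..., min_length=1, max_length=10_000)

try:
    AnalyzeRequest(text="")
except ValidationError:
    print("rejected: empty text")
```

FastAPI turns such a `ValidationError` into a structured `422` response automatically.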
analyses Table
| Field | Purpose |
| ---------- | ---------------------- |
| id | Primary key |
| input_text | Original user input |
| sentiment | Model prediction |
| confidence | Model confidence score |
| model_name | AI model identifier |
| created_at | Timestamp |
All requests are persisted to support auditing, debugging, and future analytics.
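Under SQLAlchemy, the table above might map to a declarative model like this. The column types and lengths are a plausible guess, not the project's exact schema:

```python
from datetime import datetime, timezone

from sqlalchemy import Column, DateTime, Float, Integer, String, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Analysis(Base):
    __tablename__ = "analyses"

    id = Column(Integer, primary_key=True)
    input_text = Column(Text, nullable=False)          # original user input
    sentiment = Column(String(16), nullable=False)     # model prediction
    confidence = Column(Float, nullable=False)         # model confidence score
    model_name = Column(String(128), nullable=False)   # AI model identifier
    created_at = Column(
        DateTime(timezone=True),
        default=lambda: datetime.now(timezone.utc),    # timestamp
    )
```

Storing `model_name` alongside each result keeps predictions auditable even after the model is swapped.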
- Uses a pretrained sentiment analysis model
- Model is loaded once at application startup
- Inference is isolated in a dedicated service module
The confidence score reflects the model’s internal probability, not ground truth correctness. Results may vary on domain-specific or ambiguous inputs.
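A sketch of the isolated inference module, with the pipeline injected so the heavyweight model is constructed exactly once. The real service presumably wraps `transformers.pipeline("sentiment-analysis")`; the stub pipeline below only illustrates the interface:

```python
class SentimentService:
    """Wraps a Hugging Face-style sentiment pipeline behind a narrow interface."""

    def __init__(self, pipeline_factory):
        # The factory runs once, at startup, so the model is loaded a single time.
        self._pipeline = pipeline_factory()

    def analyze(self, text: str) -> dict:
        # HF sentiment pipelines return results like [{"label": "NEGATIVE", "score": 0.98}].
        prediction = self._pipeline(text)[0]
        return {
            "sentiment": prediction["label"].lower(),
            "confidence": round(prediction["score"], 4),
        }

# In production (assumption): SentimentService(lambda: pipeline("sentiment-analysis"))
# For illustration, a stub pipeline with the same return shape:
service = SentimentService(lambda: (lambda text: [{"label": "NEGATIVE", "score": 0.98}]))
print(service.analyze("The app crashes and support never replies"))
```

Injecting the factory keeps the service testable without downloading model weights, which is also what makes the model a replaceable component.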
- Each request is assigned a unique request ID
- Request IDs are:
  - logged at request entry
  - logged after inference
  - returned in response headers

This enables request-level tracing and debugging in distributed systems.
- Single-process inference (no background workers)
- Cold-start latency due to model loading
- No rate limiting or authentication
- No database migrations (manual table creation)
These tradeoffs were made intentionally to keep the system focused and understandable.
- Alembic migrations for schema evolution
- Background inference workers
- Rate limiting and authentication
- Model versioning and A/B testing
- Caching frequent requests
This project demonstrates how to build a real backend service that integrates AI responsibly, with attention to architecture, persistence, and operational concerns.