Prema Review Intelligence is a lightweight but production-ready prototype for automated analysis of large e-commerce review datasets.
It ingests raw CSV/JSON exports, stores them in SQLite, computes sentiment & thematic insights, exposes a FastAPI backend, and ships with a Streamlit dashboard for interactive exploration.
This project is part of the Prema Vision AI Automations portfolio.
- 📥 Ingest raw review files (CSV/JSON) with dataset metadata
- 🧹 Validate files (size, type, row limits) for safe ingestion
- 💾 Store reviews in SQLite using SQLModel
- 🧠 Analyze sentiment, themes, and summary statistics
- ⚡ Cache heavy analyses to avoid recomputation
- 🔌 Expose a clean API with FastAPI
- 📊 Interactive dashboard built with Streamlit
- 🧪 Testable architecture with sample data and simple unit tests
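The validation step in the list above (size, type, row limits) could look roughly like the following. This is a minimal sketch, not the project's actual code: `validate_upload`, `ValidationError`, and the constant names are hypothetical, and the limits simply mirror the documented defaults, assuming 10MB means 10 * 1024 * 1024 bytes.

```python
import csv
import io
import json
import os

# Assumed limits, mirroring the documented defaults (hypothetical names).
MAX_UPLOAD_SIZE = 10 * 1024 * 1024  # bytes; assumes "10MB" means 10 MiB
MAX_INGESTION_ROWS = 50_000
ALLOWED_EXTENSIONS = {".csv", ".json"}


class ValidationError(ValueError):
    """Raised when an upload fails a safety check."""


def validate_upload(filename, payload, max_size=MAX_UPLOAD_SIZE,
                    max_rows=MAX_INGESTION_ROWS):
    """Return the row count if the upload passes every check, else raise."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValidationError(f"unsupported file type: {ext or 'none'}")
    if len(payload) > max_size:
        raise ValidationError("file exceeds the size limit")

    text = payload.decode("utf-8")
    if ext == ".csv":
        # DictReader consumes the header row, so only data rows are counted.
        rows = sum(1 for _ in csv.DictReader(io.StringIO(text)))
    else:
        data = json.loads(text)
        if not isinstance(data, list):
            raise ValidationError("JSON upload must be a list of reviews")
        rows = len(data)

    if rows > max_rows:
        raise ValidationError("file exceeds the row limit")
    return rows
```

Checking the extension and byte size before parsing keeps oversized or unexpected files from ever reaching the CSV/JSON parser.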
app/
  analysis/    # Stats + theme extraction + LLM abstraction
  api/         # FastAPI routes, dependencies, middleware
  core/        # Settings, logging, environment config
  db/          # SQLModel models and DB engine
  ingestion/   # CSV/JSON ingestion pipeline
  schemas/     # Pydantic DTOs for API transport
  services/    # Orchestrators for datasets & analyses
dashboard/
  app.py       # Streamlit UI powered by the API
data/samples/  # Example datasets for demos
scripts/       # Helper scripts (e.g., ingest sample data)
tests/         # Lightweight unit tests
poetry install
poetry run python scripts/ingest_sample.py
poetry run uvicorn app.main:app --reload
poetry run streamlit run dashboard/app.py
Default dashboard API target:
http://localhost:8000
Override via: REVIEW_API_BASE_URL
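As an illustration of how a client such as the dashboard might honor this override, here is a minimal sketch. The helper names `resolve_api_base_url` and `build_url` are hypothetical and not taken from the project; only the environment variable name and the default URL come from this README.

```python
import os
from urllib.parse import urlencode

DEFAULT_API_BASE_URL = "http://localhost:8000"


def resolve_api_base_url(env=None):
    """Prefer REVIEW_API_BASE_URL when set and non-empty, else the default."""
    env = os.environ if env is None else env
    url = (env.get("REVIEW_API_BASE_URL") or "").strip()
    return (url or DEFAULT_API_BASE_URL).rstrip("/")


def build_url(base_url, path, **params):
    """Join base URL and path, appending any query parameters that are not None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = f"{base_url}/{path.lstrip('/')}"
    return f"{url}?{query}" if query else url
```

For example, `build_url(resolve_api_base_url(), "/health")` yields the health-check URL for whichever backend the environment points at.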
- GET /health
- POST /datasets: upload CSV/JSON and ingest
- GET /datasets: list datasets
- GET /datasets/{id}/summary?force={bool}: stats, sentiment, themes
- GET /datasets/{id}/themes: structured theme extraction
- GET /reviews: filterable review list
  - params: dataset, rating range, paging
Interactive docs:
http://localhost:8000/docs
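The `force` flag on the summary endpoint suggests that cached results can be bypassed on request. One plausible shape for that behavior is sketched below; `AnalysisCache` is a hypothetical in-memory helper, and the project's real caching layer may work differently (for example, by persisting results in the database).

```python
from typing import Any, Callable, Dict


class AnalysisCache:
    """In-memory cache keyed by dataset id (hypothetical helper)."""

    def __init__(self) -> None:
        self._store: Dict[int, Any] = {}

    def get_or_compute(self, dataset_id: int,
                       compute: Callable[[int], Any],
                       force: bool = False) -> Any:
        # Recompute when forced or when nothing is cached yet for this dataset.
        if force or dataset_id not in self._store:
            self._store[dataset_id] = compute(dataset_id)
        return self._store[dataset_id]
```

With this shape, a route handler would pass its `force` query parameter straight through, so repeated requests reuse the stored summary unless the caller explicitly asks for a refresh.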
This project includes a practical security foundation:
- ✅ File type/size/content validation
- ✅ DoS protection via row & size limits
- ✅ CORS configuration via environment
- ✅ Rate limiting for LLM calls
- ✅ API key validation
- ✅ Input sanitization (Pydantic)
- ✅ Controlled DB access via SQLModel & engine pooling
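To make the LLM rate limiting item concrete, here is a minimal sliding-window limiter. It is a sketch under assumptions: the class name and interface are hypothetical, and the project may use a different algorithm (such as a token bucket) or an existing library.

```python
import time
from collections import deque


class SlidingWindowRateLimiter:
    """Allow at most max_calls within any window of `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable so tests can control time
        self._calls = deque()    # timestamps of accepted calls, oldest first

    def allow(self):
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            self._calls.append(now)
            return True
        return False
```

A caller would check `limiter.allow()` before each LLM request and either wait or return an error when it is `False`; constructing it with `max_calls=60` would correspond to the documented default of 60 requests per minute.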
For full guidance and deployment notes, see:
SECURITY.md
Defined via .env (see .env.example):
- DATABASE_URL: default sqlite:///./data/app.db
- MAX_THEMES: number of returned themes
- LLM_PROVIDER / OPENAI_API_KEY: optional LLM integration
- CORS_ALLOWED_ORIGINS
- MAX_UPLOAD_SIZE: default 10MB
- MAX_INGESTION_ROWS: DoS protection (default: 50,000)
- LLM_RATE_LIMIT_RPM: default 60
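A settings loader for these variables might look like the sketch below. It uses only the standard library to stay self-contained; the `Settings` dataclass and `load_settings` function are hypothetical, and two details are assumptions rather than documented behavior: that MAX_UPLOAD_SIZE is expressed in bytes (with 10MB taken as 10 * 1024 * 1024) and that CORS_ALLOWED_ORIGINS is a comma-separated list.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    max_upload_size: int        # bytes (assumed unit)
    max_ingestion_rows: int
    llm_rate_limit_rpm: int
    cors_allowed_origins: tuple


def load_settings(env=None):
    """Build Settings from environment variables, applying documented defaults."""
    env = os.environ if env is None else env
    origins = env.get("CORS_ALLOWED_ORIGINS", "")
    return Settings(
        database_url=env.get("DATABASE_URL", "sqlite:///./data/app.db"),
        max_upload_size=int(env.get("MAX_UPLOAD_SIZE", 10 * 1024 * 1024)),
        max_ingestion_rows=int(env.get("MAX_INGESTION_ROWS", 50_000)),
        llm_rate_limit_rpm=int(env.get("LLM_RATE_LIMIT_RPM", 60)),
        # Assumed format: comma-separated origins, blanks ignored.
        cors_allowed_origins=tuple(
            o.strip() for o in origins.split(",") if o.strip()
        ),
    )
```

Passing a plain dict as `env` makes the loader easy to exercise in tests without touching the real process environment.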
poetry run black .
poetry run isort .
poetry run ruff check .
poetry run pytest
Comprehensive e2e tests using Playwright are available in tests/e2e/:
# Install Playwright browsers (first time only)
poetry run playwright install chromium
# Run all e2e tests
poetry run pytest tests/e2e/ -v
# Run specific test categories
poetry run pytest tests/e2e/test_api_endpoints.py -v # API tests
poetry run pytest tests/e2e/test_dashboard_ui.py -v # UI tests
poetry run pytest tests/e2e/test_error_handling.py -v # Error handling
See tests/e2e/README.md for detailed documentation.
- Swap MockLLMClient for OpenAI/Anthropic
- Improve clustering & embeddings
- Add product-level insights (pricing, defect themes)
- Connect ingestion to live sources (Amazon, Shopify)
- Add multi-dataset comparison & trend analysis
Issues and PRs welcome.
Part of the Prema Vision internal AI automation suite.
MIT License.