CS 5588 — Week 4 Capstone Module: Drug Label Evidence RAG System
Team: Salman Mirza, Amy Ngo, Nithin Songala
TruPharma is a Retrieval-Augmented Generation (RAG) application that answers drug-label questions using official FDA data from the openFDA Drug Label API. The system fetches real-time drug labeling records, indexes them with hybrid retrieval (dense + sparse), and generates grounded answers with evidence citations.
| Persona | Example Task |
|---|---|
| Pharmacist | "What dosage of acetaminophen is recommended and what are the warnings?" |
| Clinician | "What drug interactions should I know about for ibuprofen?" |
| Patient | "I take aspirin daily — when should I stop use?" |
TruPharma shortens time-to-answer and raises trust by returning an evidence pack (drug-label sections) alongside a citation-enforced grounded answer, and by refusing to answer when the retrieved evidence is insufficient.
```
┌──────────────────────────────────────────────────────────┐
│                 Streamlit UI (Frontend)                  │
│     Query Input · Response · Evidence · Metrics/Logs     │
└────────────────────────┬─────────────────────────────────┘
                         │
                         ▼
┌──────────────────────────────────────────────────────────┐
│                RAG Engine (rag_engine.py)                │
│                                                          │
│  1. Build openFDA search query from user text            │
│  2. Fetch drug label records via openFDA API             │
│  3. Chunk text fields (10 selected label sections)       │
│  4. Index: FAISS (dense) + BM25 (sparse)                 │
│  5. Hybrid retrieval with reciprocal rank fusion         │
│  6. Generate answer (Gemini LLM or extractive fallback)  │
│  7. Log interaction to CSV                               │
└────────┬──────────────────────────┬──────────────────────┘
         │                          │
         ▼                          ▼
┌─────────────────┐   ┌───────────────────────────┐
│   openFDA API   │   │  Google Gemini 2.0 Flash  │
│  (Drug Labels)  │   │  (Optional LLM grounding) │
└─────────────────┘   └───────────────────────────┘
         │
         ▼
┌──────────────────────────────────────────────────────────┐
│                 logs/product_metrics.csv                 │
│ timestamp · query · latency · evidence_ids · confidence  │
└──────────────────────────────────────────────────────────┘
```
- User enters a drug-related question in the Streamlit UI
- `rag_engine.py` converts the question into an openFDA API search query
- Relevant drug label records are fetched in real time from FDA servers
- Text is chunked and indexed using dual retrieval (FAISS inner-product + BM25)
- Top-K evidence is retrieved via hybrid fusion (dense + sparse)
- A grounded answer is generated with citations (Gemini LLM or extractive fallback)
- The interaction is logged to `logs/product_metrics.csv`
- Results displayed: answer, evidence artifacts, latency metrics, and logs
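The hybrid fusion step above can be sketched as reciprocal rank fusion over the two ranked ID lists returned by FAISS and BM25. This is a minimal illustration, not the repo's actual code: the function name, the list shapes, and the conventional smoothing constant `k=60` are assumptions.

```python
def reciprocal_rank_fusion(dense_ids, sparse_ids, k=60, top_k=5):
    """Merge two ranked lists of chunk IDs via reciprocal rank fusion.

    Each chunk scores 1 / (k + rank + 1) in every list it appears in;
    chunks ranked highly by both retrievers accumulate the most score.
    """
    scores = {}
    for ranked in (dense_ids, sparse_ids):
        for rank, chunk_id in enumerate(ranked):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first; keep the top_k chunk IDs
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

A chunk that appears near the top of both lists (like `c1` below) outranks one that tops only a single list, which is the point of fusing dense and sparse signals:

```python
reciprocal_rank_fusion(["c3", "c1", "c7"], ["c1", "c9", "c3"], top_k=4)
# → ["c1", "c3", "c9", "c7"]
```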
| Field | Purpose |
|---|---|
| `active_ingredient` | Medicinal ingredients |
| `description` | Drug product overview |
| `dosage_and_administration` | Dosing guidance |
| `drug_interactions` | Drug/drug and drug/food interactions |
| `information_for_patients` | Patient safety info |
| `when_using` | Side effects and activity warnings |
| `overdosage` | Overdose symptoms and treatment |
| `stop_use` | When to stop and consult a doctor |
| `user_safety_warnings` | Hazard warnings |
| `warnings` | Serious adverse reactions |
Live App: https://trupharma-clinical-intelligence-fhu8qhqrgjch9yhocjaeuz.streamlit.app/
```bash
# 1. Clone the repo
git clone https://github.com/reddy-nithin/TruPharma-Clinical-Intelligence
cd TruPharma-Clinical-Intelligence

# 2. Create a virtual environment (recommended)
python -m venv .venv
.venv\Scripts\activate        # Windows
# source .venv/bin/activate   # macOS/Linux

# 3. Install dependencies
pip install -r requirements.txt

# 4. Build the knowledge graph
python3 scripts/build_kg.py

# 5. Run the Streamlit app
streamlit run src/frontend/app.py
```

To use Google Gemini for answer generation instead of the extractive fallback:
- Get a free API key at Google AI Studio
- Enter it in the app sidebar under Advanced Settings > Gemini API key
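The "Gemini or extractive fallback" behavior reduces to a simple branch on whether a key was supplied. A hypothetical sketch: `generate_answer`, the stubbed `call_gemini`, and the evidence dict shape are all assumptions, not the app's actual interfaces.

```python
def call_gemini(question, evidence, api_key):
    # Stub: the real app would wire this to the google-generativeai
    # client with a grounding prompt built from the evidence chunks.
    raise NotImplementedError("connect to Gemini in the real app")

def generate_answer(question, evidence, gemini_key=None):
    """Use Gemini when a key is supplied; otherwise build an extractive
    answer stitched from the top evidence chunks, with citations."""
    if gemini_key:
        return call_gemini(question, evidence, gemini_key)
    if not evidence:
        return "Not enough evidence in the retrieved label sections."
    # Extractive fallback: quote each chunk, citing its ID and source field
    lines = [f'[{c["id"]}] ({c["field"]}) {c["text"][:200]}' for c in evidence]
    return "Based on the FDA label:\n" + "\n".join(lines)
```

Either path yields an answer tied to chunk IDs, so the UI can render the same evidence pack regardless of whether the LLM was used.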
All query interactions are logged to logs/product_metrics.csv with the following fields:
| Column | Description |
|---|---|
| `timestamp` | UTC timestamp of the query |
| `query` | User's question (truncated to 200 chars) |
| `latency_ms` | End-to-end pipeline latency in milliseconds |
| `evidence_ids` | Chunk IDs of retrieved evidence |
| `confidence` | Heuristic confidence score (0–1) |
| `num_evidence` | Number of evidence items returned |
| `num_records` | Drug label records fetched from FDA API |
| `retrieval_method` | `hybrid` / `dense` / `sparse` |
| `llm_used` | Whether the Gemini LLM was used |
| `answer_preview` | First 150 chars of the generated answer |
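Appending one row per interaction with the standard-library `csv.DictWriter` might look like the sketch below; the helper name and row shape are assumptions derived from the schema above, with the 200/150-character truncations applied at write time.

```python
import csv
import time
from pathlib import Path

LOG_FIELDS = [
    "timestamp", "query", "latency_ms", "evidence_ids", "confidence",
    "num_evidence", "num_records", "retrieval_method", "llm_used",
    "answer_preview",
]

def log_interaction(row, path="logs/product_metrics.csv"):
    """Append one interaction row, writing the header on first use."""
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    new_file = not p.exists()
    with p.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow({
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "query": row["query"][:200],            # truncate per the schema
            "answer_preview": row["answer"][:150],  # preview, not full answer
            **{k: row.get(k, "") for k in LOG_FIELDS[2:-1]},
        })
```

Appending rather than rewriting keeps logging cheap per query, and the CSV stays directly loadable into pandas for the metrics page.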
Scenario: openFDA API returns 0 results for an obscure or misspelled drug name.
Mitigation:
- The system detects empty result sets and returns a clear "Not enough evidence" message rather than hallucinating
- Logging captures the failed query for later analysis
- Future improvement: add fuzzy drug-name matching and spell-check suggestions before querying the API
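The empty-result guard can be expressed as a check that runs before generation; `guard_empty_results` and its threshold are illustrative names, not the repo's actual code.

```python
def guard_empty_results(records, evidence, min_evidence=1):
    """Return a refusal message when the pipeline lacks grounding;
    return None when it is safe to proceed to answer generation."""
    if not records:
        return ("Not enough evidence: openFDA returned no label records "
                "for this query.")
    if len(evidence) < min_evidence:
        return ("Not enough evidence in the retrieved label sections "
                "to answer safely.")
    return None  # caller proceeds to generation
```

Returning an explicit refusal string (instead of passing an empty context to the LLM) is what prevents hallucinated answers for obscure or misspelled drug names.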
| Aspect | Approach |
|---|---|
| Hosting | Streamlit Community Cloud (free tier) |
| Data | Real-time openFDA API (no local data storage needed) |
| Scaling | API rate limits managed via pagination; add API key for higher limits |
| Monitoring | CSV-based logging; extend to cloud logging (e.g., CloudWatch) for production |
| CI/CD | GitHub integration with Streamlit Cloud for auto-deploy on push |
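The pagination mentioned for scaling maps onto openFDA's `limit` and `skip` query parameters. A sketch, with the helper name and page size as illustrative assumptions (openFDA caps both parameters, so very large pulls are better served by its bulk download files):

```python
def paged_params(search, total=300, page_size=100):
    """Yield openFDA query-param dicts that walk results page by page
    using the API's `limit` and `skip` parameters."""
    for skip in range(0, total, page_size):
        yield {"search": search, "limit": page_size, "skip": skip}
```

An `api_key` parameter can be added to each request to raise the default rate limits, as the table above suggests.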
```
Week-4-Assignment--main/
├── data/                       # Data directory (placeholder)
├── logs/
│   └── product_metrics.csv     # Interaction logs (≥5 records)
├── src/
│   ├── openfda_rag.py          # openFDA API fetching, chunking, indexing
│   ├── rag_engine.py           # RAG pipeline: retrieve → generate → log
│   ├── Week 4.ipynb            # Development notebook
│   └── app/
│       ├── .streamlit/config.toml  # Streamlit theme config
│       ├── streamlit_app.py        # Main app (Safety Chat)
│       └── pages/
│           └── stress_test.py      # Stress test / scenario validation
├── requirements.txt
└── README.md
```
- Workflow improvement: Reduces manual label scanning from 10–15 min to under 30 sec per question
- Time-to-decision: Estimated 80% reduction in time-to-answer for drug-label queries
- Trust indicators: Every answer includes evidence chunk IDs, source fields, and confidence scores; system refuses to answer when evidence is insufficient
CS 5588 · Spring 2026 · Week 4 Assignment