zee404-code
diff --git a/‎README.md‎
Lines changed: 194 additions & 32 deletions b/‎README.md‎
Lines changed: 194 additions & 32 deletions
diff --git a/‎SECURITY.md‎
Lines changed: 43 additions & 0 deletions b/‎SECURITY.md‎
Lines changed: 43 additions & 0 deletions
diff --git a/‎assets/Artifact.png‎
49.6 KB b/‎assets/Artifact.png‎
49.6 KB
diff --git a/‎assets/Ec2 Instance.png‎
108 KB b/‎assets/Ec2 Instance.png‎
108 KB
diff --git a/‎assets/S3_bucket.png‎
5.65 KB b/‎assets/S3_bucket.png‎
5.65 KB
diff --git a/‎assets/conf1.png‎
31.4 KB b/‎assets/conf1.png‎
31.4 KB
diff --git a/‎assets/conf2.png‎
42.9 KB b/‎assets/conf2.png‎
42.9 KB
@@ -112,7 +112,50 @@ graph TD
 <br>
 <img src="assets/D2 S1.png" alt="testing cnic" width="500">
 
-## Guardrails & Safety Mechanisms (D3)
+## Responsible AI & Guardrails
+
+The **Daraz Insight Copilot** is built to be a **helpful, harmless, and honest** AI assistant dedicated exclusively to e-commerce analytics on the Daraz platform. We enforce strong Responsible AI principles through multiple technical and prompt-based guardrails.
+
+## 1. Topic Restriction Guardrails
+**Goal**: Prevent discussion of sensitive, political, illegal, or harmful topics unrelated to e-commerce.
+
+- **Implementation**
+  The system prompt strictly defines the assistant’s persona:
+  > “You are an e-commerce analyst for Daraz. You only answer questions using the provided product reviews and data.”
+
+- **Behavior**
+  Off-topic queries (e.g., “Who is the president?”, “How do I make a bomb?”, or any non-e-commerce request) trigger retrieval of irrelevant product reviews. With no relevant context available, the model is instructed to respond:
+  > “I cannot answer this question based on the available product data.”
+
+## 2. Anti-Hallucination Measures
+**Goal**: Eliminate invented products, fake reviews, or fabricated insights.
+
+- **Implementation**
+  Full **Retrieval-Augmented Generation (RAG)** workflow powered by a FAISS vector index of real Daraz reviews.
+
+- **Constraint**
+  The model is explicitly prohibited from using its pre-trained knowledge for product-related facts. Every factual claim **must** be grounded in retrieved chunks, ensuring 100% traceability to real data.
+
+## 3. Tone & Style Guidelines
+**Goal**: Maintain professional, objective, and business-appropriate communication.
+
+- **Implementation**
+  Prompt instructions enforce:
+  > “Responses must be concise, data-driven, and professional.”
+
+- **Result**
+  No slang, memes, overly casual language, or aggressive tone — ideal for analysts, merchants, and business users.
+
+## 4. Bias Mitigation
+**Goal**: Deliver fair and accurate sentiment analysis across diverse user expressions.
+
+- **Dataset**
+  Includes mixed-language (English + Roman Urdu) reviews to properly capture local nuances and avoid penalizing non-native English speakers.
+
+- **Transparency**
+  Every answer includes the exact review snippets used as sources, enabling users to verify that summaries faithfully reflect the underlying data.
+
+By combining strict persona definition, RAG grounding, clear style rules, and transparent sourcing, Daraz Insight Copilot stays focused, truthful, and responsible at all times.
 
 We implemented a custom **Policy Engine** (`src/app/guardrails.py`) that intercepts requests at two stages to ensure system safety and compliance.
 
@@ -194,9 +237,13 @@ We track operational metrics for the RAG pipeline using a Grafana dashboard.
 * **Token Usage & Cost:** Tracks `llm_token_usage_total` to estimate API costs ($0.50/1M input, $1.50/1M output).
 * **RAG Latency:** Monitors the P95 and P99 latency of the `/ask` endpoint to ensure responsiveness.
 * **Safety Violations:** Logs `guardrail_events_total` to track attempted attacks (Injection/PII).
+
 <img src="assets/D4 S1.png" alt="http request total" width="500">
+<br>
 <img src="assets/D4 S2.png" alt="llm token usage total" width="500">
+<br>
 <img src="assets/D4 S3.png" alt="guardrail events total" width="500">
+<br>
 <img src="assets/D4 S4.png" alt="Grafana Dashboard" width="500">
 
 ### 2. Data Drift Monitoring (Evidently)
@@ -208,46 +255,161 @@ We monitor the integrity of our retrieval corpus and tabular data using **Eviden
 
 ## Cloud Deployment
 
-This project is deployed and hosted on **Amazon Web Services (AWS)** using three distinct services: **EC2**, **S3**, and **CloudWatch**, fulfilling the D9 requirement.
+This project is deployed and hosted on **Amazon Web Services (AWS)**
+using three key services: **EC2**, **S3**, and **CloudWatch**,
+fulfilling the **D9 Deployment requirement**.
+
+
+<img src="assets/conf1.png" alt="Configuration 1" width="400">
+<br>
+<img src="assets/conf2.png" alt="Configuration 2" width="400">
+
+
+## How the ML Workflow Interacts with AWS
+
+### **1. Training (Local)**
+
+The machine learning model is trained locally using:
+
+``` bash
+python train.py
+```
+
+This produces the following artifacts:
+
+-   `model.joblib`
+-   `faiss_index/` folder
+-   `my_model/` folder (Sentence Transformer model)
+
+### **2. Artifact Storage (S3)**
+
+Due to GitHub file-size limits and Docker build timeouts, heavy
+artifacts are stored in **Amazon S3**.
+
+Artifacts stored:
+
+-   `daraz-code-mixed-product-reviews.csv`
+-   `faiss_index/` (Vector Database)
+-   `my_model/` (Sentence Transformer)
+
+<img src="assets/S3_bucket.png" alt="S3_bucket after milestone 2" width="500">
 
-### How the ML Workflow Interacts with AWS
+### **3. Inference (EC2)**
 
-1.  **Training (Local):** The model is trained locally using `python train.py`. This script also generates the `model.joblib` artifact.
-2.  **Data/Model Storage (S3):** The `Top_Selling_Product_Data.csv` dataset and the final `model.joblib` artifact are manually uploaded to an **S3 bucket** for persistent, durable storage.
-3.  **Inference (EC2):** An **EC2 instance** runs our Docker container. The container (built by the CI/CD pipeline) pulls the code, and when the API starts, it loads the model from its local `models/` directory (which was part of the build).
-4.  **Monitoring (CloudWatch):** **CloudWatch** automatically monitors the EC2 instance (CPU, Network, etc.) to ensure the health of our inference API server.
+An **EC2 t3.micro** instance hosts the live FastAPI application.
 
-### Services Used
+Instead of downloading models inside Docker (which causes timeouts), the
+instance downloads artifacts from S3 **at the host level**, and they are
+mounted into the container.
 
-* **S3 (Simple Storage Service):** Used for persistent data storage.
-    * **Why:** S3 is a highly durable and scalable service perfect for storing project artifacts.
-    * **How:** The `Top_Selling_Product_Data.csv` dataset and the trained `model.joblib` are stored in an S3 bucket.
-    * <img src="assets/Annonated_S3_bucket.png" alt="S3 Bucket" width="500">
+<img src="assets/Ec2 Instance.png" alt="Ec2 Instance after milestone 2" width="500">
 
-* **EC2 (Elastic Compute Cloud):** Used to host the live inference API.
-    * **Why:** An EC2 `t3.micro` instance provides a free-tier-eligible virtual server to run our Docker container 24/7.
-    * **How:** The API's Docker image (built by our CI/CD pipeline and stored in GHCR) is pulled and run on an Amazon Linux EC2 instance.
-    * <img src="assets/Annonated_Instance.png" alt="Instance Created" width="500">
-    * <img src="assets/Annonated_API_running_on_AWS.png" alt="API running on AWS" width="500">
+### **4. Monitoring (CloudWatch)**
 
-* **CloudWatch:** Used for basic infrastructure monitoring.
-    * **Why:** CloudWatch is automatically integrated with EC2 and provides essential metrics (CPU, Network, Disk) to monitor the health and performance of our API server.
-    * <img src="assets/Annonated_CloudWatch_Monitoring.png" alt="CloudWatch Monitoring" width="500">
+CloudWatch is used to track:
 
-### Reproduction Steps & Security
+-   CPU usage\
+-   Memory usage
 
-1.  Launch a `t2.micro` EC2 instance with an Amazon Linux AMI.
-2.  **Configure the Security Group:** To follow security best practices, access is restricted.
-    * Allow inbound TCP traffic on port `22` (SSH) from `My IP`.
-    * Allow inbound TCP traffic on port `8000` (API) from `My IP`.
-3.  Connect to the instance via SSH.
-4.  Install Docker: `sudo yum update -y && sudo yum install docker -y && sudo service docker start && sudo usermod -a -G docker ec2-user`
-5.  Log out and log back in.
-6.  Log in to GHCR: `docker login ghcr.io -u <YOUR-GITHUB-USERNAME>` (use a PAT as the password).
-7.  Pull and run the image: `docker pull ghcr.io/zee404-code/daraz-insight-copilot:latest`
-8.  Run: `docker run -d -p 8000:8000 ghcr.io/zee404-code/daraz-insight-copilot:latest`
-9.  Access the API at `http://<YOUR_EC2_PUBLIC_IP>:8000/docs`.
+## AWS Services Used
 
+### **S3 --- Simple Storage Service**
+
+Stores heavy ML artifacts and acts as the "teleporter" between local
+development and the EC2 server.
+
+### **EC2 --- Elastic Compute Cloud**
+
+Runs the production API using Docker with host networking.
+
+### **GHCR --- GitHub Container Registry**
+
+Stores the lightweight Docker image.
+
+# 🚀 Deployment Guide (Full Reproduction Steps)
+
+## **1. Infrastructure Setup**
+
+### **S3**
+
+Create a bucket (example: `daraz-insight-artifacts-fz`) and upload:
+
+-   `reviews.csv`
+-   `faiss_index/`
+-   `my_model/`
+
+### **EC2**
+
+-   Start a **t3.micro** instance\
+-   OS: Amazon Linux 2023
+
+### **Security Groups**
+
+  Port   Purpose      Source
+  ------ ------------ -----------
+  22     SSH          Your IP
+  8000   API Access   0.0.0.0/0
+
+## **2. Server Configuration**
+
+``` bash
+sudo yum update -y
+sudo yum install docker -y
+sudo service docker start
+sudo usermod -a -G docker ec2-user
+```
+
+### **Create Swap Memory**
+
+``` bash
+sudo dd if=/dev/zero of=/swapfile bs=128M count=32
+sudo chmod 600 /swapfile
+sudo mkswap /swapfile
+sudo swapon /swapfile
+exit
+```
+
+## **3. Artifact Injection**
+
+``` bash
+aws configure
+aws s3 cp s3://daraz-insight-artifacts-fz/daraz-code-mixed-product-reviews.csv reviews.csv
+aws s3 cp s3://daraz-insight-artifacts-fz/faiss_index/ faiss_index/ --recursive
+aws s3 cp s3://daraz-insight-artifacts-fz/my_model/ my_model/ --recursive
+```
+
+## **4. Run the Container**
+
+``` bash
+docker login ghcr.io -u <YOUR_GITHUB_USERNAME>
+docker pull ghcr.io/zee404-code/daraz-insight-copilot:latest
+```
+
+``` bash
+docker run -d \
+  --network host \
+  --dns 8.8.8.8 \
+  -v "$(pwd)/reviews.csv:/app/reviews.csv" \
+  -v "$(pwd)/faiss_index:/app/faiss_index" \
+  -v "$(pwd)/my_model:/app/my_model" \
+  -e GROQ_API_KEY="<YOUR_GROQ_API_KEY>" \
+  -e TRANSFORMERS_OFFLINE=1 \
+  -e HF_HUB_OFFLINE=1 \
+  -e SENTENCE_TRANSFORMERS_HOME="/app/my_model" \
+  --name api-app \
+  ghcr.io/zee404-code/daraz-insight-copilot:latest
+```
+
+# ✅ Verification
+
+Visit:
+
+    http://<EC2_PUBLIC_IP>:8000/docs
+
+You should see the live Swagger documentation.
+
+<img src="assets/Artifact.png" alt="Artifact after milestone 2" width="500">
+<br>
 
 ## Make Targets
 
 
@@ -0,0 +1,43 @@
+# Security Policy
+
+## Prompt Injection Defenses
+
+This application processes user input through Large Language Models (LLMs). To mitigate **Prompt Injection** attacks (where malicious input attempts to override system instructions), we have implemented multiple layers of defense:
+
+1. **System Prompt Encapsulation**
+   - All user input is strictly encapsulated using clear delimiters (e.g., `"""User Query"""`) before being passed to the LLM.
+   - The system prompt explicitly instructs the model to respond **only** based on the provided retrieved context and Daraz product information.
+
+2. **Context Grounding (RAG-based)**
+   - The model is forbidden from hallucinating or using external knowledge.
+   - Responses must cite specific reviews or product data retrieved from the `faiss_index`.
+   - If the retrieved context is irrelevant or insufficient, the model falls back to a safe response such as “I cannot answer that.”
+
+3. **Input Sanitization**
+   - All incoming API requests are validated using Pydantic models to enforce correct data types, formats, and length limits, preventing injection via malformed payloads.
+
+## Data Privacy & Handling
+
+We prioritize user privacy and minimize data retention:
+
+1. **No Permanent Storage of Queries**
+   - Queries sent to the `/ask` endpoint are processed entirely in memory.
+   - User queries are **never** persisted to databases, S3 buckets, or permanent logs. Server logs are ephemeral and automatically rotated.
+
+2. **PII Redaction**
+   - Current dataset (`reviews.csv`) contains only public product reviews; any potential PII in the source data is considered public domain.
+   - **Planned**: Automated detection and redaction of PII in incoming user queries in future releases.
+
+3. **API Key Security**
+   - LLM provider API keys (Groq, Gemini, etc.) are injected exclusively as environment variables at runtime.
+   - Keys are **never** hardcoded or committed to the repository.
+
+## Reporting Security Vulnerabilities
+
+If you discover a security vulnerability, a bypass of the implemented guardrails, or any other security issue:
+
+**Please do not open a public GitHub issue.**
+
+Instead, report it privately by emailing the repository maintainer directly. Responsible disclosures will be acknowledged and addressed promptly.
+
+Thank you for helping keep this project secure!