Commit f33eea0

feat: Cloud Integration, Security and Compliance

Parent: 8c7d023

7 files changed: +237 / -32 lines

README.md

Lines changed: 194 additions & 32 deletions
@@ -112,7 +112,50 @@ graph TD
<br>
<img src="assets/D2 S1.png" alt="testing cnic" width="500">

## Responsible AI & Guardrails

The **Daraz Insight Copilot** is built to be a **helpful, harmless, and honest** AI assistant dedicated exclusively to e-commerce analytics on the Daraz platform. We enforce strong Responsible AI principles through multiple technical and prompt-based guardrails.

## 1. Topic Restriction Guardrails

**Goal**: Prevent discussion of sensitive, political, illegal, or harmful topics unrelated to e-commerce.

- **Implementation**
  The system prompt strictly defines the assistant’s persona:
  > “You are an e-commerce analyst for Daraz. You only answer questions using the provided product reviews and data.”

- **Behavior**
  Off-topic queries (e.g., “Who is the president?”, “How do I make a bomb?”, or any non-e-commerce request) trigger retrieval of irrelevant product reviews. With no relevant context available, the model is instructed to respond:
  > “I cannot answer this question based on the available product data.”
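As a rough illustration of this fallback, the sketch below refuses whenever retrieval finds nothing relevant. The `retrieve`/`llm` callables, function names, and the 0.35 score threshold are all illustrative assumptions, not the project's actual code:

```python
# Illustrative sketch of the "no relevant context" refusal fallback.
# retrieve(query) -> list of (score, text) pairs, highest score first (assumed shape).
REFUSAL = "I cannot answer this question based on the available product data."

def answer(query: str, retrieve, llm, min_score: float = 0.35) -> str:
    """Answer only when retrieval finds sufficiently relevant review chunks."""
    chunks = retrieve(query)
    relevant = [text for score, text in chunks if score >= min_score]
    if not relevant:
        # Off-topic query: nothing relevant was retrieved, so refuse.
        return REFUSAL
    context = "\n".join(relevant)
    return llm(f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}")
```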

## 2. Anti-Hallucination Measures

**Goal**: Eliminate invented products, fake reviews, or fabricated insights.

- **Implementation**
  Full **Retrieval-Augmented Generation (RAG)** workflow powered by a FAISS vector index of real Daraz reviews.

- **Constraint**
  The model is explicitly prohibited from using its pre-trained knowledge for product-related facts. Every factual claim **must** be grounded in retrieved chunks, ensuring 100% traceability to real data.
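The FAISS index performs a nearest-neighbour search over review embeddings. As a dependency-free sketch of the same inner-product search that FAISS's flat index computes (toy vectors stand in for the Sentence Transformer embeddings; `top_k` is an illustrative name):

```python
# Dependency-free sketch of inner-product retrieval, the core operation a
# flat FAISS index performs. Toy vectors replace real review embeddings.
import numpy as np

def top_k(query_vec: np.ndarray, index: np.ndarray, k: int = 3) -> list:
    """Return row indices of the k stored vectors closest to the query
    by inner product (highest score first)."""
    scores = index @ query_vec            # score the query against every stored review
    return np.argsort(-scores)[:k].tolist()
```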

## 3. Tone & Style Guidelines

**Goal**: Maintain professional, objective, and business-appropriate communication.

- **Implementation**
  Prompt instructions enforce:
  > “Responses must be concise, data-driven, and professional.”

- **Result**
  No slang, memes, overly casual language, or aggressive tone, making the output suitable for analysts, merchants, and business users.

## 4. Bias Mitigation

**Goal**: Deliver fair and accurate sentiment analysis across diverse user expressions.

- **Dataset**
  Includes mixed-language (English + Roman Urdu) reviews to properly capture local nuances and avoid penalizing non-native English speakers.

- **Transparency**
  Every answer includes the exact review snippets used as sources, enabling users to verify that summaries faithfully reflect the underlying data.

By combining strict persona definition, RAG grounding, clear style rules, and transparent sourcing, Daraz Insight Copilot stays focused, truthful, and responsible at all times.

We implemented a custom **Policy Engine** (`src/app/guardrails.py`) that intercepts requests at two stages to ensure system safety and compliance.
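To illustrate the two-stage interception, here is a minimal sketch; the real checks live in `src/app/guardrails.py`, and the pattern lists and function names below are assumptions, not the actual implementation:

```python
# Illustrative two-stage policy engine: stage 1 screens input before the LLM,
# stage 2 screens output before it reaches the user. Patterns are examples only.
import re

INJECTION_PATTERNS = [r"ignore (all|previous) instructions", r"system prompt"]
PII_PATTERNS = [r"\b\d{5}-\d{7}-\d\b"]  # e.g. a CNIC-style identity number

def check_input(text: str) -> bool:
    """Stage 1: reject likely prompt-injection attempts."""
    return not any(re.search(p, text, re.IGNORECASE) for p in INJECTION_PATTERNS)

def check_output(text: str) -> bool:
    """Stage 2: reject responses containing PII-like strings."""
    return not any(re.search(p, text) for p in PII_PATTERNS)
```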

@@ -194,9 +237,13 @@ We track operational metrics for the RAG pipeline using a Grafana dashboard.
* **Token Usage & Cost:** Tracks `llm_token_usage_total` to estimate API costs ($0.50/1M input, $1.50/1M output).
* **RAG Latency:** Monitors the P95 and P99 latency of the `/ask` endpoint to ensure responsiveness.
* **Safety Violations:** Logs `guardrail_events_total` to track attempted attacks (Injection/PII).

<img src="assets/D4 S1.png" alt="http request total" width="500">
<br>
<img src="assets/D4 S2.png" alt="llm token usage total" width="500">
<br>
<img src="assets/D4 S3.png" alt="guardrail events total" width="500">
<br>
<img src="assets/D4 S4.png" alt="Grafana Dashboard" width="500">

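The stated pricing ($0.50 per 1M input tokens, $1.50 per 1M output tokens) implies a simple cost calculation. A small helper one might run against the `llm_token_usage_total` counters (illustrative, not part of the repository):

```python
# Convert token counters into an estimated API spend, using the rates above.
INPUT_RATE_PER_M = 0.50    # USD per 1M input tokens
OUTPUT_RATE_PER_M = 1.50   # USD per 1M output tokens

def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate LLM API spend from cumulative token counts."""
    return (input_tokens / 1_000_000 * INPUT_RATE_PER_M
            + output_tokens / 1_000_000 * OUTPUT_RATE_PER_M)
```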
### 2. Data Drift Monitoring (Evidently)
@@ -208,46 +255,161 @@ We monitor the integrity of our retrieval corpus and tabular data using **Eviden

## Cloud Deployment

This project is deployed and hosted on **Amazon Web Services (AWS)** using three key services: **EC2**, **S3**, and **CloudWatch**, fulfilling the **D9 Deployment requirement**.

<img src="assets/conf1.png" alt="Configuration 1" width="400">
<br>
<img src="assets/conf2.png" alt="Configuration 2" width="400">

## How the ML Workflow Interacts with AWS

### **1. Training (Local)**

The machine learning model is trained locally using:

```bash
python train.py
```

This produces the following artifacts:

- `model.joblib`
- `faiss_index/` folder
- `my_model/` folder (Sentence Transformer model)

### **2. Artifact Storage (S3)**

Due to GitHub file-size limits and Docker build timeouts, heavy artifacts are stored in **Amazon S3**.

Artifacts stored:

- `daraz-code-mixed-product-reviews.csv`
- `faiss_index/` (Vector Database)
- `my_model/` (Sentence Transformer)

<img src="assets/S3_bucket.png" alt="S3_bucket after milestone 2" width="500">

### **3. Inference (EC2)**

An **EC2 t3.micro** instance hosts the live FastAPI application.

Instead of downloading models inside Docker (which causes timeouts), the instance downloads the artifacts from S3 **at the host level**, and they are mounted into the container.

<img src="assets/Ec2 Instance.png" alt="Ec2 Instance after milestone 2" width="500">

### **4. Monitoring (CloudWatch)**

CloudWatch is used to track:

- CPU usage
- Memory usage

## AWS Services Used

### **S3 (Simple Storage Service)**

Stores heavy ML artifacts and acts as the “teleporter” between local development and the EC2 server.

### **EC2 (Elastic Compute Cloud)**

Runs the production API using Docker with host networking.

### **GHCR (GitHub Container Registry)**

Stores the lightweight Docker image.

# 🚀 Deployment Guide (Full Reproduction Steps)

## **1. Infrastructure Setup**

### **S3**

Create a bucket (example: `daraz-insight-artifacts-fz`) and upload:

- `daraz-code-mixed-product-reviews.csv`
- `faiss_index/`
- `my_model/`

### **EC2**

- Start a **t3.micro** instance
- OS: Amazon Linux 2023

### **Security Groups**

| Port | Purpose    | Source    |
|------|------------|-----------|
| 22   | SSH        | Your IP   |
| 8000 | API access | 0.0.0.0/0 |

## **2. Server Configuration**

```bash
sudo yum update -y
sudo yum install docker -y
sudo service docker start
sudo usermod -a -G docker ec2-user
```

### **Create Swap Memory**

```bash
# 32 blocks of 128 MB = 4 GB of swap, compensating for the t3.micro's limited RAM
sudo dd if=/dev/zero of=/swapfile bs=128M count=32
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
exit  # log out and back in so the docker group membership takes effect
```

## **3. Artifact Injection**

```bash
# Pull the heavy artifacts from S3 onto the host so they can be
# volume-mounted into the container.
aws configure
aws s3 cp s3://daraz-insight-artifacts-fz/daraz-code-mixed-product-reviews.csv reviews.csv
aws s3 cp s3://daraz-insight-artifacts-fz/faiss_index/ faiss_index/ --recursive
aws s3 cp s3://daraz-insight-artifacts-fz/my_model/ my_model/ --recursive
```

## **4. Run the Container**

```bash
docker login ghcr.io -u <YOUR_GITHUB_USERNAME>   # use a PAT as the password
docker pull ghcr.io/zee404-code/daraz-insight-copilot:latest
```

```bash
docker run -d \
  --network host \
  --dns 8.8.8.8 \
  -v "$(pwd)/reviews.csv:/app/reviews.csv" \
  -v "$(pwd)/faiss_index:/app/faiss_index" \
  -v "$(pwd)/my_model:/app/my_model" \
  -e GROQ_API_KEY="<YOUR_GROQ_API_KEY>" \
  -e TRANSFORMERS_OFFLINE=1 \
  -e HF_HUB_OFFLINE=1 \
  -e SENTENCE_TRANSFORMERS_HOME="/app/my_model" \
  --name api-app \
  ghcr.io/zee404-code/daraz-insight-copilot:latest
```

# ✅ Verification

Visit:

`http://<EC2_PUBLIC_IP>:8000/docs`

You should see the live Swagger documentation.

<img src="assets/Artifact.png" alt="Artifact after milestone 2" width="500">
<br>

## Make Targets

SECURITY.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
# Security Policy

## Prompt Injection Defenses

This application processes user input through Large Language Models (LLMs). To mitigate **Prompt Injection** attacks (where malicious input attempts to override system instructions), we have implemented multiple layers of defense:

1. **System Prompt Encapsulation**
   - All user input is strictly encapsulated using clear delimiters (e.g., `"""User Query"""`) before being passed to the LLM.
   - The system prompt explicitly instructs the model to respond **only** based on the provided retrieved context and Daraz product information.
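A rough sketch of what this encapsulation could look like; the prompt wording, helper name, and the delimiter-neutralization step are illustrative assumptions, not the exact production code:

```python
# Illustrative prompt assembly with delimiter encapsulation.
SYSTEM_PROMPT = (
    "You are an e-commerce analyst for Daraz. You only answer questions "
    "using the provided product reviews and data."
)

def build_prompt(user_query: str, context: str) -> str:
    """Wrap the user's text in triple-quote delimiters so it cannot
    masquerade as system instructions."""
    safe_query = user_query.replace('"""', "'''")  # neutralize delimiter breakout
    return (
        f"{SYSTEM_PROMPT}\n\n"
        f"Context:\n{context}\n\n"
        f'User Query:\n"""{safe_query}"""'
    )
```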

2. **Context Grounding (RAG-based)**
   - The model is forbidden from hallucinating or using external knowledge.
   - Responses must cite specific reviews or product data retrieved from the `faiss_index`.
   - If the retrieved context is irrelevant or insufficient, the model falls back to a safe response such as “I cannot answer that.”

3. **Input Sanitization**
   - All incoming API requests are validated using Pydantic models to enforce correct data types, formats, and length limits, preventing injection via malformed payloads.
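A minimal sketch of such a Pydantic schema; the field names and limits here are assumptions, not the app's actual model:

```python
# Hypothetical request schema for the /ask endpoint (illustrative names/limits).
from pydantic import BaseModel, Field, ValidationError

class AskRequest(BaseModel):
    """Validated request body: rejects empty or oversized questions."""
    question: str = Field(min_length=1, max_length=500)
    top_k: int = Field(default=3, ge=1, le=10)  # bound retrieval fan-out

def parse(payload: dict):
    """Return a validated request, or None for malformed payloads."""
    try:
        return AskRequest(**payload)
    except ValidationError:
        return None
```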

## Data Privacy & Handling

We prioritize user privacy and minimize data retention:

1. **No Permanent Storage of Queries**
   - Queries sent to the `/ask` endpoint are processed entirely in memory.
   - User queries are **never** persisted to databases, S3 buckets, or permanent logs. Server logs are ephemeral and automatically rotated.

2. **PII Redaction**
   - The current dataset (`reviews.csv`) contains only public product reviews; any potential PII in the source data is considered public domain.
   - **Planned**: Automated detection and redaction of PII in incoming user queries in future releases.

3. **API Key Security**
   - LLM provider API keys (Groq, Gemini, etc.) are injected exclusively as environment variables at runtime.
   - Keys are **never** hardcoded or committed to the repository.

## Reporting Security Vulnerabilities

If you discover a security vulnerability, a bypass of the implemented guardrails, or any other security issue:

**Please do not open a public GitHub issue.**

Instead, report it privately by emailing the repository maintainer directly. Responsible disclosures will be acknowledged and addressed promptly.

Thank you for helping keep this project secure!

assets/Artifact.png (49.6 KB)

assets/Ec2 Instance.png (108 KB)

assets/S3_bucket.png (5.65 KB)

assets/conf1.png (31.4 KB)

assets/conf2.png (42.9 KB)
