An end-to-end ML pipeline that detects phishing attacks in network traffic — from raw data ingestion to real-time predictions — wrapped in a stunning, modern web interface.
Live Demo · Get Started · API Docs · MLflow Dashboard
Sentinel analyzes network traffic features and classifies each data point as legitimate or phishing using machine learning. The system automates the entire journey:
📥 Data Ingestion → ✅ Validation → 🔄 Transformation → 🤖 Training → 🎯 Prediction
| Feature | Description |
|---|---|
| 5 ML Models | Random Forest, Gradient Boosting, Decision Tree, Logistic Regression, AdaBoost |
| Auto-Tuning | Hyperparameter tuning via GridSearchCV across all models |
| Best Model Selection | Automatically picks the highest-scoring classifier |
| Drift Detection | Schema validation + feature drift reports per training run |
| Experiment Tracking | Every run logged to MLflow with F1, Precision, Recall metrics |
| Web Interface | Upload CSV → get predictions, or trigger training from the browser |
| REST API | Programmatic /train and /predict endpoints |
┌─────────────────────────────────────────┐
│ SENTINEL WEB APPLICATION │
│ FastAPI + Jinja2 + Sentinel UI │
└──────────────┬──────────────────────────┘
│
┌──────────────────────────────────────────────────────────────┐
│ ML TRAINING PIPELINE │
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ 01 Data │──▶│ 02 Data │──▶│ 03 Data │──▶│ 04 Model │ │
│ │ Ingestion │ │Validation│ │Transform │ │ Training │ │
│ │ │ │ │ │ │ │ │ │
│ │ MongoDB │ │ Schema + │ │ KNN │ │5 Models +│ │
│ │ → CSV │ │ Drift │ │ Imputer │ │ MLflow │ │
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
└──────────────────────────────────────────────────────────────┘
│
┌──────────────┴──────────────┐
│ │
┌────┴────┐ ┌─────┴─────┐
│ MongoDB │ │ MLflow │
│ Atlas │ │ DagsHub │
└─────────┘ └───────────┘
- Python 3.10+ · Git · MongoDB (Atlas Free Tier)
git clone https://github.com/its-me-meax/networksecurity.git
cd networksecurity
python -m venv venv
# Windows
venv\Scripts\activate
# macOS / Linux
source venv/bin/activate
pip install -r requirements.txt

Create a `.env` file in the project root:
# MongoDB (required)
MONGODB_URL_KEY=mongodb+srv://<user>:<pass>@<cluster>.mongodb.net/?retryWrites=true&w=majority
MONGO_DB_URL=mongodb+srv://<user>:<pass>@<cluster>.mongodb.net/?retryWrites=true&w=majority
# MLflow / DagsHub (optional — for experiment tracking)
MLFLOW_TRACKING_URI=https://dagshub.com/<username>/networksecurity.mlflow
MLFLOW_TRACKING_USERNAME=<dagshub-username>
MLFLOW_TRACKING_PASSWORD=<dagshub-token>

# Seed MongoDB with the phishing dataset
python push_data.py
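`push_data.py` presumably reads the raw CSV and inserts it into MongoDB as JSON records. A minimal sketch of the CSV-to-records step, with the MongoDB part shown commented since it needs a live connection (the helper name, database name, and CSV filename here are illustrative assumptions, not taken from the repo):

```python
import csv

def csv_to_records(path):
    """Convert a CSV file into a list of dicts, ready for insert_many()."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

# Hypothetical usage mirroring push_data.py (requires pymongo and MONGODB_URL_KEY):
# import os
# from pymongo import MongoClient
# records = csv_to_records("Network_Data/phishing.csv")   # filename is a placeholder
# client = MongoClient(os.environ["MONGODB_URL_KEY"])
# client["NetworkSecurity"]["NetworkData"].insert_many(records)
```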
# Option A: Run training pipeline (CLI)
python main.py
# Option B: Launch the web app
python app.py
# → Open http://localhost:8080

| Method | Endpoint | Description |
|---|---|---|
| GET | `/` | 🏠 Dashboard — Sentinel landing page |
| GET | `/analyze` | 📊 Upload page — CSV file upload for predictions |
| GET | `/train-model` | 🏋️ Training page — trigger the pipeline from the UI |
| GET | `/train` | ⚡ API — runs the full training pipeline |
| POST | `/predict` | 🎯 API — upload a CSV → get phishing predictions |
curl -X POST "http://localhost:8080/predict" -F "file=@network_data.csv"

Returns an HTML table: each row annotated with `predicted_column` → 0 = safe, 1 = phishing.
Connects to MongoDB Atlas, exports the `NetworkData` collection, and splits it into 80/20 train/test sets.
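The 80/20 split is standard scikit-learn; a minimal sketch under the assumption the pipeline uses `train_test_split` (the column names and seed here are illustrative, not from the repo):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Stand-in for the DataFrame exported from the NetworkData collection.
df = pd.DataFrame({"feature": range(100), "result": [0, 1] * 50})

# 80/20 split, as the ingestion stage describes.
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
print(len(train_df), len(test_df))  # 80 20
```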
Validates against `data_schema/schema.yaml` and generates a drift report to detect distribution shifts between training runs.
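The README doesn't say which statistic the drift report uses; a common choice for per-feature drift is the two-sample Kolmogorov–Smirnov test, sketched here as an assumption:

```python
import numpy as np
from scipy.stats import ks_2samp

def column_drifted(base, current, alpha=0.05):
    """Flag drift when the KS test rejects 'same distribution' at level alpha."""
    _, p_value = ks_2samp(base, current)
    return p_value < alpha

rng = np.random.default_rng(0)
# A clearly shifted distribution should be flagged as drifted.
shifted = column_drifted(rng.normal(0, 1, 500), rng.normal(3, 1, 500))
print(shifted)
```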
Applies KNN Imputer (k=3, uniform weights) to handle missing values. Saves the fitted preprocessor as a pickle artifact.
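That step maps directly onto scikit-learn's `KNNImputer`; a minimal sketch with the stated settings (k=3, uniform weights), pickling the fitted preprocessor as the pipeline does — the toy matrix is mine:

```python
import pickle
import numpy as np
from sklearn.impute import KNNImputer

X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, np.nan], [4.0, 5.0]])

imputer = KNNImputer(n_neighbors=3, weights="uniform")
X_filled = imputer.fit_transform(X)  # missing values replaced by neighbor means

# Persist the fitted preprocessor, as the pipeline stores it under final_model/.
blob = pickle.dumps(imputer)
```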
Trains 5 classifiers with hyperparameter tuning, selects the best, and logs everything to MLflow:
| Model | Tuned Parameters |
|---|---|
| Random Forest | n_estimators: [8, 16, 32, 128, 256] |
| Decision Tree | criterion: [gini, entropy, log_loss] |
| Gradient Boosting | learning_rate, subsample, n_estimators |
| Logistic Regression | Defaults |
| AdaBoost | learning_rate, n_estimators |
Selection criteria: Best score · Min threshold: 0.6 · Overfit tolerance: 0.05
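A condensed sketch of that selection logic — `GridSearchCV` per model, then a best-score pick with the 0.6 floor and 0.05 train/test gap tolerance. The synthetic data and trimmed grids are illustrative; the repo's actual grids are listed in the table above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=300, class_sep=2.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

candidates = {
    "RandomForest": (RandomForestClassifier(random_state=0), {"n_estimators": [8, 16, 32]}),
    "LogisticRegression": (LogisticRegression(max_iter=1000), {}),  # defaults
}

best_name, best_model, best_score = None, None, 0.6  # 0.6 = minimum threshold
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=3).fit(X_tr, y_tr)
    train_score = search.score(X_tr, y_tr)
    test_score = search.score(X_te, y_te)
    # Reject overfit models: train/test gap must stay within the 0.05 tolerance.
    if test_score > best_score and abs(train_score - test_score) <= 0.05:
        best_name, best_model, best_score = name, search.best_estimator_, test_score

print(best_name, round(best_score, 3))
```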
Render offers a free tier with Docker support — zero cost, auto-deploy on push.
1. Push your code to GitHub.
2. Sign up at render.com (free).
3. Click New → Web Service → connect your GitHub repo.
4. Configure:
   - Build Command: `pip install -r requirements.txt`
   - Start Command: `uvicorn app:app --host 0.0.0.0 --port 10000`
   - Plan: Free
5. Add Environment Variables:

   | Key | Value |
   |---|---|
   | `MONGODB_URL_KEY` | Your MongoDB connection string |
   | `MONGO_DB_URL` | Your MongoDB connection string |
   | `MLFLOW_TRACKING_URI` | DagsHub MLflow URL |
   | `MLFLOW_TRACKING_USERNAME` | DagsHub username |
   | `MLFLOW_TRACKING_PASSWORD` | DagsHub token |

6. Click Create Web Service → Done! 🎉
📍 Your app will be live at `https://<app-name>.onrender.com`
🔄 Auto-deploys on every push to `main`
💡 Tip: the free tier sleeps after ~15 min idle. Use cron-job.org to ping the app every 14 minutes to keep it awake.
# Build
docker build -t sentinel .
# Run
docker run -p 8080:8080 --env-file .env sentinel
# → http://localhost:8080

All runs are traced with MLflow via DagsHub:
- Metrics: F1 Score, Precision, Recall (train & test)
- Model Registry: best model registered as `NetworkSecurityModel`
- Dashboard: → Open MLflow UI
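The logged metrics are standard scikit-learn scores; a sketch of computing them, with the MLflow calls shown commented since they need the DagsHub tracking env vars (the toy labels are mine):

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [0, 1, 1, 0, 1, 1, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 1]

metrics = {
    "f1_score": f1_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
}
print(metrics)  # each is 0.75 for these toy labels (3 TP, 1 FP, 1 FN)

# With the MLflow env vars set, the trainer would then do roughly:
# import mlflow
# with mlflow.start_run():
#     mlflow.log_metrics(metrics)
```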
sentinel/
├── app.py # FastAPI web application
├── main.py # CLI pipeline runner
├── push_data.py # Seed MongoDB with CSV data
├── setup.py # Package config
├── requirements.txt # Dependencies
├── dockerfile # Docker config
├── render.yaml # Render deployment blueprint
│
├── networksecurity/ # Core ML package
│ ├── components/ # Pipeline stages
│ │ ├── data_ingestion.py
│ │ ├── data_validation.py
│ │ ├── data_transformation.py
│ │ └── model_trainer.py
│ ├── pipeline/ # Orchestration
│ │ └── training_pipeline.py
│ ├── entity/ # Config & artifact dataclasses
│ ├── constant/ # Hyperparameters & constants
│ ├── utils/ # Helpers (save/load, metrics)
│ ├── exception/ # Custom exception handling
│ └── logging/ # Logger configuration
│
├── static/ # Frontend assets
│ ├── css/style.css # Sentinel design system (1300+ lines)
│ └── js/dotgrid.js # Animated dot-grid background
│
├── templates/ # Jinja2 HTML templates
│ ├── base.html # Layout + navbar + theme toggle
│ ├── index.html # Dashboard
│ ├── analyze.html # CSV upload & prediction
│ ├── train.html # Training trigger with live steps
│ └── table.html # Prediction results
│
├── Network_Data/ # Raw phishing dataset (CSV)
├── data_schema/ # YAML schema definitions
├── assets/ # README screenshots
└── final_model/ # Saved model + preprocessor (.pkl)
| Variable | Required | Description |
|---|---|---|
| `MONGODB_URL_KEY` | ✅ | MongoDB connection string |
| `MONGO_DB_URL` | ✅ | MongoDB connection string |
| `MLFLOW_TRACKING_URI` | ❌ | DagsHub MLflow tracking URL |
| `MLFLOW_TRACKING_USERNAME` | ❌ | DagsHub username |
| `MLFLOW_TRACKING_PASSWORD` | ❌ | DagsHub access token |
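A stdlib-only sketch of how the app might read these — fail fast on the required ones, fall back quietly on the optional ones. The variable names match the table; the helper itself is hypothetical, not from the repo:

```python
import os

def require_env(name: str) -> str:
    """Return a required environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# Required at startup (commented out so this sketch runs without a .env):
# mongo_url = require_env("MONGODB_URL_KEY")

# Optional — experiment tracking is simply skipped without it.
tracking_uri = os.getenv("MLFLOW_TRACKING_URI", "")
```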
⚠️ Never commit `.env` — it's already in `.gitignore`.
# 1. Fork the repo
# 2. Create a feature branch
git checkout -b feature/amazing-feature
# 3. Commit your changes
git commit -m "Add amazing feature"
# 4. Push & open a PR
git push origin feature/amazing-feature

This project is licensed under the MIT License — see the LICENSE file for details.
Built with ❤️ by Pradyuman Sharma
⭐ Star this repo if you found it useful!
