Phishing attacks remain one of the most common and effective cyber threats, often relying on deceptively crafted URLs to trick users into revealing sensitive information. This project implements an end-to-end phishing URL detection system using machine learning, exposed through a lightweight Flask web application.
Users can submit any URL and receive a real-time classification (legitimate or phishing) along with confidence scores.
Access the deployed application here:
https://phishing-url-detector-kn1b.onrender.com
Note: The first request may take about a minute to load, because the free hosting tier cold-starts the application after a period of inactivity.
- Machine learning–based phishing detection using URL text only
- TF-IDF feature extraction at both word-level and character-level
- Linear Support Vector Classifier (LinearSVC) with calibrated probability outputs
- Confidence-aware predictions displayed through the web interface
- Flask-based web interface for real-time interaction
- Deployment-ready for platforms such as Render or GitHub-hosted environments
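To make the request flow concrete, here is a minimal sketch of how a Flask endpoint could return a label together with a confidence score. It is not the project's actual `app.py`: the `/predict` route, the JSON response shape, and the `score_url` function are hypothetical, and `score_url` is a keyword-counting stand-in for the trained model so that the example runs on its own.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in for the trained model: returns an estimate of
# P(phishing) from a few suspicious keywords. The real app would call the
# calibrated classifier here instead.
SUSPICIOUS_TOKENS = ("login", "verify", "update", "secure")


def score_url(url):
    hits = sum(token in url.lower() for token in SUSPICIOUS_TOKENS)
    return min(0.95, 0.05 + 0.3 * hits)


@app.route("/predict", methods=["POST"])  # route name is an assumption
def predict():
    payload = request.get_json(silent=True)
    url = payload.get("url", "") if isinstance(payload, dict) else ""
    url = str(url).strip()
    if not url:
        return jsonify(error="Missing 'url' field"), 400

    p_phishing = score_url(url)
    label = "phishing" if p_phishing >= 0.5 else "legitimate"
    # Report the probability of the predicted class as the confidence.
    confidence = max(p_phishing, 1.0 - p_phishing)
    return jsonify(url=url, label=label, confidence=round(confidence, 3))
```

Flask's built-in test client can exercise the route without starting a server, for example `app.test_client().post("/predict", json={"url": "http://example.test"})`.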
- Labeled dataset of legitimate and phishing URLs
- Source: https://huggingface.co/datasets/pirocheto/phishing-url
The dataset contains a diverse mix of legitimate and phishing URLs and is commonly used as a benchmark for URL-based phishing detection tasks.
- TF-IDF Vectorization
  - Word-level n-grams
  - Character-level n-grams (effective for obfuscated and shortened URLs)
- Classifier
  - Linear Support Vector Machine (LinearSVC)
- Calibration
  - Probability calibration applied to enable meaningful confidence estimates
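The claim that character-level n-grams help with obfuscated URLs can be illustrated with a small dependency-free sketch. The two host names below are made-up examples, and the trigram size is an arbitrary choice for illustration, not necessarily the setting used by the project.

```python
def char_ngrams(text, n=3):
    """Return the set of overlapping character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


genuine = "paypal"
spoofed = "paypa1"  # the final letter "l" replaced by the digit "1"

# As whole words the two tokens are simply different, so a word-level
# vocabulary learned from the genuine token does not match the spoofed one.
print(genuine == spoofed)

# At the character level most trigrams are still shared, so the spoofed
# token remains close to the genuine one in feature space.
shared = char_ngrams(genuine) & char_ngrams(spoofed)
print(sorted(shared))
```

A word-level vectorizer treats `paypa1` as an unseen token, while the character-level features still overlap heavily with `paypal`. This is one reason combining both views can be useful.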
The trained model is serialized and loaded at runtime for efficient inference.
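The components listed above could be wired together roughly as follows. This is a sketch under stated assumptions rather than the project's actual `helpers.py`: the n-gram ranges, the `cv` value, the `build_pipeline` name, and the toy URLs are all illustrative, and the model is pickled to an in-memory byte string instead of `model/model.pkl` to keep the example self-contained.

```python
import pickle

from sklearn.calibration import CalibratedClassifierCV
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import LinearSVC

# Toy data for illustration only; these URLs are not from the real dataset.
urls = [
    "https://www.wikipedia.org/wiki/Phishing",
    "https://docs.python.org/3/library/pickle.html",
    "https://github.com/scikit-learn/scikit-learn",
    "https://www.example.com/about/team",
    "https://news.example.org/world/2024/report",
    "https://shop.example.net/products/books",
    "http://secure-login.example-bank.verify.xyz/update",
    "http://paypa1-account.example.top/signin/confirm",
    "http://192.0.2.10/login/verify/account.php",
    "http://free-gift.example.click/claim?id=123",
    "http://appleid.example.support/unlock/verify",
    "http://update-billing.example.info/secure/login",
]
labels = [0] * 6 + [1] * 6  # 0 = legitimate, 1 = phishing


def build_pipeline():
    """Word- and character-level TF-IDF feeding a calibrated LinearSVC."""
    features = FeatureUnion([
        ("word", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
        ("char", TfidfVectorizer(analyzer="char", ngram_range=(2, 4))),
    ])
    # LinearSVC has no predict_proba; the calibration wrapper adds it.
    classifier = CalibratedClassifierCV(LinearSVC(random_state=0), cv=3)
    return Pipeline([("features", features), ("clf", classifier)])


model = build_pipeline().fit(urls, labels)

# Serialize and restore, as the web app would do once at startup.
blob = pickle.dumps(model)
restored = pickle.loads(blob)

proba = restored.predict_proba(["http://verify-account.example.xyz/login"])
print(proba)  # one row: [P(legitimate), P(phishing)]
```

With only twelve training URLs the probabilities carry no real meaning; the point is the shape of the pipeline and the save/load round trip. Only unpickle model files from a trusted source, since loading a pickle can execute arbitrary code.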
phishing-url-detector/
├── app.py # Main Flask server
├── templates/
│ └── index.html # Frontend HTML
├── model/
│ └── model.pkl # Trained machine learning model
├── helpers.py # Model training and utilities
├── requirements.txt # Python dependencies
└── README.md # Project documentation
- Python 3.8 or higher
- pip for dependency installation
Clone the repository and install dependencies:
git clone https://github.com/yourusername/phishing-url-detector.git
cd phishing-url-detector
pip install -r requirements.txt
Run the web app:
python app.py
Then open http://127.0.0.1:5000 in your browser.