A simple web-based application built using Flask and Machine Learning to detect whether a given SMS message is Spam or Ham (Not Spam).
- ✅ Predicts whether an SMS is spam or not
- ⚠️ Flash alert for spam messages
- 🧼 Clean text preprocessing using NLTK
- 🔍 Real-time text prediction using a trained ML model
- 🔄 Refresh/reset button
- 🖥️ Simple, clean Bootstrap UI
- 🔒 Lightweight and secure Flask backend
- Input: User enters an SMS message on the web interface.
- Preprocessing: Message is lowercased, cleaned of punctuation, stopwords are removed, and stemming is applied.
- Vectorization: Cleaned text is transformed using a trained TF-IDF vectorizer.
- Prediction: The Multinomial Naive Bayes model predicts if the message is spam or ham.
- Output: The prediction is shown with a flash message and result card.
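
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `clean_text`, `classify`, and the tiny `STOPWORDS` set are hypothetical names, the stemmer is passed in as a function so the block runs without NLTK, and it assumes the model encodes spam as either `1` or `"spam"`.

```python
import string

# Tiny illustrative stopword set (assumption); the project uses NLTK's English list.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "for", "on", "we", "you", "in", "of"}

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(message, stopwords=STOPWORDS, stem=lambda word: word):
    """Lowercase, strip punctuation, drop stopwords, then stem each token."""
    text = message.lower().translate(_PUNCT_TABLE)
    tokens = [tok for tok in text.split() if tok not in stopwords]
    return " ".join(stem(tok) for tok in tokens)


def classify(message, vectorizer, model, **clean_kwargs):
    """Return "spam" or "ham" for a raw SMS message.

    `vectorizer` needs a `transform` method and `model` a `predict` method,
    as scikit-learn's TfidfVectorizer and MultinomialNB do.
    """
    features = vectorizer.transform([clean_text(message, **clean_kwargs)])
    prediction = model.predict(features)[0]
    # Assumption: spam is encoded as 1 or "spam" in the training labels.
    return "spam" if prediction in (1, "spam") else "ham"
```

With NLTK installed, passing `stem=PorterStemmer().stem` and `stopwords=set(stopwords.words("english"))` should bring this closer to the preprocessing described above.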

    spam_sms_detection_project/
    ├── app/
    │   ├── app.py                  # Flask application
    │   ├── templates/
    │   │   └── index.html          # Frontend
    ├── model/
    │   ├── spam_classifier.pkl     # Trained ML model
    │   └── tfidf_vectorizer.pkl    # TF-IDF Vectorizer
    ├── spam.csv                    # Dataset (optional)
    ├── requirements.txt
    └── README.md

Clone the repository:

    git clone https://github.com/AnkithaMadhyastha/spam-detector-app.git
    cd spam-detector-app

The system uses the SMS Spam Collection Dataset, a labeled dataset of SMS messages marked as either spam or ham. It is widely used for NLP tasks and is included in the project as spam.csv.
Columns:
- label – spam or ham
- message – text of the SMS
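
As a rough sketch of how the dataset could be read, the snippet below loads the two columns with the standard library only. `load_dataset` and `label_counts` are hypothetical helper names, and the `latin-1` default encoding is an assumption about how spam.csv is stored; adjust it if your copy differs.

```python
import csv
from collections import Counter


def load_dataset(path="spam.csv", encoding="latin-1"):
    """Read (label, message) pairs from a CSV with `label` and `message` columns."""
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        return [(row["label"].strip().lower(), row["message"]) for row in reader]


def label_counts(rows):
    """Count how many messages carry each label."""
    return Counter(label for label, _ in rows)
```

Calling `label_counts(load_dataset())` gives a quick view of how many spam and ham messages the file contains.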
- Ham example: Hey! Just checking in. Are we still on for lunch today?
- Spam example: Congratulations! You’ve won a free vacation to Bahamas! Call now to claim.
When a message is entered, the app returns:
- ✅ Ham – if the message is safe
- ❌ Spam – if it's a spam message
A flash message and label appear on the web page accordingly.
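
One way this output step might look in Flask is sketched below. It is an illustration under assumptions, not the code in app/app.py: the inline `PAGE` template stands in for templates/index.html, `predict_label` is a hypothetical placeholder for the call into the trained model, and the secret key is a dummy value.

```python
from flask import Flask, flash, render_template_string, request

app = Flask(__name__)
app.secret_key = "change-me"  # flash() needs a session key; placeholder value

# Inline stand-in for templates/index.html so the sketch is self-contained.
PAGE = """
{% for category, text in get_flashed_messages(with_categories=true) %}
  <div class="alert alert-{{ category }}">{{ text }}</div>
{% endfor %}
<form method="post">
  <textarea name="message"></textarea>
  <button type="submit">Predict</button>
</form>
{% if label %}<div class="card">Result: {{ label }}</div>{% endif %}
"""


def predict_label(message):
    """Hypothetical placeholder; the real app would call the trained model here."""
    return "Spam" if "free" in message.lower() else "Ham"


@app.route("/", methods=["GET", "POST"])
def index():
    label = None
    if request.method == "POST":
        label = predict_label(request.form.get("message", ""))
        if label == "Spam":
            flash("Warning: this message looks like spam.", "danger")
        else:
            flash("This message looks safe.", "success")
    return render_template_string(PAGE, label=label)
```

The flash categories (`danger`, `success`) map onto Bootstrap alert classes, which is why the template builds `alert-{{ category }}`.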
Screenshots of the output are included in this GitHub repository as s1.png and s2.png.
The model was trained using:
- Algorithm: Multinomial Naive Bayes
- Vectorizer: TfidfVectorizer
- Text Cleanup:
  - Lowercasing
  - Removing punctuation
  - Removing stopwords
  - Applying stemming (PorterStemmer)
The spam_classifier.pkl and tfidf_vectorizer.pkl files are generated by a model training script that follows the setup above.
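
Assuming the training followed the setup listed above, a script along these lines could produce the two pickle files. It is a sketch only: `train` and `save_artifacts` are hypothetical names, the default `TfidfVectorizer()` and `MultinomialNB()` settings are assumptions, and the messages are expected to be cleaned already (lowercased, punctuation and stopwords removed, stemmed).

```python
import pickle
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB


def train(cleaned_messages, labels):
    """Fit a TF-IDF vectorizer and a Multinomial Naive Bayes model."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(cleaned_messages)
    model = MultinomialNB()
    model.fit(features, labels)
    return vectorizer, model


def save_artifacts(vectorizer, model, model_dir="model"):
    """Pickle both objects under the file names used in the project layout."""
    out = Path(model_dir)
    out.mkdir(parents=True, exist_ok=True)
    with open(out / "tfidf_vectorizer.pkl", "wb") as handle:
        pickle.dump(vectorizer, handle)
    with open(out / "spam_classifier.pkl", "wb") as handle:
        pickle.dump(model, handle)
```

The vectorizer is saved alongside the model because the app must transform new messages with the same vocabulary the model was trained on.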
This project is licensed under the MIT License.
You are free to use, modify, and distribute it with credit to the author.