SMS Spam Classifier 📱 🚫

AI-Powered Message Filtering with NLP

🚀 Overview

SMS Spam Classifier is a machine learning web application that detects spam messages with high accuracy using Natural Language Processing (NLP) techniques. Built with Streamlit, it provides an intuitive interface where users can paste any SMS or email text and instantly receive a spam/not-spam prediction.

The model leverages TF-IDF vectorization and a trained classification algorithm to analyze message patterns, making it effective at identifying phishing attempts, promotional spam, and fraudulent messages.

📖 About

The Problem: Spam messages clutter our inboxes, waste time, and pose security risks through phishing and scams. Manual filtering is tedious, and simple keyword-based filters are easily bypassed by sophisticated spam techniques.

The Solution: This classifier uses machine learning trained on thousands of real SMS messages to identify spam with high precision. By analyzing linguistic patterns, word frequencies, and message structure, it can detect spam even when traditional filters fail.

The Vision: To provide an accessible, lightweight spam detection tool that anyone can deploy locally or in the cloud. Whether you're building a messaging app, email client, or just want to test suspicious messages, this classifier offers production-ready spam detection.

✨ Key Features

🤖 Machine Learning Classification

Pre-trained Model: Ready-to-use classifier trained on a real SMS spam dataset (spam.csv).
Measured Performance: Evaluation metrics are reported in the sms-spam-detection.ipynb notebook.
Binary Classification: Clear spam/not-spam predictions.

🔤 Advanced NLP Pipeline

Text Preprocessing: Lowercasing, tokenization, and punctuation removal.
Stopword Filtering: Removes common words that don't contribute to spam detection.
Stemming: Reduces words to their root form using Porter Stemmer.
TF-IDF Vectorization: Converts text to numerical features for the ML model.
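
The preprocessing steps above can be sketched without any dependencies. This is only an illustration of the order of operations, not the code in app.py: the stopword set is a tiny hand-picked sample rather than the NLTK corpus, `crude_stem` is a hypothetical suffix stripper standing in for the Porter Stemmer, and a regular expression replaces the NLTK tokenizer.

```python
import re

# Tiny illustrative stopword set (assumption); the app uses NLTK's English stopwords.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "you", "for", "at", "we"}

# Suffixes removed by the stand-in stemmer. This is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, drop punctuation and stopwords, then stem."""
    text = text.lower()
    # Keeping only runs of letters and digits also removes punctuation.
    tokens = re.findall(r"[a-z0-9]+", text)
    tokens = [t for t in tokens if t not in STOPWORDS]
    return " ".join(crude_stem(t) for t in tokens)


print(preprocess("Hey, are we still meeting for lunch at 1pm today?"))
# hey still meet lunch 1pm today
```

Swapping in NLTK's tokenizer, stopword list, and Porter Stemmer would change the exact tokens produced, so the output here should not be compared against the app's.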

🌐 Interactive Web Interface

Streamlit UI: Clean, responsive interface built with Streamlit.
Real-time Predictions: Instant classification results.
Easy Deployment: Can be deployed to Streamlit Cloud, Heroku, or any cloud platform.

🛠️ Technical Stack

Machine Learning

Algorithm: Classification model (stored in model.pkl)
Vectorization: TF-IDF (Term Frequency-Inverse Document Frequency)
Library: scikit-learn
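
To make the TF-IDF step concrete, here is a small pure-Python sketch. It uses the smoothed form idf(t) = ln((1 + n) / (1 + df(t))) + 1 followed by L2 normalization, which is the default behaviour documented for scikit-learn's TfidfVectorizer; whether the vectorizer stored in vectorizer.pkl was fitted with default settings is an assumption, and the function name `tfidf` is made up for this example.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, rows); each row is an L2-normalized TF-IDF vector."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [counts[t] * idf[t] for t in vocab]
        # Guard against an all-zero vector (empty document).
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        rows.append([v / norm for v in vec])
    return vocab, rows


vocab, rows = tfidf(["free prize claim", "lunch today", "free lunch"])
print(vocab)  # ['claim', 'free', 'lunch', 'prize', 'today']
```

In the first document, "prize" appears in one message and "free" in two, so "prize" receives the larger weight: terms that are rarer across the corpus count for more.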

Natural Language Processing

Tokenization: NLTK word tokenizer
Stemming: Porter Stemmer
Stopwords: NLTK English stopwords corpus

Web Framework

Frontend: Streamlit
Backend: Python 3.8+
Deployment: Configured for Heroku (Procfile, setup.sh)

🏗️ Architecture

The application follows a simple yet effective pipeline:

User Input: User enters SMS/email text in the Streamlit interface
Preprocessing: Text is cleaned, tokenized, and stemmed
Vectorization: Processed text is converted to TF-IDF features
Prediction: ML model classifies the message as spam or not spam
Display: Result is shown to the user in real time
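
The pipeline above can be sketched as a single function. This is a hedged outline rather than the contents of app.py: it assumes model.pkl and vectorizer.pkl are pickled objects exposing `predict` and `transform` in the scikit-learn style, and that the label 1 means spam. The names `load_artifacts` and `classify` are hypothetical.

```python
import pickle


def load_artifacts(model_path="model.pkl", vectorizer_path="vectorizer.pkl"):
    """Load the pickled model and vectorizer (file names taken from the repository)."""
    with open(model_path, "rb") as f:
        model = pickle.load(f)
    with open(vectorizer_path, "rb") as f:
        vectorizer = pickle.load(f)
    return model, vectorizer


def classify(message, preprocess, vectorizer, model):
    """Run one message through preprocess -> vectorize -> predict."""
    cleaned = preprocess(message)
    # The vectorizer expects an iterable of documents, hence the one-item list.
    features = vectorizer.transform([cleaned])
    label = model.predict(features)[0]
    # Assumption: the model was trained with 1 = spam, 0 = not spam.
    return "Spam" if label == 1 else "Not Spam"
```

Passing `preprocess` in as an argument keeps the sketch independent of any particular cleaning function; the same transformation used during training must be applied here, or the TF-IDF features will not line up. Only unpickle files you trust, since loading a pickle can execute arbitrary code.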

🏁 Getting Started

Prerequisites

Python 3.8 or higher
pip (Python package manager)

Installation

Clone the repository
```
git clone https://github.com/sriram9573/sms-spam-classifier.git
cd sms-spam-classifier
```

Install dependencies
```
pip install -r requirements.txt
```

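Download NLTK data
```
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords')"
```
Recent NLTK releases (3.9 and later) load the tokenizer from the punkt_tab resource instead of punkt. If tokenization raises a LookupError, also run nltk.download('punkt_tab').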

Run the application
```
streamlit run app.py
```

Access the app

Open your browser and navigate to http://localhost:8501

🎯 Usage

Launch the app using streamlit run app.py
Enter a message in the text area (SMS or email content)
Click "Predict" to classify the message
View the result: The app will display "Spam" or "Not Spam"

Example Messages to Test:

Spam:

URGENT! You've won a $1000 Walmart gift card. Click here to claim: http://bit.ly/scam123

Not Spam:

Hey, are we still meeting for lunch at 1pm today?

📊 Model Details

Training Data: spam.csv - SMS Spam Collection dataset
Features: TF-IDF vectors extracted from preprocessed text
Model: Pre-trained classifier (stored in model.pkl)
Vectorizer: Pre-fitted TF-IDF vectorizer (stored in vectorizer.pkl)

The Jupyter notebook sms-spam-detection.ipynb contains the full training pipeline and model evaluation metrics.
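
For readers who want to check the model on their own labelled messages, the usual metrics can be computed as below. This is a generic sketch, not code from the notebook; the function name `spam_metrics` and the convention that 1 marks spam are assumptions.

```python
def spam_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for the spam (positive) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        # Precision: of the messages flagged as spam, how many really were spam.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Recall: of the real spam messages, how many were caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


print(spam_metrics([1, 0, 1, 0, 0], [1, 0, 0, 0, 1]))
# {'accuracy': 0.6, 'precision': 0.5, 'recall': 0.5}
```

Precision deserves particular attention for a spam filter, because a false positive means a legitimate message was flagged.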

🚀 Deployment

Deploy to Streamlit Cloud

Push your code to GitHub
Go to share.streamlit.io
Connect your repository
Deploy!

Deploy to Heroku

The repository includes Heroku configuration files:

Procfile: Defines the web process
setup.sh: Streamlit configuration script

```
heroku create your-app-name
git push heroku main
```

🚀 Future Roadmap

Multi-language support
Confidence score display
Model retraining interface
API endpoint for programmatic access
Browser extension integration
Real-time email client integration
