A classical (non-deep-learning) machine learning pipeline for image classification built on the Intel Image Classification dataset. The pipeline covers the full lifecycle — raw data → preprocessing → HOG feature extraction → classifier training → evaluation → deployment & simulation.
This project explores how far classical machine learning can go on a real-world image classification task without any deep learning. It implements three classifiers (SVM and Logistic Regression via scikit-learn; KNN from scratch), all fed with HOG (Histogram of Oriented Gradients) feature descriptors extracted from preprocessed images.
Intel Image Classification — available on Kaggle
| Split | Folder | Images | Labels |
|---|---|---|---|
| Train | data/raw/seg_train/ | ~14,000 | ✅ 6 classes |
| Test | data/raw/seg_test/ | ~3,000 | ✅ 6 classes |
| Predict | data/raw/seg_pred/ | ~7,300 | ❌ unlabelled |
Classes: buildings · forest · glacier · mountain · sea · street
├── config.yaml # Central configuration (image size, HOG params, classifiers)
├── requirements.txt
│
├── data/
│ └── raw/
│ ├── seg_train/ # Labelled training images (class subfolders)
│ ├── seg_test/ # Labelled test images (class subfolders)
│ └── seg_pred/ # Unlabelled images for inference
│
├── src/
│ ├── preprocessing/
│ │ ├── image_preprocessor.py # Resize → grayscale → normalise
│ │ └── data_loader.py # Scan splits, train/val split
│ ├── features/
│ │ └── hog_extractor.py # HOG feature extraction (skimage)
│ ├── classifiers/
│ │ ├── svm_classifier.py # SVM (linear / RBF) via sklearn
│ │ ├── logistic_regression.py # Logistic Regression via sklearn
│ │ └── knn_classifier.py # KNN from scratch (euclidean / manhattan)
│ └── evaluation/
│ └── evaluator.py # Accuracy, classification report, confusion matrix
│
├── model/
│ ├── model.ipynb # ⭐ Preprocess + train final SVM-RBF → saves svm_rbf.pkl
│ └── svm_rbf.pkl # ⚠️ gitignored — generate locally by running model.ipynb
│
├── notebooks/
│ └── modelComparing.ipynb # Compare all classifiers on held-out test set
│
└── experiments/
├── simulate.ipynb # ⭐ Interactive single-image prediction simulator
├── predict.py # Batch inference on all seg_pred/ images → predictions.csv
└── predictions.csv # ⚠️ gitignored — generate locally by running predict.py
Raw Images (seg_train + seg_test)
│
▼
ImagePreprocessor
• Resize to 128×128
• Convert to grayscale
• Normalise pixels to [0, 1]
│
▼
HOGExtractor
• Orientations : 9
• Pixels/cell : 8×8
• Cells/block : 2×2
• Output dim : 3,969 features per image
│
▼
Classifier (SVM RBF — best performer)
│
▼
Predictions / Evaluation
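Under the hood, the first two stages of the diagram amount to roughly the following (a minimal sketch using scikit-image; the actual ImagePreprocessor and HOGExtractor classes in src/ wrap equivalent logic and read their parameters from config.yaml):

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog
from skimage.transform import resize

def preprocess(img, size=(128, 128)):
    """Resize -> grayscale -> normalise to [0, 1]."""
    if img.ndim == 3:
        img = rgb2gray(img)            # float grayscale, already in [0, 1]
    img = resize(img, size, anti_aliasing=True)
    return np.clip(img, 0.0, 1.0).astype(np.float32)

def extract_hog(gray):
    """HOG descriptor with the parameters listed in the diagram."""
    return hog(
        gray,
        orientations=9,
        pixels_per_cell=(8, 8),
        cells_per_block=(2, 2),
        feature_vector=True,           # flatten to a single 1-D vector
    )
```

The exact length of the resulting feature vector follows from the image size and the HOG cell/block settings, so changing any of them in config.yaml changes the classifier's input dimension.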
| Classifier | Variant | Accuracy | Notes |
|---|---|---|---|
| SVM | kernel='rbf', C=1.2 | 76.23% | ⭐ Best overall — selected as final model |
| SVM | kernel='linear', C=1.2 | 64.83% | Good baseline |
| Logistic Regression | C=0.01 | 70.97% | Fast, interpretable |
| KNN + PCA | k=21, euclidean, 22 components | 70.13% | Dimensionality-reduced variant |
| KNN | k=5, manhattan | 49.40% | Custom from-scratch implementation |
Full comparison is in notebooks/modelComparing.ipynb.
The SVM with RBF kernel (C=1.2) achieved the best performance and was selected as the final deployed model.
1. Clone the repository

```bash
git clone https://github.com/ghosteater1311/Classical-Feature-Based-Image-Classification-Pipeline.git
cd Classical-Feature-Based-Image-Classification-Pipeline
```

2. Install dependencies

```bash
pip install -r requirements.txt
```

3. Download the dataset

Download from Kaggle and place the folders so the structure matches data/raw/seg_train/, data/raw/seg_test/, and data/raw/seg_pred/.
Open and run all cells in model/model.ipynb.
This will:
- Scan all images from seg_train/ + seg_test/
- Preprocess and extract HOG features
- Train SVM(kernel='rbf', C=1.2) on the full labelled dataset
- Save the model to model/svm_rbf.pkl
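The training step boils down to roughly the following (a hedged sketch: X here is random stand-in data, whereas the notebook fits on the real HOG matrix; probability=True is an assumption that enables the confidence scores used later by the simulator):

```python
import joblib
import numpy as np
from sklearn.svm import SVC

# Stand-in data: in model.ipynb, X is the (n_samples, n_features) HOG matrix
# built from seg_train/ + seg_test/, and y holds integer class ids 0..5.
rng = np.random.default_rng(0)
X, y = rng.random((60, 100)), rng.integers(0, 6, 60)

model = SVC(kernel='rbf', C=1.2, probability=True)
model.fit(X, y)

joblib.dump(model, 'svm_rbf.pkl')   # saved as model/svm_rbf.pkl in the repo layout
```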
```bash
python experiments/predict.py
```

Outputs experiments/predictions.csv with columns: filename, predicted_label, predicted_class_id.
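The batch-inference script presumably does something along these lines (a sketch; the real predict.py may take its paths and column names from config.yaml, and the `*.jpg` glob is an assumption about the seg_pred/ file format):

```python
import csv
from pathlib import Path

import joblib
from skimage.color import rgb2gray
from skimage.feature import hog
from skimage.io import imread
from skimage.transform import resize

CLASSES = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

def extract_features(path):
    """Resize -> grayscale -> normalise -> HOG, mirroring the pipeline above."""
    img = imread(path)
    if img.ndim == 3:
        img = rgb2gray(img)
    img = resize(img, (128, 128), anti_aliasing=True)
    return hog(img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), feature_vector=True)

def predict_folder(model_path='model/svm_rbf.pkl',
                   pred_dir='data/raw/seg_pred',
                   out_csv='experiments/predictions.csv'):
    """Run the saved model over every image in pred_dir and write a CSV."""
    model = joblib.load(model_path)
    with open(out_csv, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['filename', 'predicted_label', 'predicted_class_id'])
        for path in sorted(Path(pred_dir).glob('*.jpg')):
            class_id = int(model.predict([extract_features(path)])[0])
            writer.writerow([path.name, CLASSES[class_id], class_id])
```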
Open experiments/simulate.ipynb, set IMAGE_NAME to any filename from seg_pred/, and run the cells to see:
- The original and preprocessed image side by side
- The predicted class with confidence percentage
- A probability bar chart across all 6 classes
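In essence, the simulator reduces to the following once the model is loaded (a sketch; it assumes the pickled SVC was trained with probability=True so that predict_proba is available, and it omits the image display and matplotlib bar chart):

```python
import numpy as np

CLASSES = ['buildings', 'forest', 'glacier', 'mountain', 'sea', 'street']

def simulate(model, features):
    """Return (predicted_label, confidence, per-class probabilities)."""
    proba = model.predict_proba([features])[0]          # one probability per class
    class_id = int(model.classes_[np.argmax(proba)])    # map back to a class id 0..5
    return CLASSES[class_id], float(proba.max()), proba
```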
This project is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
You are free to share and adapt this work, even commercially, as long as you give appropriate credit and distribute any derivative works under the same license.