A comprehensive Content-Based Image Retrieval (CBIR) system comparing CNN architectures (VGG16, ResNet50, MobileNetV2, Custom CNN) and distance metrics (Euclidean, Manhattan, Bray-Curtis, Canberra) on the UKBench dataset.
## Table of Contents

- Overview
- Features
- System Architecture
- Project Structure
- Technical Details
- Results
- Installation
- Usage
- License
- Acknowledgements
## Overview

This project implements a complete Content-Based Image Retrieval (CBIR) system designed to compare different deep learning approaches for image similarity search. The system consists of two main components:

- Offline Pipeline (`main.ipynb`): feature extraction and model evaluation
- Online Deployment (`image_retrieval_app/`): Flask web application for real-time retrieval
| Task | Status |
|---|---|
| Develop DL-based CBIR system (Pre-trained & Custom CNN) | ✅ |
| Use UKBench dataset (10,200 images) | ✅ |
| Implement multiple distance metrics | ✅ |
| Evaluate using Precision, Recall, F1-Score, and mAP | ✅ |
| Visualize performance using APR plots | ✅ |
| Display visual retrieval results | ✅ |
| Deploy model using Flask | ✅ |
## Features

- Multiple CNN Architectures: VGG16, ResNet50, MobileNetV2, and Custom CNN
- Four Distance Metrics: Euclidean, Manhattan, Bray-Curtis, and Canberra
- Comprehensive Evaluation: mAP, Precision@K, Recall@K, F1-Score@K
- Web Interface: User-friendly Flask application with dark mode UI
- Real-time Retrieval: Fast similarity search using pre-computed features
- Visual Analysis: Automated generation of performance plots and retrieval examples
## System Architecture

```mermaid
graph TD
    subgraph "Offline Pipeline (main.ipynb)"
        A[UKBench Dataset<br/>10,200 images] --> B[Train Custom CNN<br/>on CIFAR-10]
        A --> C{Feature Extraction}
        B --> C
        C --> D[VGG16]
        C --> E[ResNet50]
        C --> F[MobileNetV2]
        C --> G[Custom CNN]
        D --> H[Save Features<br/>.npy files]
        E --> H
        F --> H
        G --> H
        D --> I[Save Models<br/>.keras files]
        E --> I
        F --> I
        G --> I
    end

    subgraph "Online Deployment (app.py)"
        J[User Upload] --> K[Select Model & Metric]
        K --> L[Load Features & Model]
        J --> M[Extract Query Features]
        M --> N[Calculate Distances]
        L --> N
        N --> O[Rank Top-K Results]
        O --> P[Display Results]
    end
```
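In code, the online path of the diagram (upload, feature extraction, distance ranking, display) reduces to a few lines. The sketch below is illustrative rather than `app.py` verbatim: it assumes the VGG16 artifacts produced by the offline pipeline, a simple [0, 1] pixel scaling for the query image, and the Euclidean metric.

```python
# Hedged sketch of the online retrieval flow (not app.py verbatim).
import numpy as np
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image

model = load_model("Models/vgg16_extractor.keras")            # saved by main.ipynb
features = np.load("extracted_features/features_VGG16.npy")   # shape (10200, 512)

def retrieve(img_path: str, k: int = 4) -> np.ndarray:
    """Extract the query feature and return indices of the k nearest images."""
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)[None, ...] / 255.0            # assumption: [0, 1] scaling
    q = model.predict(x, verbose=0)[0]
    dists = np.linalg.norm(features - q, axis=1)              # Euclidean distance
    return np.argsort(dists)[:k]
```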
## Project Structure

```
.
├── image_retrieval_app/
│ ├── app.py # Flask application
│ ├── static/uploads/ # Query image storage
│ └── templates/
│ ├── base.html # Base template (dark mode)
│ ├── index.html # Upload page
│ └── results.html # Results page
│
├── notebook/
│ └── main.ipynb # Experimentation notebook
│
├── Models/
│ ├── vgg16_extractor.keras
│ ├── resnet50_extractor.keras
│ ├── mobilenetv2_extractor.keras
│ └── custom_cnn_extractor.keras
│
├── extracted_features/
│ ├── features_VGG16.npy
│ ├── features_ResNet50.npy
│ ├── features_MobileNetV2.npy
│ └── features_CustomCNN.npy
│
├── ukbench/
│ └── ukbench/full/ # Dataset (10,200 images)
│
├── Image/
│ ├── Flask/ # App screenshots
│ └── *.png # Generated plots
│
├── requirements.txt
├── LICENSE
└── README.md
```
## Technical Details

### CNN Architectures

Four different architectures are compared to evaluate various design philosophies:
| Model | Type | Pre-trained On | Feature Dimension | Description |
|---|---|---|---|---|
| VGG16 | Pre-trained | ImageNet | 512 | Deep network with uniform architecture |
| ResNet50 | Pre-trained | ImageNet | 2048 | Residual connections for deeper networks |
| MobileNetV2 | Pre-trained | ImageNet | 1280 | Efficient mobile-optimized architecture |
| Custom CNN | Custom | CIFAR-10 | 4096 | Trained from scratch, no ImageNet bias |
All pre-trained models use `GlobalAveragePooling2D` after removing the classification head (`include_top=False`).
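For concreteness, here is a minimal sketch of how such an extractor can be built and saved in Keras, consistent with the description above (ImageNet weights, `include_top=False`, global average pooling); the notebook's exact code may differ.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.models import Model

# ImageNet weights, classification head removed (include_top=False).
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))

# Global average pooling collapses the 7x7x512 map into a 512-d feature vector,
# matching the feature dimension listed in the table above.
extractor = Model(inputs=base.input,
                  outputs=GlobalAveragePooling2D()(base.output))
extractor.save("Models/vgg16_extractor.keras")  # path as in Project Structure
```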
### Distance Metrics

Four distance metrics are implemented to compare different similarity measures (a NumPy sketch follows the definitions):
- Euclidean Distance: standard L2 norm, sensitive to magnitude

  `d(v₁, v₂) = √(Σᵢ (v₁ᵢ - v₂ᵢ)²)`

- Manhattan Distance: L1 norm, less sensitive to outliers

  `d(v₁, v₂) = Σᵢ |v₁ᵢ - v₂ᵢ|`

- Bray-Curtis Distance: bounded in [0, 1], effective for non-negative features

  `d(v₁, v₂) = Σᵢ |v₁ᵢ - v₂ᵢ| / Σᵢ (v₁ᵢ + v₂ᵢ)`

- Canberra Distance: a weighted Manhattan distance, sensitive to small values

  `d(v₁, v₂) = Σᵢ |v₁ᵢ - v₂ᵢ| / (|v₁ᵢ| + |v₂ᵢ|)`
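These formulas translate directly into NumPy. The sketch below is a self-contained reference implementation of the four definitions above, not necessarily the repository's code; SciPy's `scipy.spatial.distance` module provides equivalent `euclidean`, `cityblock`, `braycurtis`, and `canberra` functions.

```python
import numpy as np

def euclidean(v1: np.ndarray, v2: np.ndarray) -> float:
    # L2 norm: sensitive to magnitude.
    return float(np.sqrt(np.sum((v1 - v2) ** 2)))

def manhattan(v1: np.ndarray, v2: np.ndarray) -> float:
    # L1 norm: less sensitive to outliers.
    return float(np.sum(np.abs(v1 - v2)))

def bray_curtis(v1: np.ndarray, v2: np.ndarray) -> float:
    # Assumes non-negative features (true for ReLU/pooled CNN activations).
    return float(np.sum(np.abs(v1 - v2)) / np.sum(v1 + v2))

def canberra(v1: np.ndarray, v2: np.ndarray) -> float:
    denom = np.abs(v1) + np.abs(v2)
    mask = denom > 0  # convention: skip coordinates where both components are zero
    return float(np.sum(np.abs(v1 - v2)[mask] / denom[mask]))
```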
### Evaluation Metrics

Performance is evaluated at K=4, matching UKBench's four images per object (a sketch of the per-query computation follows the list):
- Mean Average Precision (mAP): Primary metric for ranking quality
- Precision@K: Proportion of relevant images in top-K results
- Recall@K: Proportion of all relevant images retrieved
- F1-Score@K: Harmonic mean of Precision and Recall
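Because UKBench stores each object's four views consecutively, relevance is derivable from the image index alone. The following is a minimal sketch under that assumption (image `i`'s relevant set is the other three images in its group of four, with the query itself excluded from the ranking); it is consistent with the Recall@4 ≈ (4/3) × Precision@4 pattern visible in the results table below.

```python
def precision_recall_at_k(query_idx: int, ranked_indices, k: int = 4):
    """Precision@K and Recall@K for one UKBench query.

    ranked_indices: database indices sorted by ascending distance,
    with the query image itself already removed.
    """
    group = query_idx // 4
    relevant = {i for i in range(4 * group, 4 * group + 4) if i != query_idx}
    hits = len(relevant.intersection(ranked_indices[:k]))
    return hits / k, hits / len(relevant)  # Precision@K, Recall@K (3 relevant per query)
```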
## Results

The following table summarizes the evaluation results for each model and metric combination:
| Model | Metric | mAP | Precision@4 | Recall@4 | F1-Score@4 |
|---|---|---|---|---|---|
| CustomCNN | BrayCurtis | 0.4993 | 0.3682 | 0.4910 | 0.4208 |
| CustomCNN | Canberra | 0.4913 | 0.3633 | 0.4844 | 0.4152 |
| CustomCNN | Euclidean | 0.4638 | 0.3395 | 0.4527 | 0.3880 |
| CustomCNN | Manhattan | 0.4954 | 0.3652 | 0.4869 | 0.4174 |
| MobileNetV2 | BrayCurtis | 0.9147 | 0.6798 | 0.9064 | 0.7769 |
| MobileNetV2 | Canberra | 0.8938 | 0.6631 | 0.8842 | 0.7579 |
| MobileNetV2 | Euclidean | 0.9176 | 0.6826 | 0.9102 | 0.7802 |
| MobileNetV2 | Manhattan | 0.9090 | 0.6741 | 0.8988 | 0.7704 |
| ResNet50 | BrayCurtis | 0.9235 | 0.6864 | 0.9152 | 0.7844 |
| ResNet50 | Canberra | 0.9207 | 0.6851 | 0.9135 | 0.7830 |
| ResNet50 | Euclidean | 0.9270 | 0.6902 | 0.9203 | 0.7888 |
| ResNet50 | Manhattan | 0.9270 | 0.6895 | 0.9193 | 0.7880 |
| VGG16 | BrayCurtis | 0.8958 | 0.6635 | 0.8847 | 0.7583 |
| VGG16 | Canberra | 0.8607 | 0.6372 | 0.8495 | 0.7282 |
| VGG16 | Euclidean | 0.8955 | 0.6643 | 0.8858 | 0.7592 |
| VGG16 | Manhattan | 0.8920 | 0.6606 | 0.8808 | 0.7550 |
All plots are automatically generated by `main.ipynb` and saved to the `/Image` directory:

- mAP Bar Chart: comparison across all models and metrics
- Precision@4 Bar Chart
- Recall@4 Bar Chart
- F1-Score@4 Bar Chart
- APR Curves: precision-recall trade-off visualization for each metric
- Retrieval Examples: qualitative results with query and top-K matches (a few representative examples from the `retrieval_results/` subfolder)
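As an illustration, the mAP bar chart can be reproduced from the results table with a few lines of Matplotlib. This is a sketch using the rounded values above, not the notebook's plotting code, and the output filename is hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np

models = ["CustomCNN", "MobileNetV2", "ResNet50", "VGG16"]
metrics = ["BrayCurtis", "Canberra", "Euclidean", "Manhattan"]
# mAP values from the results table, one row per model (columns follow `metrics`).
map_scores = np.array([
    [0.4993, 0.4913, 0.4638, 0.4954],
    [0.9147, 0.8938, 0.9176, 0.9090],
    [0.9235, 0.9207, 0.9270, 0.9270],
    [0.8958, 0.8607, 0.8955, 0.8920],
])

x = np.arange(len(models))
width = 0.2
for j, metric in enumerate(metrics):
    # Offset each metric's bars so the four bars per model sit side by side.
    plt.bar(x + (j - 1.5) * width, map_scores[:, j], width, label=metric)
plt.xticks(x, models)
plt.ylabel("mAP")
plt.legend()
plt.savefig("Image/map_comparison.png")  # hypothetical filename
```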
The evaluation demonstrates the trade-offs between:
- Model complexity (parameters vs. performance)
- Feature dimensionality (computational cost vs. accuracy)
- Distance metrics (sensitivity vs. robustness)
The deployment provides an intuitive interface for real-time image retrieval:

| Upload Page | Results Page |
|---|---|
| ![]() | ![]() |
## Installation

### Prerequisites

- Python 3.10 or higher
- pip package manager
- Jupyter Notebook (for experimentation)
1. Clone the repository:

   ```bash
   git clone https://github.com/your-username/image-retrieval-system.git
   cd image-retrieval-system
   ```

2. Install the dependencies:

   ```bash
   pip install -r requirements.txt
   ```

3. Download the UKBench Dataset and place the `full/` directory (containing 10,200 `.jpg` files) into:

   ```
   ukbench/ukbench/full/
   ```
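A quick, optional sanity check (not part of the repo) confirms the dataset landed where the notebook and app expect it:

```python
from pathlib import Path

# Expect 10,200 UKBench images under the path used by the notebook and app.
n_images = len(list(Path("ukbench/ukbench/full").glob("*.jpg")))
print(f"Found {n_images} images (expected 10200)")
```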
## Usage

### Part 1: Offline Pipeline (`main.ipynb`)

1. Navigate to the notebook directory:

   ```bash
   cd notebook
   ```

2. Launch Jupyter Notebook:

   ```bash
   jupyter notebook main.ipynb
   ```

3. Run all cells to:
   - Train the Custom CNN on CIFAR-10
   - Extract features for all images using all models
   - Save models to the `/Models` directory
   - Save features to the `/extracted_features` directory
   - Generate evaluation plots in the `/Image` directory
### Part 2: Online Deployment (`image_retrieval_app/`)

1. Navigate to the app directory:

   ```bash
   cd image_retrieval_app
   ```

2. Start the Flask server:

   ```bash
   python app.py
   ```

3. Open your browser and visit `http://127.0.0.1:5000`.

4. Upload an image, select a model and distance metric, and retrieve similar images!
## License

This project is licensed under the MIT License. See the `LICENSE` file for details.
## Acknowledgements

- UKBench Dataset: Nister, D., & Stewenius, H. (2006). "Scalable recognition with a vocabulary tree." 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).
- Pre-trained models from Keras Applications
- Flask web framework
- TensorFlow/Keras deep learning library
If you use this project in your research, please cite:

```bibtex
@misc{image_retrieval_system,
  title={Deep Learning-Based Image Retrieval System},
  author={Bayu Ardiyansyah},
  year={2025},
  publisher={GitHub},
  url={https://github.com/RazerArdi/Design-and-Evaluation-of-a-Deep-Learning-Image-Retrieval-System-Using-CNNs-and-Distance-Metrics}
}
```

Made for Computer Vision Research