Image Analysis Toolkit

A collection of AI-powered computer vision tools for image analysis, search, classification, and privacy protection. Built on the CLIP and YOLOv8 deep learning models, with FAISS for vector similarity search.

Author: Madhavendranath S
Email: madhavendranaths@gmail.com


🎯 Overview

This toolkit provides production-ready implementations of common computer vision tasks, leveraging modern AI models for:

  • Image Metadata Extraction & OCR - Extract EXIF data, GPS coordinates, and text from images
  • Scene Classification - Classify images as indoor/outdoor and identify specific scene types
  • Semantic Image Search - Search images using natural language or reference images
  • Privacy Anonymization - Automatically detect and blur/pixelate people in images
  • Image Similarity Analysis - Robust similarity scoring across transformations
  • Object-Level Search - Find images containing specific objects

All modules are designed to be used independently or integrated into larger systems.


✨ Features

🔍 Intelligent Search & Classification

  • Zero-shot scene classification (indoor/outdoor + 20+ scene types)
  • Text-to-image search with natural language queries
  • Image-to-image similarity search
  • Object-level retrieval across large datasets

🛡️ Privacy & Security

  • Automatic human detection and anonymization
  • Multiple privacy modes (blur, mosaic, black-fill)
  • Face detection fallback for edge cases

📊 Analysis & Metrics

  • Comprehensive metadata extraction (EXIF, GPS, timestamps)
  • OCR with language detection
  • Multi-metric similarity scoring (pHash, SSIM, ORB)
  • Robustness testing across transformations

🚀 Performance

  • GPU acceleration support (CUDA/MPS)
  • Efficient caching systems
  • Batch processing capabilities
  • Production-optimized implementations

🚀 Quick Start

Prerequisites

  • Python 3.9 or higher
  • Virtual environment (recommended)
  • GPU (optional, recommended for best performance)

Installation

# Clone the repository
git clone https://github.com/Madhav-000-s/image-analysis-toolkit.git
cd image-analysis-toolkit

# Navigate to desired module (example: scene classifier)
cd indoor-outdoor-classifier

# Create virtual environment
python -m venv .venv
source .venv/bin/activate    # Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Quick Example

# Classify an image scene
python src/predict.py --image path/to/image.jpg --out_dir outputs

# Search images with text
cd ../search-images-with-text-or-image
python src/predict.py --text-search "sunset beach"

# Anonymize people in image
cd ../human-only-blur
python src/blur.py --to-blur path/to/image.jpg

🧰 Modules

1. Metadata Extractor & OCR

Location: image_metadata_showcase/

Extract comprehensive metadata from images including EXIF data, GPS coordinates, camera settings, timestamps, and embedded text.

Key Features:

  • EXIF/IPTC/XMP metadata parsing
  • GPS coordinate extraction and mapping
  • OCR with Tesseract (90+ languages)
  • Language detection
  • Camera settings analysis

How to Use:

  • Open the Colab notebook for interactive usage
  • Ideal for digital forensics, photo management, content verification
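
EXIF stores GPS positions as degrees, minutes, and seconds plus a hemisphere reference (N/S/E/W), so mapping them usually means converting to signed decimal degrees first. The snippet below is a minimal sketch of that conversion, not the module's actual code; the function name `dms_to_decimal` and the sample coordinates are hypothetical.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus a hemisphere ref to decimal degrees."""
    degrees, minutes, seconds = (Fraction(v) for v in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South latitudes and West longitudes are negative by convention.
    if ref.upper() in ("S", "W"):
        decimal = -decimal
    return float(decimal)


# Hypothetical sample values, as they might be read from GPSLatitude/GPSLongitude.
lat = dms_to_decimal((40, 26, 46), "N")
lon = dms_to_decimal((79, 58, 56), "W")
print(round(lat, 6), round(lon, 6))
```

The resulting pair can be passed directly to most mapping services, which expect signed decimal degrees.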

→ Full Documentation


2. Scene Classifier

Location: indoor-outdoor-classifier/

Zero-shot scene classification using CLIP (ViT-B/32) trained on LAION-2B. Classifies images as indoor/outdoor and identifies specific scene types (office, park, street, etc.).

Key Features:

  • Indoor vs outdoor classification
  • 20+ scene type recognition (customizable)
  • Confidence blending for improved accuracy
  • Batch processing support
  • Annotated preview generation

Usage:

# Single image
python src/predict.py --image path/to/image.jpg --out_dir outputs

# Batch processing
python src/predict.py --images_dir images/ --batch_size 8 --topk 3

Outputs: CSV with predictions + annotated preview images
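
Zero-shot classification with CLIP generally works by embedding the image and a set of text prompts, then turning the image-to-prompt cosine similarities into probabilities with a softmax. The sketch below illustrates only that scoring step, under stated assumptions: the 3-dimensional vectors are toy stand-ins for real CLIP embeddings, the prompt strings and the `scale` value of 100 are illustrative, and the module's confidence blending is not reproduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def zero_shot_scores(image_vec, prompt_vecs, scale=100.0):
    """Softmax over scaled cosine similarities between one image and each prompt."""
    logits = {label: scale * cosine(image_vec, vec) for label, vec in prompt_vecs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - peak) for label, value in logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy vectors standing in for real text and image embeddings (hypothetical values).
prompts = {
    "a photo taken indoors": [0.9, 0.1, 0.0],
    "a photo taken outdoors": [0.1, 0.9, 0.2],
}
image = [0.2, 0.8, 0.1]

scores = zero_shot_scores(image, prompts)
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Because the labels are just text prompts, adding a scene type in this scheme means adding another prompt rather than retraining a model.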

→ Full Documentation


3. Image Search (Text/Image)

Location: search-images-with-text-or-image/

Semantic image search using CLIP embeddings. Find images using natural language descriptions or reference images.

Key Features:

  • Text-to-image search ("desert sunset", "busy street")
  • Image-to-image similarity search
  • Fast cached embedding system
  • Cosine similarity with softmax scoring
  • Support for animated GIFs

Usage:

# Text search
python src/predict.py --text-search "desert landscape" --topk 5

# Image search
python src/predict.py --image-search query.jpg --topk 5

# Rebuild index after adding images
python src/predict.py --reindex

Outputs: Ranked results with similarity scores + JSON export
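
Since CLIP maps text and images into the same embedding space, a text query and an image query can both be ranked against the cached image embeddings in the same way. The following is a rough sketch of that ranking step using plain Python lists; the `top_k` helper, the file names, and the vectors are hypothetical and stand in for the real cached index.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k(query_vec, index, k=5):
    """Rank cached embeddings by cosine similarity to the query embedding."""
    q = normalize(query_vec)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), name)
        for name, vec in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(name, score) for score, name in scored[:k]]


# Toy index: file name -> embedding (hypothetical values).
index = {
    "beach.jpg": [0.9, 0.1, 0.1],
    "office.jpg": [0.1, 0.9, 0.2],
    "dunes.jpg": [0.8, 0.2, 0.3],
}

results = top_k([1.0, 0.0, 0.1], index, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```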

→ Full Documentation


4. Privacy Anonymizer

Location: human-only-blur/

Automatically detect and anonymize people in images using YOLOv8 segmentation. Useful for GDPR-oriented workflows and general privacy protection.

Key Features:

  • Precise human segmentation (not just bounding boxes)
  • Three anonymization modes: Gaussian blur, mosaic pixelation, black-fill
  • Adjustable blur strength and feathering
  • Face detection fallback (Haar cascade)
  • GPU acceleration

Usage:

# Basic blur
python src/blur.py --to-blur image.jpg

# Strong pixelation
python src/blur.py --to-blur image.jpg --mode mosaic --mosaic-tile 24

# Maximum privacy (black fill)
python src/blur.py --to-blur image.jpg --mode black

Outputs: Anonymized images in blurredimages/
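
To illustrate how mask-based mosaic pixelation can work, the sketch below replaces every masked pixel with the mean of its tile and leaves pixels outside the mask untouched. It is an assumption-laden toy, not the code in `src/blur.py`: it operates on a small grayscale image stored as nested lists, and the `mosaic` function and its `tile` parameter (loosely analogous to `--mosaic-tile`) are hypothetical.

```python
def mosaic(image, mask, tile=2):
    """Pixelate the masked pixels of a 2-D grayscale image given as a list of rows."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # copy so the input stays unchanged
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            cells = [
                (y, x)
                for y in range(top, min(top + tile, height))
                for x in range(left, min(left + tile, width))
            ]
            mean = sum(image[y][x] for y, x in cells) // len(cells)
            for y, x in cells:
                if mask[y][x]:  # only touch pixels inside the person mask
                    out[y][x] = mean
    return out


image = [
    [10, 20, 200, 210],
    [30, 40, 220, 230],
]
mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

result = mosaic(image, mask, tile=2)
for row in result:
    print(row)
```

Working from a segmentation mask rather than a bounding box is what keeps the background around a person intact.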

→ Full Documentation


5. Similarity Analyzer

Location: image-similarity-suite/

Comprehensive image similarity testing with multiple metrics. Analyze robustness across rotations, scaling, compression, and filters.

Key Features:

  • Multi-metric scoring (pHash, dHash, aHash, SSIM, ORB)
  • Parametric testing (rotation angles, scale factors, JPEG quality)
  • Automatic plot generation
  • Preview montage grid
  • CSV export for analysis

Usage:

python src/similarity_suite.py --image path/to/image.jpg \
  --angles 1,10,17 \
  --scales 0.5,0.75,1.25 \
  --do-gray --do-blur --do-sharpen \
  --jpeg-qualities 95,80,60 \
  --preview-grid

Outputs:

  • results.csv - Similarity scores per transformation
  • plots/ - Metric vs parameter graphs
  • previews/ - Visual comparison grid
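
As a rough illustration of the perceptual-hash family of metrics, the sketch below computes an average hash (aHash) and converts the Hamming distance between two hashes into a similarity score. It assumes the image has already been converted to grayscale and resized to 8x8, which is the usual preprocessing for aHash; the helper names are hypothetical and the suite's own implementation may differ.

```python
def average_hash(pixels):
    """Return a list of bits: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def hash_similarity(a, b):
    """Map Hamming distance to a score in [0, 1], where 1 means identical hashes."""
    return 1.0 - hamming(a, b) / len(a)


# 8x8 horizontal gradient plus two transformed copies (toy data).
original = [[x * 30 for x in range(8)] for _ in range(8)]
brighter = [[value + 20 for value in row] for row in original]
mirrored = [row[::-1] for row in original]

base = average_hash(original)
print(hash_similarity(base, average_hash(brighter)))
print(hash_similarity(base, average_hash(mirrored)))
```

A uniform brightness shift moves every pixel and the mean together, so the hash is unchanged, while mirroring rearranges the bright and dark regions and the score drops.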

→ Full Documentation


6. Object Search

Location: object-level-image-search/

Advanced object-level image retrieval using YOLOv8 detection + CLIP embeddings + FAISS indexing. Find images containing similar objects.

Key Features:

  • Object detection with YOLOv8
  • Per-object CLIP embeddings
  • FAISS vector similarity search
  • Automatic region caching
  • IoU-based deduplication
  • Optional query-side detection

Usage:

# Build index (first time)
python src/script.py --reindex --image-folder images/

# Search for object
python src/script.py --search-object query.jpg --topk 10

# Advanced options
python src/script.py --search-object query.jpg \
  --detector yolov8s.pt \
  --min-conf 0.3 \
  --save-viz \
  --allow-multiple-per-image

Outputs: Ranked object matches + optional visualizations
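
IoU-based deduplication can be sketched as follows: compute the intersection over union of two boxes, then greedily keep the highest-scoring detection and drop any later box that overlaps a kept one too strongly. This is a generic illustration and not necessarily how `src/script.py` does it; the box format `(x1, y1, x2, y2)`, the `deduplicate` helper, and the 0.5 threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def deduplicate(detections, threshold=0.5):
    """Keep the best-scoring box from each group of heavily overlapping boxes."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept


detections = [
    ((10, 10, 50, 50), 0.90),
    ((12, 12, 52, 52), 0.75),  # near-duplicate of the first box
    ((100, 100, 140, 140), 0.60),
]

unique = deduplicate(detections)
print(unique)
```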

→ Full Documentation


🧠 Models Used

| Module | Model(s) | Purpose | Size |
|---|---|---|---|
| Metadata Extractor | Tesseract OCR | Text extraction | ~40MB |
| Scene Classifier | CLIP ViT-B/32 (LAION-2B) | Zero-shot classification | ~600MB |
| Image Search | CLIP ViT-B/32 (LAION-2B) | Semantic embeddings | ~600MB |
| Privacy Anonymizer | YOLOv8-seg (nano) | Human segmentation | ~7MB |
| Privacy Anonymizer | OpenCV Haar Cascade | Face detection fallback | <1MB |
| Similarity Analyzer | ORB (OpenCV) | Feature matching | N/A (classical) |
| Object Search | YOLOv8 (nano) | Object detection | ~6.5MB |
| Object Search | CLIP ViT-B/32 | Object embeddings | ~600MB |
| Object Search | FAISS | Vector search | N/A (library) |

All models are automatically downloaded on first run.


💡 Use Cases

Content Management & Media

  • Automated photo organization and tagging
  • Visual search for stock photo libraries
  • Duplicate image detection
  • Content moderation

Privacy & Compliance

  • GDPR-compliant image anonymization
  • Dataset preparation for machine learning
  • Public media privacy protection
  • Automated redaction pipelines

Research & Analysis

  • Digital forensics and metadata analysis
  • Image provenance verification
  • Computer vision benchmarking
  • Similarity testing and evaluation

E-commerce & Retail

  • Visual product search
  • Scene-based product recommendations
  • Similar item discovery
  • Catalog organization

Security & Surveillance

  • Scene classification for monitoring
  • Privacy-preserving video analytics
  • Object-based search in footage
  • Metadata extraction for evidence

📋 Requirements

System Requirements

  • Python: 3.9 or higher
  • OS: Windows, Linux, macOS
  • RAM: 8GB minimum (16GB recommended for large batches)
  • Storage: 2-5GB for models and cache

GPU Support (Optional but Recommended)

  • NVIDIA GPU: CUDA 11.8+ (PyTorch with CUDA)
  • Apple Silicon: MPS acceleration (M1/M2/M3)
  • Performance: 5-10x faster inference with GPU
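
A common way to choose between CUDA, MPS, and CPU at runtime is sketched below. It assumes PyTorch as listed in the dependencies and falls back to the CPU when PyTorch is missing or no accelerator is reported; the `pick_device` helper is hypothetical and the modules may select their device differently.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent in older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Using device: {device}")
```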

Python Dependencies

Each module has its own requirements.txt. Common dependencies:

  • PyTorch (torch, torchvision)
  • OpenCV (opencv-python)
  • Pillow (PIL)
  • NumPy, Pandas
  • Model-specific libraries (open-clip-torch, ultralytics, transformers, faiss)

🛠️ Development

Project Structure

image-analysis-toolkit/
├── image_metadata_showcase/           # Metadata extraction
├── indoor-outdoor-classifier/         # Scene classification
├── search-images-with-text-or-image/  # Image search
├── human-only-blur/                   # Privacy anonymizer
├── image-similarity-suite/            # Similarity analysis
├── object-level-image-search/         # Object search
├── LICENSE                            # MIT License
└── README.md                          # This file

Best Practices

  • Always use virtual environments for isolation
  • Cache model weights to avoid re-downloading
  • Use GPU acceleration for production workloads
  • Batch process images when possible
  • Monitor memory usage for large datasets
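
As one way to combine batching with bounded memory use, the sketch below splits a list of image paths into fixed-size chunks so that only one chunk needs to be loaded at a time. The `batched` helper and the file names are hypothetical; it is written by hand because `itertools.batched` is only available from Python 3.12, while the toolkit targets 3.9 and later.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical file names; in practice these would come from a directory listing.
paths = [f"images/img_{i:03d}.jpg" for i in range(10)]
batches = list(batched(paths, 4))

for batch in batches:
    print(len(batch), batch[0])
```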

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.


🤝 Contributing

Contributions are welcome! Areas for improvement:

  • Additional scene categories
  • New anonymization modes
  • Performance optimizations
  • Additional similarity metrics
  • Better error handling
  • Documentation improvements

📧 Contact

Madhavendranath S
Email: madhavendranaths@gmail.com
GitHub: @Madhav-000-s


🌟 Acknowledgments

Pretrained weights from:

  • LAION-2B dataset
  • COCO dataset (YOLOv8)
