📏 Visual Estimation of Real-World Product Dimensions

Estimate the physical height and width of a product from a user-taken image alone.

🎯 Research Objective

Can we predict a product’s real-world dimensions without any measurement input — using only user-taken photos?

This project explores how much physical information can be extracted from visual signals alone, especially in noisy, real-world e-commerce scenarios.

🧹 Data Collection & Cleaning

Scraped 24,000+ user-uploaded images from an e-commerce platform.
Covered 520 product categories, ~60 images per product.
Images were noisy and inconsistent, as expected from user content.

✅ A custom visual filtering pipeline was implemented to automatically retain only clean product views, replacing the need for manual curation.

🔍 Object Detection: Finding the Product

Bounding box annotations were not available, so I experimented with several zero-shot / low-shot object detection approaches:

VGG16-based Visual Outlier Detection
YOLOv8
CLIP + SAM
✅ GroundingDINO (selected: best performance)

With GroundingDINO, the pipeline isolates only the product in each photo, which significantly improved downstream predictions.
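
As a rough illustration of the isolation step, the sketch below keeps the highest-scoring detection and converts it into a pixel crop box. It assumes the detector returns boxes as normalized (cx, cy, w, h) values, which is the format GroundingDINO's inference utilities use. The function names `cxcywh_to_xyxy` and `best_crop_box` and the `min_score` threshold are hypothetical and not taken from this repository.

```python
from typing import Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower) in pixels


def cxcywh_to_xyxy(box: Sequence[float], img_w: int, img_h: int) -> Box:
    """Convert a normalized (cx, cy, w, h) box to clamped pixel corners."""
    cx, cy, w, h = box
    x0 = round((cx - w / 2) * img_w)
    y0 = round((cy - h / 2) * img_h)
    x1 = round((cx + w / 2) * img_w)
    y1 = round((cy + h / 2) * img_h)
    # Clamp to the image so a slightly oversized box still yields a valid crop.
    x0, x1 = max(0, min(img_w, x0)), max(0, min(img_w, x1))
    y0, y1 = max(0, min(img_h, y0)), max(0, min(img_h, y1))
    return x0, y0, x1, y1


def best_crop_box(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    img_w: int,
    img_h: int,
    min_score: float = 0.35,  # assumed threshold, tune per dataset
) -> Optional[Box]:
    """Return the pixel box of the top-scoring detection, or None."""
    if not boxes:
        return None
    idx = max(range(len(scores)), key=lambda i: scores[i])
    if scores[idx] < min_score:
        return None
    return cxcywh_to_xyxy(boxes[idx], img_w, img_h)
```

The returned tuple follows the (left, upper, right, lower) order expected by Pillow's `Image.crop`, so a photo with no confident detection can simply be skipped when the function returns `None`.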

📐 Feature Extraction

In addition to the raw image input, 12 visual-statistical features were computed from each segmented product, including:

Aspect ratio
Normalized area
Rectangularity
Center offset
Foreground-background contrast
and more

These structured features act as a helpful inductive bias alongside image-based learning.
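
To make a few of these features concrete, here is a minimal sketch that computes four of them from a binary segmentation mask. The exact definitions used in this project are not stated, so the formulas below (for example, rectangularity as mask area divided by bounding-box area, and center offset normalized by the image diagonal) are assumptions, and `shape_features` is a hypothetical name.

```python
import math
from typing import Dict, List


def shape_features(mask: List[List[int]]) -> Dict[str, float]:
    """Compute simple shape descriptors from a binary mask (rows of 0/1)."""
    if not mask or not mask[0]:
        raise ValueError("mask must be a non-empty 2D grid")
    img_h, img_w = len(mask), len(mask[0])

    # Rows and columns that contain at least one foreground pixel.
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for x in range(img_w) if any(row[x] for row in mask)]
    if not ys:
        raise ValueError("mask contains no foreground pixels")

    # Tight bounding box around the foreground (end coordinates exclusive).
    x0, x1 = xs[0], xs[-1] + 1
    y0, y1 = ys[0], ys[-1] + 1
    box_w, box_h = x1 - x0, y1 - y0
    area = sum(1 for row in mask for value in row if value)

    # Distance between box center and image center, relative to the diagonal.
    box_cx, box_cy = (x0 + x1) / 2, (y0 + y1) / 2
    offset = math.hypot(box_cx - img_w / 2, box_cy - img_h / 2)

    return {
        "aspect_ratio": box_w / box_h,
        "normalized_area": area / (img_w * img_h),
        "rectangularity": area / (box_w * box_h),
        "center_offset": offset / math.hypot(img_w, img_h),
    }
```

Foreground-background contrast is left out because it needs pixel intensities as well as the mask.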

🧠 Modeling: Dimension Regression with Deep Learning

Input: Cropped product image + 12D feature vector
Output: Real-world height & width (float regression)
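
One plausible way to wire up this input/output contract is to concatenate the image embedding with a small projection of the 12D feature vector before the regression head. The sketch below is an assumption about the architecture, not the repository's actual model: `DimensionRegressor`, `tiny_backbone`, and all layer sizes are hypothetical, and a tiny CNN stands in for the real backbone so the example runs without downloading weights.

```python
import torch
from torch import nn


class DimensionRegressor(nn.Module):
    """Fuse an image embedding with handcrafted features, predict (height, width)."""

    def __init__(self, backbone: nn.Module, embed_dim: int, n_features: int = 12):
        super().__init__()
        self.backbone = backbone  # must map images to (batch, embed_dim)
        self.feature_mlp = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU())
        self.head = nn.Sequential(
            nn.Linear(embed_dim + 32, 128),
            nn.ReLU(),
            nn.Linear(128, 2),  # two regression targets: height and width
        )

    def forward(self, image: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
        embedding = self.backbone(image)
        fused = torch.cat([embedding, self.feature_mlp(features)], dim=1)
        return self.head(fused)


def tiny_backbone(embed_dim: int = 16) -> nn.Module:
    """Stand-in for a pretrained backbone; returns pooled embeddings."""
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(8, embed_dim),
    )


model = DimensionRegressor(tiny_backbone(16), embed_dim=16)
predictions = model(torch.randn(4, 3, 32, 32), torch.randn(4, 12))
print(predictions.shape)  # one (height, width) pair per image
```

In a real setup the stand-in would be replaced by a pretrained backbone that returns pooled embeddings, for instance a timm model created with `num_classes=0`.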

🏗️ Models Evaluated:

ResNet50
EfficientNetB3
ConvNeXt
✅ Swin Transformer (best performer)

After identifying Swin as the top model, I applied a two-phase training strategy:

Frozen backbone: Only the regression head was trained initially.
Unfrozen fine-tuning: Full model was then fine-tuned end-to-end.

This improved stability and reduced overfitting in early training.
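
The two phases can be sketched in PyTorch by toggling `requires_grad` on the backbone and rebuilding the optimizer between phases. This is a minimal illustration under stated assumptions: `TinyRegressor` is a hypothetical stand-in for the Swin-based model, and the learning rates, loss, and step counts are placeholders rather than the values used in this project.

```python
import torch
from torch import nn


class TinyRegressor(nn.Module):
    """Hypothetical stand-in with the same backbone/head split."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16), nn.ReLU())
        self.head = nn.Linear(16, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))


def set_backbone_trainable(model: nn.Module, trainable: bool) -> None:
    """Freeze or unfreeze every backbone parameter."""
    for param in model.backbone.parameters():
        param.requires_grad = trainable


def make_optimizer(model: nn.Module, head_lr: float, backbone_lr: float):
    """Build AdamW over the head, plus the backbone if it is trainable."""
    groups = [{"params": list(model.head.parameters()), "lr": head_lr}]
    backbone_params = [p for p in model.backbone.parameters() if p.requires_grad]
    if backbone_params:
        groups.append({"params": backbone_params, "lr": backbone_lr})
    return torch.optim.AdamW(groups)


def train_steps(model, optimizer, images, targets, steps: int) -> float:
    """Run a few optimization steps and return the last loss value."""
    loss_fn = nn.SmoothL1Loss()
    model.train()
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(images), targets)
        loss.backward()
        optimizer.step()
    return loss.item()
```

Phase 1 would call `set_backbone_trainable(model, False)` and train with a head-only optimizer. Phase 2 would call `set_backbone_trainable(model, True)` and create a new optimizer, typically with a smaller backbone learning rate so the pretrained weights shift gradually.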

🔧 Why PyTorch?

Native support for research-centric models: CLIP, SAM, GroundingDINO
Dynamic computation graphs
Easier experimentation and debugging
Rapid prototyping with Hugging Face, timm, and segmentation libraries

🚀 Outcomes & Use Cases

This pipeline combines object detection, visual feature engineering, and deep regression to estimate real-world product sizes from visual signals only.

Potential Applications:

🛍️ E-commerce auto-tagging (dimensions, volume, proportions)
📦 Packaging optimization (logistics, shipping cost estimation)
📱 Mobile apps (dimension from photo, DIY tools)
🧾 Metadata generation for large-scale product databases

🔗 Try It Yourself

🧪 Live Demo (Hugging Face Space):
👉 Launch Demo

📘 Full Notebook (Kaggle):
📎 View on Kaggle

💻 Source Code (GitHub):
💾 GitHub Repository

📣 Contact

Open to collaboration and feedback!
Feel free to reach out via GitHub or connect on LinkedIn.
