
50.035-CV-Project

50.035 Computer Vision Project

Sketch Image Classification with Shape-bias Computer Vision

Background

With the growing popularity of touch-interface devices, more people have begun using simple sketches to communicate emotions and ideas. Sketches are simpler than photographs, as they are often abstractions of complex real-life objects. Because sketches emphasise the overall shape of objects, they are a suitable domain for reducing texture bias and encouraging shape bias. An image classifier that can detect the subject(s) of a hand-drawn image has wide applications today, such as communication aids, novel dataset generation, or even games.

Hand-drawn sketch recognition remains a difficult task, owing to the sketches' extremely abstract and symbolic features. Furthermore, with individual variance in skill, the same object may have vastly different shapes and degrees of abstraction. The subject of drawings may also break free from the realm of reality, depicting fantastical concepts such as magic, science fiction, and monsters. This poses an interesting challenge as compared to conventional image classification, which generally seeks to identify objects from our everyday environments.

Introduction

This project aims to detect the subject of various hand-drawn images and classify them using a deep neural network approach. There have been multiple studies on sketch image classification, detailing various methods and neural network architectures. In this project, we will focus on testing and comparing the methods outlined in The Origins and Prevalence of Texture Bias in Convolutional Neural Networks by Hermann et al.

The paper suggests that many CNNs tend to classify images based on texture information rather than shape, in contrast to the shape-based approach humans use to identify images. This amplifies the effect of data augmentation, since augmentations alter the texture and shape cues that determine which features a neural network learns. The paper proposes training models that can classify ambiguous images by shape by taking less aggressive random crops during training and applying simple, naturalistic augmentations such as colour distortion and blurring.

In our implementation, we will use a combination of open-source sketch datasets, such as ImageNet-Sketch, as well as our own hand-drawn images. Links to the various datasets we may use can be found below:

  1. HaohanWang/ImageNet-Sketch: ImageNet-Sketch data set for evaluating model's ability in learning (out-of-domain) semantics at ImageNet scale (github.com)
  2. https://github.com/googlecreativelab/quickdraw-dataset
  3. http://cybertron.cg.tu-berlin.de/eitz/projects/classifysketch/

Dependencies

For the models:

python==3.6 and above

pytorch==1.10.1

opencv-python==3.4.17

For downloading the datasets:

gsutil

Downloading the Datasets and Directory Structure

For this project, since the datasets are large, please download the datasets into your project directory.

Setting up the Directory Structure

For standardization, please ensure that your directory structure is as follows:

50.035-CV-Project
    Datasets
        Cybertron
        ImageNet-Sketch
        Quick-Draw

Do take note that these datasets are quite large and will take a while to download.
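From the project root, the layout above can be created in one command (assuming a Unix-like shell):

```shell
# Create the expected Datasets folder layout from the project root
mkdir -p Datasets/Cybertron Datasets/ImageNet-Sketch Datasets/Quick-Draw
```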

Downloading ImageNet-Sketch

Download the dataset from this Google Drive or Kaggle.

Unzip the data and put it in the ImageNet-Sketch folder.

Downloading Cybertron Dataset

Download the png version of the dataset here

Unzip the file and put it in the Cybertron folder.

Downloading the Google Quick, Draw! Dataset

You will need to install gsutil to get the dataset from Google. If you are using a conda environment, run conda install -c conda-forge gsutil

To download the dataset, use the following commands:

cd Datasets
cd Quick-Draw
gsutil -m cp "gs://quickdraw_dataset/full/numpy_bitmap/*.npy" .

Do make sure to cd into the Quick-Draw folder first; otherwise you will end up with a lot of .npy files loose in the Datasets folder (not fun).
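Each downloaded .npy file stores one category as an array of flattened 784-value rows, one 28×28 greyscale bitmap per drawing. A small helper to load and reshape one file (the cat.npy path below is just an example filename):

```python
import numpy as np

def load_quickdraw_bitmaps(path):
    """Load a Quick, Draw! numpy_bitmap file and reshape each
    flattened 784-value row into a 28x28 greyscale image."""
    flat = np.load(path)  # shape: (num_drawings, 784), dtype uint8
    return flat.reshape(-1, 28, 28)

# e.g. images = load_quickdraw_bitmaps("Datasets/Quick-Draw/cat.npy")
```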

Usage

Input image size: 224×224

Useful links and resources
