Exploiting-Expert-Knowledge-IN-DL

Image Captioning with Keras and TensorFlow

Image captioning combines LSTM-based text generation with the computer vision capabilities of a convolutional neural network. In this part, I used an LSTM and a CNN to build an image captioning system, relying on transfer learning from these two pretrained components:

InceptionV3
GloVe embeddings
I used InceptionV3 to extract features from the images and GloVe embeddings as a set of Natural Language Processing (NLP) vectors for common words.
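The sketch below shows one way the InceptionV3 feature extraction could look in Keras; the 299x299 input size follows the standard InceptionV3 preprocessing, and the example image path is an assumption rather than a file from this repository.

```python
import numpy as np
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model

# Drop the final classification layer so the model outputs the
# 2,048-dimensional feature vector just below the classifier.
base = InceptionV3(weights="imagenet")
encoder = Model(inputs=base.input, outputs=base.layers[-2].output)

def extract_features(img_path):
    """Return a (2048,) feature vector for one image."""
    img = image.load_img(img_path, target_size=(299, 299))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return encoder.predict(x, verbose=0).reshape(-1)

# Example usage (hypothetical path inside the 'images' folder):
# feats = extract_features("images/example_painting.jpg")
```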

Dataset

The SemArt dataset can be downloaded using the following link. Place all the images in a folder named 'images' in the same directory. You also need to create a 'data' folder for storing the extracted features of the SemArt training and test images. Alternatively, these features can be downloaded from the following links:
data/train
data/test
The GloVe embeddings file can be downloaded from here.
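As a rough illustration of how the GloVe file can be turned into an embedding matrix for the caption vocabulary, here is a minimal sketch; the file name 'glove.6B.200d.txt', the 200-dimensional size, and the word_index dictionary are assumptions for illustration only.

```python
import numpy as np

def build_embedding_matrix(glove_path, word_index, dim=200):
    """Map each vocabulary word to its pretrained GloVe vector."""
    vectors = {}
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")

    matrix = np.zeros((len(word_index) + 1, dim))  # index 0 reserved for padding
    for word, idx in word_index.items():
        if word in vectors:
            matrix[idx] = vectors[word]
    return matrix

# Example usage (hypothetical vocabulary):
# embedding_matrix = build_embedding_matrix("glove.6B.200d.txt",
#                                           {"painting": 1, "portrait": 2})
```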

Model

In this example, I used GloVe for the text embedding and InceptionV3 to extract features from the images. Both of these transferred models serve to extract features from the raw text and the images. InceptionV3 provides 2,048 features below the classifier, whereas MobileNet has over 50K. If the additional dimensions truly capture aspects of the images, they are worthwhile; however, 50K features increase both the processing required and the complexity of the neural network being constructed. The GloVe embeddings were passed into an LSTM, after which the image and text features were combined and sent to a decoder network that generates the next word.
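A minimal sketch of this merge-style decoder is shown below: the 2,048-dimensional InceptionV3 features and the GloVe/LSTM text features are combined and fed to a dense decoder that predicts the next word. The vocab_size, max_length, and 256-unit layer sizes are illustrative assumptions, not values taken from this repository.

```python
from tensorflow.keras.layers import Input, Dense, Dropout, Embedding, LSTM, add
from tensorflow.keras.models import Model

def build_caption_model(vocab_size, max_length, embedding_matrix, embed_dim=200):
    # Image branch: compress the 2,048-d InceptionV3 feature vector.
    img_in = Input(shape=(2048,))
    img_feat = Dense(256, activation="relu")(Dropout(0.5)(img_in))

    # Text branch: GloVe-initialised embedding followed by an LSTM.
    txt_in = Input(shape=(max_length,))
    txt_emb = Embedding(vocab_size, embed_dim, weights=[embedding_matrix],
                        trainable=False, mask_zero=True)(txt_in)
    txt_feat = LSTM(256)(Dropout(0.5)(txt_emb))

    # Combine both branches and decode the next word over the vocabulary.
    merged = add([img_feat, txt_feat])
    out = Dense(vocab_size, activation="softmax")(Dense(256, activation="relu")(merged))

    model = Model(inputs=[img_in, txt_in], outputs=out)
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model
```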

Results
