
Image Caption Generation With Conditioned LSTMs

In this group research project, I built a CNN+LSTM neural network based on the top-down approach and, as a member of the research group, experimented with two variations of encoder-decoder models: Merge and Inject. We achieved strong performance by training our models on 90,000 image + reference caption examples from Google's Conceptual Captions dataset, and our model's performance approaches state-of-the-art results for the task of image captioning. A project report with a complete performance analysis can be found in the 'report' folder of this repository. Our team's repository for collaboration is here.
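The distinction between the two conditioning strategies is easiest to see in code. Below is a minimal Keras sketch, not our actual models: the feature size, vocabulary size, caption length, and layer widths are illustrative assumptions. In the Merge design the image representation joins the caption encoding after the LSTM; in the Inject design it is fed into the LSTM itself.

```python
from tensorflow.keras.layers import (Add, Concatenate, Dense, Embedding,
                                     Input, LSTM, RepeatVector)
from tensorflow.keras.models import Model

VOCAB, MAXLEN, FEAT, HID = 10_000, 20, 2048, 256  # illustrative sizes only

img = Input(shape=(FEAT,), name="cnn_features")    # precomputed CNN encoding
seq = Input(shape=(MAXLEN,), name="caption_tokens")
emb = Embedding(VOCAB, HID)(seq)

# Merge: the LSTM encodes words only; image features join after the LSTM,
# just before the word-prediction layer.
img_dense = Dense(HID, activation="relu")(img)
merged = Add()([img_dense, LSTM(HID)(emb)])
merge_model = Model([img, seq], Dense(VOCAB, activation="softmax")(merged))

# Inject (par-inject variant): image features are concatenated with the
# word embedding at every timestep, so the LSTM mixes the two modalities.
img_steps = RepeatVector(MAXLEN)(Dense(HID, activation="relu")(img))
injected = LSTM(HID)(Concatenate()([emb, img_steps]))
inject_model = Model([img, seq], Dense(VOCAB, activation="softmax")(injected))
```

The practical consequence is that Merge keeps the recurrent layer purely linguistic, while Inject lets visual information influence every decoding step.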

Project Recap

  • Created a data sourcing and preprocessing script - Madhavan
  • Performed EDA on the preprocessed image data - Mike
  • Built and trained 6 variations of the encoder-decoder CNN+LSTM model - Mike, Malavika, Madhavan
  • Analyzed the performance of each model using BLEU-4, METEOR, and ROUGE-L scoring and discussed the final performance results for each deep learning model (a scoring sketch follows this list) - Mike, Madhavan, Malavika
  • Wrote a project report capturing the project's motivation, goals, methods, and results - Mike, Madhavan, Malavika
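As one illustration of the scoring step, BLEU-4 and METEOR can be computed with NLTK roughly as follows. This is a hedged sketch, not our exact evaluation pipeline; note that ROUGE-L is not part of NLTK and needs a separate package (e.g. rouge-score).

```python
import nltk
from nltk.translate.bleu_score import SmoothingFunction, corpus_bleu
from nltk.translate.meteor_score import meteor_score

nltk.download("wordnet")  # METEOR matches synonyms via WordNet

# Toy data: for each image, a list of tokenized reference captions
# and one tokenized model hypothesis.
references = [[["a", "dog", "runs", "on", "the", "beach"]]]
hypotheses = [["a", "dog", "is", "running", "on", "a", "beach"]]

# BLEU-4: uniform weights over 1- to 4-grams; smoothing avoids zero
# scores on short captions that share no 4-gram with the references.
bleu4 = corpus_bleu(references, hypotheses,
                    weights=(0.25, 0.25, 0.25, 0.25),
                    smoothing_function=SmoothingFunction().method1)

# NLTK computes METEOR per sentence, so average over the corpus.
meteor = sum(meteor_score(refs, hyp)
             for refs, hyp in zip(references, hypotheses)) / len(hypotheses)

print(f"BLEU-4: {bleu4:.3f}  METEOR: {meteor:.3f}")
```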

Libraries & Tools

Programming Language: Python 3.7

Libraries: Keras, TensorFlow, NLTK, OpenCV, NumPy, Matplotlib, Requests, concurrent.futures
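Requests and concurrent.futures are the kind of pairing the data sourcing script relies on, since Conceptual Captions ships as a TSV of captions and image URLs rather than image files. Below is a hedged sketch of parallel image fetching; the file names, column layout, and helper names are hypothetical, not the script's actual interface.

```python
import concurrent.futures
import csv
import pathlib
import requests

OUT = pathlib.Path("images")  # hypothetical output directory
OUT.mkdir(exist_ok=True)

def fetch(idx_url):
    """Download one image; return (index, success)."""
    idx, url = idx_url
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()
        (OUT / f"{idx}.jpg").write_bytes(r.content)
        return idx, True
    except requests.RequestException:
        return idx, False  # dead links are common in Conceptual Captions

# Assumes the Conceptual Captions TSV layout: caption <TAB> image URL.
with open("train.tsv", newline="") as f:
    rows = [(i, row[1]) for i, row in enumerate(csv.reader(f, delimiter="\t"))]

# Threads suit this I/O-bound workload; tune max_workers to your bandwidth.
with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
    results = list(pool.map(fetch, rows))
```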

Project teammates:

Madhavan Seshadri and Malavika Srikanth

Credit to:

Professor Daniel Bauer

About

Implementations of Encoder-Decoder Approaches for Image Caption Generation
