Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)

international conference on learning representations, 2015.

Cited by: 842|Bibtex|Views67|Links
EI

Abstract:

In this paper, we present a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. It directly models the probability distribution of generating a word given previous words and an image. Image captions are generated by sampling from this distribution. The model consists of two sub-networks: a deep recurre...More

Code:

Data:

Your rating :
0

 

Tags
Comments