A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Description

Nina, Oliver A.

Abstract Details

2018, Doctor of Philosophy, Ohio State University, Electrical and Computer Engineering.
Learning visual feature representations for video analysis is non-trivial: it requires a large number of training samples and a proper generalization framework. Many current state-of-the-art methods for video captioning and movie description rely on simple encoding mechanisms, using recurrent neural networks to encode temporal visual information extracted from video data. We introduce a novel multitask encoder-n-decoder framework for automatic semantic description and captioning of video sequences. In contrast to current approaches, at training time our method relies on multiple distinct decoders to train a visual encoder in a multitask fashion. Our method shows improved performance over current state-of-the-art methods on several metrics on both multi-caption and single-caption datasets, and it is the first to use a multitask approach for encoding video features. Furthermore, based on human subject evaluations, our method was ranked as the most helpful algorithm for the visually impaired, finishing in first place in the movie captioning task of the Large Scale Movie Description Challenge (LSMDC), held in conjunction with the International Conference on Computer Vision (ICCV) 2017. Our method won the competition task among other top participating research groups worldwide and is currently the state of the art in automatic commercial movie description.
Alper Yilmaz (Committee Member)
Rongjun Qin (Committee Member)
Kevin Passino (Committee Member)
128 p.
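
The abstract's central idea, one shared encoder trained through several distinct decoders, can be illustrated with a toy sketch. This is not the dissertation's implementation: the linear layers, squared-error loss, and every name below (`encode`, `decode`, `multitask_step`, `train`, and the numeric values) are assumptions chosen for brevity, whereas the actual work applies recurrent networks to video features. The sketch only shows how the error signals from all decoders accumulate in the shared encoder's update.

```python
def encode(w_enc, x):
    """Shared encoder: linear map from an input vector to a hidden vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in w_enc]


def decode(w_dec, h):
    """One task-specific decoder: linear map from the hidden vector to a scalar."""
    return sum(w * hi for w, hi in zip(w_dec, h))


def multitask_step(w_enc, decoders, x, targets, lr=0.01):
    """One gradient step on the summed squared error of all decoders.

    Returns the loss measured before the update.
    """
    h = encode(w_enc, x)
    errors = [decode(w_dec, h) - t for w_dec, t in zip(decoders, targets)]
    loss = sum(e * e for e in errors)

    # Gradient w.r.t. the hidden vector: contributions from every decoder add up.
    grad_h = [
        sum(2 * e * w_dec[j] for e, w_dec in zip(errors, decoders))
        for j in range(len(h))
    ]

    # Each decoder is updated from its own task error only.
    for w_dec, e in zip(decoders, errors):
        for j in range(len(h)):
            w_dec[j] -= lr * 2 * e * h[j]

    # The shared encoder is updated from the accumulated multitask gradient.
    for j, row in enumerate(w_enc):
        for i in range(len(x)):
            row[i] -= lr * grad_h[j] * x[i]

    return loss


def train(steps=200):
    """Train on a single hypothetical sample with two tasks; return the loss history."""
    w_enc = [[0.1, -0.2, 0.3], [0.2, 0.1, -0.1]]   # hypothetical initial weights
    decoders = [[0.5, -0.3], [-0.2, 0.4]]          # two hypothetical decoders
    x = [1.0, 0.5, -1.0]                           # hypothetical input features
    targets = [1.0, -1.0]                          # one target per decoder
    return [multitask_step(w_enc, decoders, x, targets) for _ in range(steps)]


if __name__ == "__main__":
    history = train()
    print(f"first loss: {history[0]:.4f}, last loss: {history[-1]:.4f}")
```

In this toy setup the decoders exist only to shape the encoder during training, which mirrors the abstract's statement that multiple decoders are used at training time.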

Recommended Citations

  • Nina, O. A. (2018). A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Description [Doctoral dissertation, Ohio State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=osu1531996548147165

    APA Style (7th edition)

  • Nina, Oliver. A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Description. 2018. Ohio State University, Doctoral dissertation. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=osu1531996548147165.

    MLA Style (8th edition)

  • Nina, Oliver. "A Multitask Learning Encoder-N-Decoder Framework for Movie and Video Description." Doctoral dissertation, Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1531996548147165

    Chicago Manual of Style (17th edition)