AN ATTENTION BASED DEEP NEURAL NETWORK FOR VISUAL QUESTION ANSWERING SYSTEM

2019, Master of Science in Software Engineering, Cleveland State University, Washkewicz College of Engineering.
With advances in internet computing and the great success of social media websites, the internet has been flooded with an enormous number of digital images. Searching for appropriate images directly through search engines and the web has become commonplace; however, automatically finding images relevant to a textual query remains a very challenging task. Visual Question Answering (VQA) has emerged as a significant multidisciplinary research problem, combining methodologies from areas such as natural language processing, image recognition, and knowledge representation. The main challenges in developing such a VQA system are dealing with the scalability of the solution and handling features of the objects in an image and of questions in natural language simultaneously. Prior work has developed VQA models by extracting and combining image features using a Convolutional Neural Network (CNN) and textual features using a Recurrent Neural Network (RNN). This thesis explores methodologies to build a Visual Question Answering (VQA) system that can automatically identify and answer a question about an image presented to it. The VQA system uses a deep Residual Network (ResNet), an advanced Convolutional Neural Network (CNN) model, for image identification, and a Long Short-Term Memory (LSTM) network, an advanced form of Recurrent Neural Network (RNN) for Natural Language Processing (NLP), to analyze the user-provided question. Finally, the features from the image and the question are combined to indicate an attention area of the image on which the deep residual network focuses to identify objects and produce a textual answer. When evaluated on the well-known, challenging COCO and VQA 1.0 datasets, the system produced an accuracy of 59%, a 12% increase over a baseline model without the attention-based technique, and the results show performance comparable to other existing state-of-the-art attention-based approaches in the literature. The quality and the accuracy of the method used in this research are compared and analyzed.
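
A minimal sketch of the architecture described above is given below, assuming PyTorch and torchvision. The layer sizes, answer vocabulary size, single attention glimpse, and ResNet-152 backbone are illustrative assumptions, not the exact configuration used in the thesis.

```python
# Sketch of an attention-based VQA model: ResNet image features + LSTM
# question encoding, fused through a soft attention map over image regions.
# All dimensions below are assumed for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet152


class AttentionVQA(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=300, hidden_dim=1024,
                 num_answers=1000, feat_dim=2048):
        super().__init__()
        # Image encoder: ResNet with the classification head removed,
        # keeping the final 7x7 grid of 2048-d region features.
        cnn = resnet152(weights=None)
        self.cnn = nn.Sequential(*list(cnn.children())[:-2])
        for p in self.cnn.parameters():
            p.requires_grad = False  # use the CNN as a fixed feature extractor

        # Question encoder: word embeddings followed by an LSTM.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

        # Attention: score each image region against the question encoding.
        self.att_img = nn.Linear(feat_dim, hidden_dim)
        self.att_q = nn.Linear(hidden_dim, hidden_dim)
        self.att_score = nn.Linear(hidden_dim, 1)

        # Classifier over a fixed answer vocabulary.
        self.classifier = nn.Sequential(
            nn.Linear(feat_dim + hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_answers),
        )

    def forward(self, image, question):
        # image: (B, 3, 224, 224); question: (B, T) of token ids
        feats = self.cnn(image)                          # (B, 2048, 7, 7)
        B, C, H, W = feats.shape
        feats = feats.view(B, C, H * W).transpose(1, 2)  # (B, 49, 2048)

        _, (h, _) = self.lstm(self.embed(question))      # h: (1, B, hidden)
        q = h.squeeze(0)                                 # (B, hidden)

        # Attention weights over the 49 image regions.
        scores = self.att_score(torch.tanh(
            self.att_img(feats) + self.att_q(q).unsqueeze(1)))  # (B, 49, 1)
        alpha = F.softmax(scores, dim=1)
        attended = (alpha * feats).sum(dim=1)            # (B, 2048)

        # Fuse attended image features with the question and predict an answer.
        return self.classifier(torch.cat([attended, q], dim=1))


if __name__ == "__main__":
    model = AttentionVQA()
    img = torch.randn(2, 3, 224, 224)
    qs = torch.randint(1, 10000, (2, 14))
    print(model(img, qs).shape)  # torch.Size([2, 1000])
```

In this style of model, the softmax over the spatial regions acts as the attention map: regions whose features align with the question encoding contribute most to the fused representation that the answer classifier sees.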
Sunnie Chung (Advisor)
Wenbing Zhao (Committee Member)
Yongjian Fu (Committee Member)

Recommended Citations

  • Popli, L. (2019). AN ATTENTION BASED DEEP NEURAL NETWORK FOR VISUAL QUESTION ANSWERING SYSTEM [Master's thesis, Cleveland State University]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=csu1579015180507068

    APA Style (7th edition)

  • Popli, Labhesh. AN ATTENTION BASED DEEP NEURAL NETWORK FOR VISUAL QUESTION ANSWERING SYSTEM. 2019. Cleveland State University, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=csu1579015180507068.

    MLA Style (8th edition)

  • Popli, Labhesh. "AN ATTENTION BASED DEEP NEURAL NETWORK FOR VISUAL QUESTION ANSWERING SYSTEM." Master's thesis, Cleveland State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=csu1579015180507068

    Chicago Manual of Style (17th edition)