Files
Final_Thesis.pdf (806.32 KB)
Multimodal interface integrating eye gaze tracking and speech recognition
Author
Mahajan, Onkar
Permalink:
http://rave.ohiolink.edu/etdc/view?acc_num=toledo1430494171
Year and Degree
2015, Master of Science, University of Toledo, Engineering (Computer Science).
Abstract
Currently, the most common method of interacting with a computer is through a mouse and keyboard. HCI research includes the development of interactive interfaces that go beyond the desktop Graphical User Interface (GUI) paradigm. The provision of user-computer interfaces through gesturing, facial expression, speech, and other forms of human communication has also been the focus of intense study. Eye Gaze Tracking (EGT) is another type of human-computer interface that has proven useful in several different industries, and the rapid introduction of new models by commercial EGT companies has led to more efficient and user-friendly interfaces. Unfortunately, the cost of these commercial trackers has made it difficult for them to gain popularity.

In this research, a low-cost multimodal interface is utilized to overcome this issue and help users adapt to new input modalities. The system developed recognizes input from the eyes and from speech. The eye gaze detection module is based on Opengazer, an open-source gaze tracking application, and is responsible for determining the estimated gaze point coordinates. The images captured during calibration are grey-scaled and averaged to form a single image; they are mapped relative to the position of the user's pupil and the corresponding point on the screen. These images are then used to train a Gaussian Process, which in turn is used to determine the estimated gaze point. The voice recognition module detects voice commands from the user and converts them into mouse events.

This interface can be operated in two distinct modes. The first uses eye gaze as a cursor-positioning tool and voice commands to perform mouse click events. The second uses dwell-based gaze interaction, in which fixating on a point for a predetermined amount of time triggers a click event. Both modules work concurrently when multimodal input is used.
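The Gaussian Process mapping described above can be sketched as follows. This is an illustrative reconstruction, not the thesis code: it assumes flattened eye-image feature vectors, a squared-exponential kernel, and hypothetical names (`GazeGP`, `rbf_kernel`), with one regressor trained per screen axis.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale ** 2)

class GazeGP:
    """GP regressor mapping features of an averaged, grey-scaled eye
    image to one screen coordinate (train one instance each for x and y)."""
    def __init__(self, length_scale=1.0, noise=1e-6):
        self.length_scale = length_scale
        self.noise = noise

    def fit(self, X, y):
        # X: (n_calibration_points, n_features); y: known screen coords (n,)
        self.X = X
        K = rbf_kernel(X, X, self.length_scale) + self.noise * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)  # precompute K^-1 y
        return self

    def predict(self, Xq):
        # GP posterior mean at the query features Xq
        return rbf_kernel(Xq, self.X, self.length_scale) @ self.alpha
```

In use, each calibration target contributes one (feature vector, screen coordinate) pair; at runtime the live eye-image features are fed to `predict` to obtain the estimated gaze point.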
Several modifications were made to improve the stability and accuracy of gaze estimation, albeit within the constraints of the open-source gaze tracker. The results of the multimodal implementation were measured in terms of tracking accuracy and the stability of the estimated gaze point.
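The dwell-based click mode described in the abstract can be illustrated with a small state tracker: if successive gaze samples remain within a fixation radius for the dwell threshold, a single click fires. The class and parameter names here are hypothetical, not taken from the thesis.

```python
import math

class DwellClicker:
    """Fires one click when gaze stays within `radius` pixels of the
    fixation start point for at least `dwell_time` seconds."""
    def __init__(self, radius=40.0, dwell_time=1.0):
        self.radius = radius
        self.dwell_time = dwell_time
        self.anchor = None   # (x, y) where the current fixation began
        self.start = None    # timestamp of the fixation start
        self.fired = False   # ensures at most one click per fixation

    def update(self, x, y, t):
        """Feed one gaze sample; returns True when a dwell click triggers."""
        if self.anchor is None or \
           math.hypot(x - self.anchor[0], y - self.anchor[1]) > self.radius:
            # Gaze moved outside the radius: start a new fixation here.
            self.anchor, self.start, self.fired = (x, y), t, False
            return False
        if not self.fired and t - self.start >= self.dwell_time:
            self.fired = True
            return True
        return False
```

In the interface's other mode, the same gaze samples would position the cursor while a recognized voice command, rather than the dwell timer, generates the click event.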
Committee
Jackson Carvalho, PhD (Committee Chair)
Mansoor Alam, PhD (Committee Member)
Henry Ledgard, PhD (Committee Member)
Pages
76 p.
Subject Headings
Computer Science; Engineering
Keywords
Eye Gaze Tracking, Speech Recognition, Multimodal Interface
Recommended Citations
APA Style (7th edition)
Mahajan, O. (2015). Multimodal interface integrating eye gaze tracking and speech recognition [Master's thesis, University of Toledo]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1430494171
MLA Style (8th edition)
Mahajan, Onkar. Multimodal interface integrating eye gaze tracking and speech recognition. 2015. University of Toledo, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=toledo1430494171.
Chicago Manual of Style (17th edition)
Mahajan, Onkar. "Multimodal interface integrating eye gaze tracking and speech recognition." Master's thesis, University of Toledo, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1430494171
Document number:
toledo1430494171
Download Count:
1,491
Copyright Info
© 2015, some rights reserved.
Multimodal interface integrating eye gaze tracking and speech recognition by Onkar Mahajan is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. Based on a work at etd.ohiolink.edu.
This open access ETD is published by University of Toledo and OhioLINK.