Multimodal interface integrating eye gaze tracking and speech recognition

Mahajan, Onkar

Abstract Details

2015, Master of Science, University of Toledo, Engineering (Computer Science).
Currently, the most common method of interacting with a computer is through a mouse and keyboard. Human-Computer Interaction (HCI) research includes the development of interactive interfaces that go beyond the desktop Graphical User Interface (GUI) paradigm, and the provision of user-computer interfaces through gesturing, facial expression, speech, and other forms of human communication has also been the focus of intense study. Eye Gaze Tracking (EGT) is another type of human-computer interface that has proven useful in several industries, and the rapid introduction of new models by commercial EGT companies has led to more efficient and user-friendly interfaces. Unfortunately, the cost of these commercial trackers has made it difficult for them to gain popularity. In this research, a low-cost multimodal interface is developed to overcome this issue and help users adapt to new input modalities.

The system recognizes input from the eyes and from speech. The eye gaze detection module is based on Opengazer, an open-source gaze tracking application, and is responsible for determining the estimated gaze point coordinates. The images captured during calibration are grey-scaled and averaged to form a single image per calibration target; each averaged image is mapped to the position of the user's pupil and the corresponding point on the screen. These image-point pairs are then used to train a Gaussian process, which in turn estimates the gaze point. The voice recognition module detects voice commands from the user and converts them into mouse events.

The interface can be operated in two distinct modes. The first uses eye gaze as a cursor-positioning tool and voice commands to perform mouse click events. The second uses dwell-based gaze interaction, in which fixating on a point for a predetermined amount of time triggers a click event. Both modules work concurrently when using multimodal input. Several modifications were made to improve the stability and accuracy of the gaze estimate, albeit within the constraints of the open-source gaze tracker. The implementation was evaluated in terms of tracking accuracy and stability of the estimated gaze point.
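The calibration-and-estimation pipeline described in the abstract can be sketched as follows. This is a minimal illustration only, assuming scikit-learn's GaussianProcessRegressor as a stand-in for the thesis's Gaussian process; the helper names, feature choice (raw averaged pixels), and RBF kernel are hypothetical and are not the actual Opengazer-based implementation.

```python
# Sketch of Gaussian-process gaze estimation: grey-scale and average the
# calibration frames for each on-screen target, then learn a mapping from
# the averaged eye image to screen coordinates. All names are illustrative.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def average_calibration_frames(frames):
    """Grey-scale and average the frames captured for one calibration target."""
    grey = [f.mean(axis=2) for f in frames]   # naive RGB -> grey conversion
    return np.mean(grey, axis=0)

def train_gaze_model(calibration):
    """calibration: list of (frames, (screen_x, screen_y)) pairs.

    Each `frames` entry is a list of equally sized H x W x 3 eye-region images.
    """
    X, y = [], []
    for frames, screen_xy in calibration:
        avg = average_calibration_frames(frames)
        X.append(avg.ravel())                 # averaged image as feature vector
        y.append(screen_xy)
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
    gp.fit(np.array(X), np.array(y))          # multi-output: (x, y) per sample
    return gp

def estimate_gaze(gp, frame):
    """Map a new eye image to estimated screen coordinates."""
    return gp.predict(frame.mean(axis=2).ravel()[None, :])[0]
```

Likewise, the dwell-based click mode reduces to watching whether successive gaze estimates stay within a small radius for a fixed duration. A minimal sketch, assuming pyautogui for cursor movement and clicks (the thesis does not specify its event mechanism) and hypothetical threshold values; the voice-command mode would instead map recognized words such as "click" onto the same mouse events:

```python
# Sketch of the dwell-click mode: gaze positions the cursor, and a fixation
# held within RADIUS_PX for DWELL_SECONDS triggers a click. Thresholds are
# illustrative assumptions, not the thesis's tuned values.
import math
import time
import pyautogui

DWELL_SECONDS = 1.0   # hypothetical dwell threshold
RADIUS_PX = 40        # hypothetical fixation-stability radius

def dwell_loop(gaze_stream):
    """gaze_stream yields successive (x, y) gaze estimates."""
    anchor, t0 = None, None
    for x, y in gaze_stream:
        pyautogui.moveTo(x, y)                        # gaze positions cursor
        if anchor is not None and math.dist(anchor, (x, y)) <= RADIUS_PX:
            if time.monotonic() - t0 >= DWELL_SECONDS:
                pyautogui.click()                     # dwell elapsed: click
                anchor, t0 = None, None               # reset after the click
        else:
            anchor, t0 = (x, y), time.monotonic()     # new fixation candidate
```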
Jackson Carvalho, PhD (Committee Chair)
Mansoor Alam, PhD (Committee Member)
Henry Ledgard, PhD (Committee Member)
76 p.

Recommended Citations

  • Mahajan, O. (2015). Multimodal interface integrating eye gaze tracking and speech recognition [Master's thesis, University of Toledo]. OhioLINK Electronic Theses and Dissertations Center. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1430494171

    APA Style (7th edition)

  • Mahajan, Onkar. Multimodal interface integrating eye gaze tracking and speech recognition. 2015. University of Toledo, Master's thesis. OhioLINK Electronic Theses and Dissertations Center, http://rave.ohiolink.edu/etdc/view?acc_num=toledo1430494171.

    MLA Style (8th edition)

  • Mahajan, Onkar. "Multimodal interface integrating eye gaze tracking and speech recognition." Master's thesis, University of Toledo, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1430494171

    Chicago Manual of Style (17th edition)