In this thesis we propose a paradigm for video-based interaction in virtual training environments; explore and develop image recognition techniques to estimate the positions of the user’s head and hands; and use these data to control a virtual 3D avatar.
Moreover, we aim to develop a cost-efficient system built only from cross-platform, freely available software components, with interaction established via common hardware (a computer and a monocular webcam) and an ordinary internet connection. In this way we seek to increase the accessibility and cost efficiency of the system and to avoid expensive instruments for essential interaction in virtual space.
The results of this work are the following: a method for estimating hand positions; proposed design solutions for the system; proofs of concept based on two test cases (“Ball game” and “Chemistry lab”); and a discussion of the advantages, limitations, and possible directions for future research on video-based interaction in virtual environments.