Guness, Shivanand Prabhoolall (2015) Development and Evaluation of Facial Gesture Recognition and Head Tracking for Assistive Technologies. Doctor of Philosophy (PhD) thesis, University of Kent. (KAR id:60990)
Abstract
Globally, the World Health Organisation estimates that there are about 1 billion people living with disabilities, and the UK has about 10 million people with neurological disabilities in particular. In extreme cases, individuals with disabilities such as Motor Neuron Disease (MND), Cerebral Palsy (CP) and Multiple Sclerosis (MS) may only be able to perform limited head movements, move their eyes or make facial gestures. The aim of this research is to investigate low-cost and reliable assistive devices using automatic gesture recognition systems that will enable the most severely disabled users to access electronic assistive technologies and communication devices, thus enabling them to communicate with friends and relatives.
The research presented in this thesis is concerned with the detection of head movements, eye movements and facial gestures through the analysis of video and depth images. The proposed system, using web cameras or an RGB-D sensor coupled with computer vision and pattern recognition techniques, must be able to detect the user's movement and calibrate it to facilitate communication. The system also gives the user the ability to choose both the sensor to be used (the web camera or the RGB-D sensor) and the interaction or switching mechanism (eye blink or eyebrows movement). This ability to select according to the user's needs makes the system easier on users, as they do not have to learn to operate a new system as their condition changes.
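To make these configuration choices concrete, the sketch below enumerates the sensor and gesture combinations in Python; all class and field names are illustrative assumptions for this sketch, not identifiers from the thesis.

```python
# Illustrative enumeration of the sensor/gesture configurations described
# above. Names are assumptions for this sketch, not the thesis's code.
from dataclasses import dataclass
from enum import Enum
from itertools import product

class Sensor(Enum):
    WEBCAM = "web camera"
    KINECT = "RGB-D sensor"

class Gesture(Enum):
    BLINK = "eye blink"
    EYEBROWS = "eyebrows movement"

@dataclass(frozen=True)
class DeviceConfig:
    sensor: Sensor    # drives pointer movement via head tracking
    gesture: Gesture  # switching mechanism used as the click action

# The four combinations evaluated in the thesis: Webcam-Blink,
# Webcam-Eyebrows, Kinect-Blink and Kinect-Eyebrows.
devices = [DeviceConfig(s, g) for s, g in product(Sensor, Gesture)]
```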
This research aims to explore in particular the use of depth data for head movement based assistive devices and the usability of different gesture modalities as switching mechanisms. The proposed framework consists of a facial feature detection module, a head tracking module and a gesture recognition module. Techniques such as Haar-Cascade and skin detection were used to detect facial features such as the face, eyes and nose, while the depth data from the RGB-D sensor was used to segment the area nearest to the sensor. Both the head tracking module and the gesture recognition module rely on the facial feature detection module, as it provides the locations of the facial features. The head tracking module uses this data (the centroid of the face, the distance to the sensor, and the locations of the eyes and nose) to detect head motion and translate it into pointer movement. The gesture detection module uses features such as the locations of the eyes and pupils, the size of the pupils and the interocular distance to detect blinks or eyebrows movement and perform a click action.

The research resulted in the creation of four assistive devices based on the combinations of the sensors (web camera and RGB-D sensor) and facial gestures (blink and eyebrows movement): Webcam-Blink, Webcam-Eyebrows, Kinect-Blink and Kinect-Eyebrows. Further outcomes of this research are an evaluation framework based on Fitts' Law, with a modified multi-directional task that includes a central location, and a dataset consisting of both colour images and depth data of people performing head movements in different directions and gestures such as eye blinks, eyebrows movement and mouth movements.
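As a rough, self-contained illustration of the detection pipeline described above, the following Python/OpenCV sketch detects a face and eyes with Haar cascades, derives the face centroid (the quantity driving pointer movement) and the interocular distance, and flags a naive blink cue. The cascade files, parameter values and blink heuristic are assumptions for this sketch, not the thesis's implementation.

```python
# Minimal Haar-cascade face/eye pipeline (Python + OpenCV). Parameter values
# and the blink heuristic are illustrative assumptions only.
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def detect_features(frame):
    """Return (face centroid, interocular distance, eye count) or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]                 # first detected face
    centroid = (x + w // 2, y + h // 2)   # drives pointer movement
    eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w],
                                        scaleFactor=1.1, minNeighbors=5)
    interocular = None
    if len(eyes) >= 2:
        (ex1, ey1, ew1, eh1), (ex2, ey2, ew2, eh2) = eyes[:2]
        dx = (ex1 + ew1 // 2) - (ex2 + ew2 // 2)
        dy = (ey1 + eh1 // 2) - (ey2 + eh2 // 2)
        interocular = (dx * dx + dy * dy) ** 0.5
    return centroid, interocular, len(eyes)

cap = cv2.VideoCapture(0)                  # web camera; an RGB-D sensor would
while True:                                # additionally supply a depth map for
    ok, frame = cap.read()                 # nearest-region segmentation
    if not ok:
        break
    result = detect_features(frame)
    if result is not None:
        centroid, interocular, n_eyes = result
        # Naive blink cue: both eyes vanish while the face is still detected.
        # A real system would also track pupil location and size (see above).
        blink_candidate = (n_eyes == 0)
    cv2.imshow("preview", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```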
The devices were tested with healthy participants. From the observed data, both Kinect-based devices had lower Movement Time and higher Index of Performance and Effective Throughput than the web camera-based devices, showing that the introduction of depth data had a positive impact on the head tracking algorithm. The usability assessment survey suggests that there is a significant difference in the eye fatigue experienced by participants: the blink gesture was less tiring to the eyes than the eyebrows movement gesture. The analysis of the gestures also showed that the Index of Difficulty has a large effect on gesture detection error rates: the smaller the Index of Difficulty, the higher the error rate.
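For reference, the Fitts' Law quantities reported above are conventionally computed as in this short sketch, which assumes the Shannon formulation used in ISO 9241-9 style evaluations; the thesis's modified multi-directional task may differ in detail.

```python
# Conventional Fitts' Law quantities (Shannon formulation). These are the
# standard definitions, assumed here rather than taken from the thesis.
import math

def index_of_difficulty(distance, width):
    """ID = log2(D / W + 1), in bits."""
    return math.log2(distance / width + 1)

def effective_width(endpoint_sd):
    """We = 4.133 * SD of selection endpoints (effective-width convention)."""
    return 4.133 * endpoint_sd

def throughput(distance, width, movement_time):
    """Throughput (Index of Performance) in bits/s: ID / MT."""
    return index_of_difficulty(distance, width) / movement_time

# Example: a 400 px movement to an 80 px target selected in 1.2 s.
ID = index_of_difficulty(400, 80)  # log2(6) ≈ 2.58 bits
TP = throughput(400, 80, 1.2)      # ≈ 2.15 bits/s

# Effective Throughput replaces W with the effective width We computed from
# the spread of actual selection endpoints (hypothetical SD of 25 px here).
TP_e = throughput(400, effective_width(25.0), 1.2)
```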
| Item Type: | Thesis (Doctor of Philosophy (PhD)) |
| --- | --- |
| Thesis advisor: | Deravi, Farzin |
| Thesis advisor: | Sirlantzis, Konstantinos |
| Uncontrolled keywords: | Assistive Technology; AAC; Eye Blink; Eyebrows Movement; Fitts' Law |
| Divisions: | Division of Computing, Engineering and Mathematical Sciences > School of Engineering and Digital Arts |
| Date Deposited: | 22 Mar 2017 12:00 UTC |
| Last Modified: | 11 Dec 2022 12:13 UTC |
| Resource URI: | https://kar.kent.ac.uk/id/eprint/60990 |