The place where stuff Happens

An Approach to Speech Recognition Based on Facial Electromyography

Today, communication is not only talking and listening to other people. We talk to machines to help us navigate through the streets, we ask our smartphones to find the next bus stop, and much more, all by using Automatic Speech Recognition (ASR). ASR systems are now a regular part of everyday life, used for everything from game control on consoles to app finders on mobile phones. The problem with current ASR systems is that they are not robust to noise and depend on audible speech captured by a microphone, which forces people with speech impairments to rely on other forms of communication.

With the introduction of electromyography (EMG), however, one can speak with the help of EMG-based ASR systems that rely on facial muscle movement. EMG has been around since the 1960s (Garrity 1977) and is used in several environments, from medical analysis to sports. It captures muscle movement, even very small movements, by recording myoelectric signals from any part of the body.

Several attempts at EMG-based speech recognition have been made, but such systems only handle small sets of short commands, not fluent speech. Fluent speech needs a solid foundation to build on, such as phonemes, the small units of sound that are combined to form words.

Current systems generally use a Hidden Markov Model (HMM) to decode words, sentences, or even phonemes, and such models can be trained to adapt to new words. Another way to recognize speech is K-Nearest Neighbour (KNN), a pattern recognition algorithm that can be adapted to recognize almost anything, and so can be adapted to recognize a set of samples retrieved from EMG devices.
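To make the HMM decoding step a little more concrete, here is a minimal sketch of Viterbi decoding, the standard way to find the most likely hidden-state sequence in an HMM. Everything in the toy model is an assumption made up for illustration: the two phoneme states ("b" and "a"), the two quantized EMG observations ("lips" and "jaw"), and all of the probabilities. A real system would learn these from recorded data.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    # Each column maps state -> (best log-probability so far, previous state).
    first = observations[0]
    columns = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that gives the best path into state s.
            score, prev = max(
                (columns[-1][p][0] + math.log(trans_p[p][s]), p)
                for p in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        columns.append(column)

    # Backtrack from the best final state to recover the whole path.
    state = max(states, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy model: phoneme states and quantized EMG observations.
states = ["b", "a"]
start_p = {"b": 0.6, "a": 0.4}
trans_p = {"b": {"b": 0.3, "a": 0.7}, "a": {"b": 0.4, "a": 0.6}}
emit_p = {"b": {"lips": 0.8, "jaw": 0.2}, "a": {"lips": 0.1, "jaw": 0.9}}

decoded = viterbi(["lips", "jaw", "jaw"], states, start_p, trans_p, emit_p)
print(decoded)  # ['b', 'a', 'a']
```

Working in log-probabilities avoids numerical underflow on longer sequences, but it does require every probability in the tables to be non-zero.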

Both HMM and KNN are interesting algorithms. Although they are not normally used for the same purposes, both can adapt to almost any problem. That is why it is worth investigating whether KNN is good enough to be used for speech recognition alongside HMM.
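As a rough illustration of how KNN could be applied to EMG samples, the sketch below classifies a feature vector by majority vote among its nearest labelled neighbours. The two-dimensional feature vectors (imagine one amplitude value per electrode channel), the "yes"/"no" labels, and the function name `knn_predict` are all hypothetical, chosen only to keep the example small.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    # Sort the labelled samples by Euclidean distance to the query point.
    neighbours = sorted(train, key=lambda sample: math.dist(sample[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (feature vector, spoken word) pairs.
train = [
    ((0.90, 0.20), "yes"), ((0.80, 0.30), "yes"), ((0.85, 0.25), "yes"),
    ((0.20, 0.70), "no"), ((0.10, 0.80), "no"), ((0.25, 0.75), "no"),
]

print(knn_predict(train, (0.70, 0.30)))  # yes
print(knn_predict(train, (0.15, 0.90)))  # no
```

KNN needs no training phase beyond storing the labelled samples, which makes it easy to add new words, but every prediction compares the query against the whole training set.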

EMG technology is fascinating: a machine can detect that a muscle has moved, and that data can then be used to control other machines. The benefit of EMG-based systems is that acoustic noise is not captured, so they can be used in very noisy environments such as construction sites, pyrotechnic shows, and music events. They can also help speech-impaired individuals who are able to move their facial muscles.
