Audiovisual speech recognition using VMike
Conference: Paper with proceedings at an international conference
This article presents a new electronic-retina-based smart
microphone (VMike) and investigates the use of its novel
parameters, lip profiles, in audiovisual speech recognition.
To evaluate the parameterization, both an audio-only and a
video-only speech recognition system are developed and tested.
Then, two main fusion techniques are employed to test the
usability of profiles in audiovisual systems: feature fusion
and decision fusion. These results are compared to the
performance of recognizers based on a state-of-the-art
parameterization, and also to results obtained by applying
perceptual filtering to the speech signal prior to recognition.
When feature fusion is applied under noisy conditions,
recognition using lip profiles improves by up to 13 percent
with respect to audio-only recognition.
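
The two fusion strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-word log-likelihood dictionaries, and the `audio_weight` parameter are all assumptions made for illustration. Feature fusion is shown as frame-by-frame concatenation of audio and visual feature vectors, and decision fusion as a weighted sum of the scores produced by two separate recognizers.

```python
from typing import Dict, List, Sequence


def feature_fusion(audio_frames: Sequence[Sequence[float]],
                   video_frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Early fusion: concatenate audio and visual features frame by frame.

    Assumes both streams were already resampled to the same frame rate.
    """
    if len(audio_frames) != len(video_frames):
        raise ValueError("streams must have the same number of frames")
    return [list(a) + list(v) for a, v in zip(audio_frames, video_frames)]


def decision_fusion(audio_scores: Dict[str, float],
                    video_scores: Dict[str, float],
                    audio_weight: float = 0.7) -> str:
    """Late fusion: weighted sum of per-word log-likelihoods, then argmax.

    audio_weight is a hypothetical tuning parameter in [0, 1]; a lower
    value trusts the visual stream more, e.g. under acoustic noise.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    combined = {
        word: audio_weight * audio_scores[word]
        + (1.0 - audio_weight) * video_scores[word]
        for word in audio_scores
    }
    return max(combined, key=combined.get)


if __name__ == "__main__":
    # One frame: two audio features joined with one lip-profile feature.
    print(feature_fusion([[1.0, 2.0]], [[3.0]]))
    # Hypothetical log-likelihoods from the two single-modality recognizers.
    audio = {"yes": -10.0, "no": -12.0}
    video = {"yes": -20.0, "no": -5.0}
    print(decision_fusion(audio, video, audio_weight=0.5))
```

In this sketch the fused feature vectors would feed a single recognizer, whereas decision fusion leaves the audio-only and video-only recognizers untouched and merges only their outputs.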