Evaluation of a silent speech interface based on magnetic sensing and deep learning for a phonetically rich vocabulary
Gonzalez, Jose A.; Cheah, Lam A.; Green, Phil D.; Gilbert, James M.; Ell, Stephen R.; Moore, Roger K.; Holdsworth, Ed
Authors
Jose A. Gonzalez
Lam A. Cheah
Phil D. Green
Professor James Gilbert J.M.Gilbert@hull.ac.uk
Professor of Engineering
Stephen R. Ell
Roger K. Moore
Ed Holdsworth
Abstract
Copyright © 2017 ISCA. To help people who have lost their voice following total laryngectomy, we present a speech restoration system that produces audible speech from articulator movement. The speech articulators are monitored by sensing changes in the magnetic field caused by movements of small magnets attached to the lips and tongue. Articulator movement is then mapped to a sequence of speech parameter vectors using a transformation learned from simultaneous recordings of speech and articulatory data. In this work, the transformation is performed by a type of recurrent neural network (RNN) with fixed latency, which makes it suitable for real-time processing. The system is evaluated on a phonetically rich database of simultaneous speech and articulatory recordings made by non-impaired subjects. Experimental results show that our RNN-based mapping yields more accurate speech reconstructions (evaluated using objective quality metrics and a listening test) than articulatory-to-acoustic mappings based on Gaussian mixture models (GMMs) or deep neural networks (DNNs). Moreover, our fixed-latency RNN architecture performs comparably to an utterance-level batch mapping using bidirectional RNNs (BiRNNs).
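The fixed-latency idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it assumes a plain single-layer tanh RNN with random, untrained weights, and the names `fixed_latency_rnn` and `lookahead`, as well as all layer sizes, are hypothetical. The point it shows is only that each output frame is emitted after a fixed number of future sensor frames has been seen, rather than after the whole utterance as in a bidirectional RNN.

```python
import numpy as np


def fixed_latency_rnn(sensor_frames, w_in, w_rec, w_out, lookahead):
    """Map sensor frames to speech-parameter frames with a fixed delay.

    The output emitted at step t describes input frame t - lookahead, so the
    network has seen `lookahead` future frames before committing to it.
    """
    num_frames = sensor_frames.shape[0]
    hidden = np.zeros(w_rec.shape[0])
    outputs = []
    for t in range(num_frames + lookahead):
        # Past the end of the utterance, pad by repeating the last frame.
        frame = sensor_frames[min(t, num_frames - 1)]
        hidden = np.tanh(w_in @ frame + w_rec @ hidden)
        if t >= lookahead:
            outputs.append(w_out @ hidden)
    return np.stack(outputs)


# Illustrative sizes only (assumptions, not values from the paper).
rng = np.random.default_rng(0)
n_sensor, n_hidden, n_speech = 9, 16, 25
w_in = rng.normal(scale=0.1, size=(n_hidden, n_sensor))
w_rec = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
w_out = rng.normal(scale=0.1, size=(n_speech, n_hidden))

frames = rng.normal(size=(50, n_sensor))
speech_params = fixed_latency_rnn(frames, w_in, w_rec, w_out, lookahead=5)
```

With `lookahead=5`, the output for frame 20 depends on sensor frames 0 to 25 and nothing later, so the delay between articulation and synthesis stays constant regardless of utterance length.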
Citation
Gonzalez, J. A., Cheah, L. A., Green, P. D., Gilbert, J. M., Ell, S. R., Moore, R. K., & Holdsworth, E. (2017). Evaluation of a silent speech interface based on magnetic sensing and deep learning for a phonetically rich vocabulary. Presented at Interspeech, Stockholm, Sweden.
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | Interspeech |
Acceptance Date | Apr 1, 2016 |
Publication Date | Jan 1, 2017 |
Deposit Date | Apr 1, 2022 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Peer Reviewed | Peer Reviewed |
Volume | 2017-August |
Pages | 3986-3990 |
DOI | https://doi.org/10.21437/Interspeech.2017-802 |
Public URL | https://hull-repository.worktribe.com/output/3592107 |