
Speech-in-Noise Comprehension with DNN-Generated Talking Face
Academic Research
November 2022
Overview
This project investigates whether video of a talking face synthesized by a deep neural network (DNN) can supplement an acoustic-only speech signal to improve comprehension in noisy environments. In this human-computer interaction study, DNN-generated visual speech cues significantly improved listeners' speech comprehension in noise, a result directly relevant to evaluating noise-reduction features in hearing aids.
Key Findings
- Designed and evaluated an experiment showing that DNN-synthesized visual speech cues significantly improve speech comprehension in noise.
- Quantified the improvement in human speech comprehension across environmental noise levels and how that improvement interacted with the DNN-generated visual cues.
- Results demonstrate the potential of AI-generated visual augmentation as a tool for hearing assistance.
Publication
Shan, T., Wenner, C. E., Xu, C., Duan, Z., & Maddox, R. K. (2022). Speech-in-noise comprehension is improved when viewing a deep-neural-network-generated talking face. Trends in Hearing, 26. https://doi.org/10.1177/23312165221136934