New Era for Robust Speech Recognition by Shinji Watanabe, Marc Delcroix, Florian Metze & John R. Hershey

Author: Shinji Watanabe, Marc Delcroix, Florian Metze & John R. Hershey
Language: eng
Format: epub
Publisher: Springer International Publishing, Cham


9.1.2.2 Feature Normalisation

The feature normalisation techniques, on the other hand, often treat the DNN as a black box and rely on independent feature-processing techniques to reduce the mismatch between training and test conditions. This allows existing feature enhancement and normalisation techniques to be reused for DNN adaptation. For example, a global constrained maximum likelihood linear regression (CMLLR) [11] transform estimated from a separate GMM/HMM system has been found to be very effective in reducing speaker variability in the acoustic features, which improves DNN-based ASR performance [58]. Furthermore, feature-based discriminative linear regression (fDLR) [58, 87], which discriminatively estimates a CMLLR-like affine transformation, has been successfully applied to unsupervised speaker adaptation of large-scale DNN systems. Feature-based vector Taylor series (VTS) [44], which uses a Gaussian mixture model to perform a nonlinear mapping that estimates clean acoustic features from noisy ones, has also been successfully applied to improve the noise robustness of DNN acoustic models [35]. In addition, when stereo data are available, advanced feature normalisation techniques based on deep learning, such as the denoising autoencoder [25, 40] and DNN-based speech enhancement [79, 80], have also been successfully applied to noisy speech recognition.
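The affine form shared by CMLLR and fDLR, x' = Ax + b applied to every feature frame before it reaches the DNN, can be illustrated with a minimal NumPy sketch. The estimator below is only a simplified stand-in: it matches per-dimension mean and variance to target statistics, which yields a diagonal A, whereas CMLLR estimates a full transform by maximum likelihood under a GMM/HMM and fDLR estimates it discriminatively. The function names `apply_affine` and `diagonal_moment_matching`, the synthetic features, and the target statistics are all assumptions made for illustration.

```python
import numpy as np


def apply_affine(features, A, b):
    """Apply x' = A x + b to each frame (row) of a (T, D) feature matrix."""
    return features @ A.T + b


def diagonal_moment_matching(features, target_mean, target_std, eps=1e-8):
    """Estimate a diagonal A and offset b that map the speaker's
    per-dimension mean/std onto the target mean/std.

    Simplified stand-in for illustration only; this is not the
    maximum-likelihood CMLLR estimate.
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    scale = target_std / (std + eps)
    A = np.diag(scale)
    b = target_mean - scale * mean
    return A, b


# Hypothetical speaker data: 500 frames of 4-dimensional features.
rng = np.random.default_rng(0)
speaker_feats = rng.normal(loc=2.0, scale=3.0, size=(500, 4))

# Map the speaker's statistics onto zero mean and unit variance.
A, b = diagonal_moment_matching(speaker_feats, np.zeros(4), np.ones(4))
normalised = apply_affine(speaker_feats, A, b)
```

The DNN itself is untouched in this setup, which is the sense in which it is treated as a black box: only the input features change, so a transform estimated per speaker or per environment can be swapped in at test time without retraining the network.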


