Augmenting Telephony Audio Data using Robust Principal Component Analysis
Mo, Ronald K. and Lam, Albert Y.S. (2020) Augmenting Telephony Audio Data using Robust Principal Component Analysis. In: IEEE Symposium Series on Computational Intelligence, Dec 1, 2020 - Dec 4, 2020, Canberra, ACT, Australia.
Item Type: Conference or Workshop Item (Paper)
Abstract
Audio augmentation (e.g., corrupting audio data with noise) has been shown to improve the performance of Automatic Speech Recognition (ASR) systems for low-resource languages. In light of this, we are interested in understanding whether corrupting speech data with telephone channel characteristics (e.g., background music, artifacts caused by down-sampling) improves the performance of ASR systems as well. In this work, we investigate the possibility of applying Sound Source Separation (SSS) approaches to capture the telephone channel characteristics. We are particularly interested in Robust Principal Component Analysis (RPCA), an unsupervised approach used for various SSS tasks. Our results show that augmenting a clean speech corpus with telephone channel characteristics yields a more robust ASR system, with a 7.8% reduction in Word Error Rate. We also find that the characteristic with the lowest spectral features improves ASR the most.
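For readers unfamiliar with RPCA, the following is a minimal sketch of the general technique, not the authors' implementation. It decomposes a matrix M into a low-rank part and a sparse part by principal component pursuit, solved here with a fixed-penalty augmented Lagrangian iteration. In source separation work the input is commonly a magnitude spectrogram, but this record does not describe the paper's exact setup, so the function names (`rpca`, `shrink`, `svd_threshold`), the default values of `lam` and `mu`, and the random stand-in matrix are all assumptions made for illustration.

```python
import numpy as np

def shrink(x, tau):
    """Element-wise soft thresholding (proximal operator of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svd_threshold(x, tau):
    """Soft-threshold the singular values (proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * shrink(s, tau)) @ vt

def rpca(m, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split m into low + sparse by principal component pursuit.

    lam and mu default to commonly used heuristics; both are assumptions
    here and may need tuning for real audio data.
    """
    m = np.asarray(m, dtype=float)
    low = np.zeros_like(m)
    sparse = np.zeros_like(m)
    norm_m = np.linalg.norm(m)
    if norm_m == 0.0:
        return low, sparse  # nothing to decompose
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))
    if mu is None:
        mu = m.size / (4.0 * np.abs(m).sum())
    y = np.zeros_like(m)  # Lagrange multiplier for the constraint m = low + sparse
    for _ in range(max_iter):
        low = svd_threshold(m - sparse + y / mu, 1.0 / mu)
        sparse = shrink(m - low + y / mu, lam / mu)
        residual = m - low - sparse
        y = y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_m:
            break
    return low, sparse

# Stand-in for a magnitude spectrogram: rank-2 background plus sparse spikes.
rng = np.random.default_rng(0)
background = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 40))
spikes = np.zeros((50, 40))
mask = rng.random((50, 40)) < 0.05
spikes[mask] = 10.0 * rng.standard_normal(mask.sum())
mixture = background + spikes

low_part, sparse_part = rpca(mixture)
```

With real audio, one would typically compute a short-time Fourier transform first, run the decomposition on its magnitude, and reuse the original phase when converting each part back to a waveform. Which of the two parts corresponds to a given telephone channel characteristic is a question the paper itself addresses.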
More Information
Depositing User: Ronald Mo
Identifiers
Item ID: 15847
Identification Number: https://doi.org/10.1109/SSCI47803.2020.9308406
URI: http://sure.sunderland.ac.uk/id/eprint/15847
Official URL: http://dx.doi.org/10.1109/SSCI47803.2020.9308406
Catalogue record
Date Deposited: 22 Mar 2023 16:02
Last Modified: 22 Mar 2023 16:02
Author: Ronald K. Mo
Author: Albert Y.S. Lam
University Divisions
Faculty of Technology > School of Computer Science
Subjects
Computing > Artificial Intelligence
Computing