Improving audio anomalies recognition using temporal convolutional attention networks

Huang, Qiang (2021) Improving audio anomalies recognition using temporal convolutional attention networks. In: ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp. 6473-6477. ISBN 978-1-7281-7605-5

Item Type:	Book Section

Abstract

Anomalous audio in speech recordings is often caused by speaker voice distortion, external noise, or even electrical interference. These anomalies have become a serious problem in fields such as high-quality music recording and speech processing. In this paper, a novel approach using a temporal convolutional attention network (TCAN) is proposed to tackle this problem. A temporal convolutional network (TCN) can capture long-range patterns using a hierarchy of temporal convolutional filters. To enhance its ability to handle audio anomalies in different acoustic conditions, an attention mechanism is added to the TCN: a self-attention block follows each temporal convolutional layer. This aims to highlight the target-related features and mitigate interference from irrelevant information. To evaluate the performance of the proposed model, audio recordings are taken from the TIMIT dataset and modified by adding five types of audio distortion: Gaussian noise, magnitude drift, random dropout, reduction of temporal resolution, and time warping. Distortions are mixed at six signal-to-noise ratios (SNRs): 5, 10, 15, 20, 25, and 30 dB. The experimental results show that the proposed model yields good classification performance and outperforms strong baseline methods, such as LSTM- and TCN-based models, by about 3-10% relative.
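The abstract does not say how the distortions are mixed with the clean speech. As an illustration only, the sketch below assumes the common additive convention: the distortion signal is scaled so that the ratio of clean power to distortion power matches the target SNR, then added to the clean waveform. The function names `mix_at_snr` and `measured_snr_db` are hypothetical, and a synthetic sine wave stands in for a TIMIT utterance.

```python
import numpy as np


def mix_at_snr(clean, distortion, snr_db):
    """Add `distortion` to `clean`, scaled so the mix has the given SNR in dB.

    Assumption: SNR = 10 * log10(P_clean / P_distortion), with power taken
    as the mean squared amplitude over the whole signal.
    """
    clean = np.asarray(clean, dtype=np.float64)
    distortion = np.asarray(distortion, dtype=np.float64)
    p_clean = np.mean(clean ** 2)
    p_dist = np.mean(distortion ** 2)
    if p_dist == 0.0:
        raise ValueError("distortion signal has zero power")
    # Power the distortion must have to reach the requested SNR.
    target_p_dist = p_clean / (10.0 ** (snr_db / 10.0))
    scale = np.sqrt(target_p_dist / p_dist)
    return clean + scale * distortion


def measured_snr_db(clean, mixed):
    """Recover the SNR of a mix, given the clean reference signal."""
    residual = np.asarray(mixed, dtype=np.float64) - np.asarray(clean, dtype=np.float64)
    return 10.0 * np.log10(np.mean(np.square(clean)) / np.mean(residual ** 2))


# Stand-in for a one-second, 16 kHz utterance (not real TIMIT data).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
clean = np.sin(2.0 * np.pi * 440.0 * t)
noise = rng.standard_normal(clean.shape)  # Gaussian noise distortion

# The six SNR levels listed in the abstract.
for snr_db in (5, 10, 15, 20, 25, 30):
    mixed = mix_at_snr(clean, noise, snr_db)
    print(snr_db, round(measured_snr_db(clean, mixed), 2))
```

Only the Gaussian noise case maps cleanly onto an additive model. Distortions such as random dropout or time warping alter the signal itself, so the paper's definition of SNR for those cases may differ from the one assumed here.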

Full text not available from this repository.

More Information

Additional Information: Conference Proceedings: 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada

Related URLs: https://ieeexplore.ieee.org/xpl/conhome/...

Depositing User: Qiang Huang

Identifiers

Item ID: 18272

Identification Number: https://doi.org/10.1109/ICASSP39728.2021.9414611

ISBN: 978-1-7281-7605-5

URI: http://sure.sunderland.ac.uk/id/eprint/18272

Official URL: https://ieeexplore.ieee.org/document/9414611

Users with ORCIDS

ORCID for Qiang Huang:

orcid.org/0000-0002-2943-2283

Catalogue record

Date Deposited: 24 Sep 2024 14:01

Last Modified: 04 Jun 2025 15:23

Contributors

Author:

Qiang Huang

University Divisions

Faculty of Business and Technology

Subjects

Computing > Artificial Intelligence

