H-VECTORS: Improving the robustness in utterance-level speaker embeddings using a hierarchical attention model
Shi, Yanpei, Huang, Qiang and Hain, Thomas (2021) H-VECTORS: Improving the robustness in utterance-level speaker embeddings using a hierarchical attention model. Neural Networks, 142. pp. 329-339. ISSN 0893-6080
Item Type: Article
Abstract
In this paper, a hierarchical attention network is proposed to generate robust utterance-level embeddings (H-vectors) for speaker identification and verification. Since different parts of an utterance may contribute differently to speaker identity, the hierarchical structure aims to learn speaker-related information both locally and globally. In the proposed approach, a frame-level encoder and attention are applied to segments of an input utterance to generate individual segment vectors. Segment-level attention is then applied to the segment vectors to construct an utterance representation. To evaluate the quality of the learned utterance-level speaker embeddings on speaker identification and verification, the proposed approach is tested on several benchmark datasets: the NIST SRE2008 Part1, the Switchboard Cellular (Part1), the CallHome American English Speech, the Voxceleb1 and the Voxceleb2 datasets. In comparison with some strong baselines, the obtained results show that the use of H-vectors can achieve better identification and verification performance in various acoustic conditions.
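The two-level pooling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the frame-level encoder is replaced by an identity placeholder, attention is a plain dot product against a query vector, and the names `attention_pool`, `split_into_segments`, `h_vector_sketch`, as well as the segment length and hop values, are assumptions made for illustration only. In a trained model the encoder and the query vectors would be learned.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Attention-weighted average of `vectors`.

    Weights are the softmax of dot(vector, query); `query` stands in for a
    learned attention parameter (assumption, not taken from the paper).
    """
    scores = [sum(v_d * q_d for v_d, q_d in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def split_into_segments(frames, seg_len, hop):
    """Cut a frame sequence into fixed-length, possibly overlapping segments."""
    starts = range(0, len(frames) - seg_len + 1, hop)
    segments = [frames[s:s + seg_len] for s in starts]
    # An utterance shorter than one segment is treated as a single segment.
    return segments or [frames]


def h_vector_sketch(frames, frame_query, segment_query, seg_len=20, hop=10):
    """Hierarchical pooling: frames -> segment vectors -> utterance vector.

    seg_len and hop are hypothetical values; the abstract does not give them.
    """
    encoded = frames  # identity placeholder for the frame-level encoder
    segments = split_into_segments(encoded, seg_len, hop)
    # Local level: frame-level attention inside each segment.
    segment_vectors = [attention_pool(seg, frame_query)[0] for seg in segments]
    # Global level: segment-level attention across the segment vectors.
    return attention_pool(segment_vectors, segment_query)


# Usage with random stand-in features (50 frames of dimension 4).
rng = random.Random(0)
dim = 4
frames = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(50)]
frame_query = [rng.gauss(0, 1) for _ in range(dim)]
segment_query = [rng.gauss(0, 1) for _ in range(dim)]
embedding, segment_weights = h_vector_sketch(frames, frame_query, segment_query)
```

The segment weights returned by the second pooling step indicate how much each segment contributes to the utterance vector, which is the sense in which the hierarchy lets some parts of an utterance count more than others.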
PDF
1-s2.0-S0893608021002203-main.pdf - Published Version. Available under License Creative Commons Attribution Non-commercial No Derivatives.
More Information
Uncontrolled Keywords: Speaker embeddings, Hierarchical attention, Speaker identification, Speaker verification, Attention mechanism
Depositing User: Qiang Huang
Identifiers
Item ID: 16097
Identification Number: https://doi.org/10.1016/j.neunet.2021.05.024
ISSN: 0893-6080
URI: http://sure.sunderland.ac.uk/id/eprint/16097
Official URL: https://www.sciencedirect.com/science/article/pii/...
Catalogue record
Date Deposited: 22 May 2023 11:31
Last Modified: 11 Jul 2023 08:01
Author: Yanpei Shi
Author: Qiang Huang
Author: Thomas Hain
University Divisions
Faculty of Technology > School of Computer Science
Subjects
Computing > Artificial Intelligence
Computing > Human-Computer Interaction