Robust Speaking Rate Estimation Using Broad Phonetic Class Recognition

Yuan, Jiahong; Liberman, Mark

Files

ROBUST_SPEAKING_RATE_ESTIMATION.pdf (214.82 KB)

Penn collection

Departmental Papers (CIS)

Subject

Speaking rate estimation
syllable detection
robustness
broad phonetic class
Computer Sciences
Linguistics

Permalink

https://repository.upenn.edu/handle/20.500.14332/6496

Author

Yuan, Jiahong

Liberman, Mark

Abstract

Robust speaking rate estimation can be useful in automatic speech recognition and speaker identification, and accurate, automatic measures of speaking rate are also relevant for research in linguistics, psychology, and social sciences. In this study we built a broad phonetic class recognizer for speaking rate estimation. We tested the recognizer on a variety of data sets, including laboratory speech, telephone conversations, foreign accented speech, and speech in different languages, and we found that the recognizer’s estimates are robust under these sources of variation. We also found that the acoustic models of the broad phonetic classes are more robust than those of the monophones for syllable detection.
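As a rough illustration of how a speaking rate could be derived from a broad phonetic class recognizer's output, the sketch below counts vowel-class segments as syllable nuclei and divides by the non-silent duration. The segment format, the class labels ("V", "S", "N", "F", "SIL"), and the function name are assumptions made for illustration; none of them are taken from the paper.

```python
from typing import List, Tuple

# Hypothetical segment format: (broad class label, start time in s, end time in s).
Segment = Tuple[str, float, float]


def speaking_rate(
    segments: List[Segment],
    vowel_label: str = "V",      # assumed label for the vowel class
    silence_label: str = "SIL",  # assumed label for silence
) -> float:
    """Return syllables per second, counting each vowel-class segment as one syllable."""
    # Each vowel-class segment is treated as one syllable nucleus (an assumption).
    syllables = sum(1 for label, _, _ in segments if label == vowel_label)
    # Only non-silent segments contribute to the duration.
    duration = sum(end - start for label, start, end in segments if label != silence_label)
    return syllables / duration if duration > 0 else 0.0


# Made-up recognizer output for a short utterance.
example = [
    ("SIL", 0.0, 0.5),
    ("S", 0.5, 0.6),
    ("V", 0.6, 0.8),
    ("N", 0.8, 0.9),
    ("V", 0.9, 1.1),
    ("F", 1.1, 1.25),
    ("V", 1.25, 1.5),
    ("SIL", 1.5, 2.0),
]

rate = speaking_rate(example)
print(f"{rate:.2f} syllables per second")
```

Excluding silence gives a rate over speech time only; including pauses in the denominator would give a rate over the whole recording instead, and which one is wanted depends on the application.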

Date of presentation

2010-03-01

Conference name

2010 International Conference on Acoustics, Speech and Signal Processing

Conference dates

2010-03-14 to 2010-03-19

Comments

Suggested Citation: Yuan, J. and M. Liberman. (2010). Robust Speaking Rate Estimation Using Broad Phonetic Class Recognition. 2010 International Conference on Acoustics, Speech and Signal Processing. Dallas, Texas. March 14-19, 2010. ©2010 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Collection

Presentations