The Distribution Of Disfluencies In Spontaneous Speech: Empirical Observations And Theoretical Implications

Zhang, Hong

Files

Zhang_upenngdas_0175C_14433.pdf (5.1 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Linguistics

Subject

Linguistics

Copyright date

2021-08-31T20:20:00-07:00

Permalink

https://repository.upenn.edu/handle/20.500.14332/31032

Abstract

This dissertation provides an empirical description of the forms of disfluencies in spontaneous speech and their distribution. Although research in this area has received much attention in the past four decades, large-scale analyses of speech corpora spanning multiple communication settings, languages, and speakers' cognitive states are still lacking. An understanding of the regularities of different kinds of disfluencies, based on large speech samples across multiple domains, is essential for both theoretical and applied purposes. As an attempt to fill this gap, this dissertation takes the approach of quantitative analysis of large corpora of spontaneous speech. The selected corpora reflect a diverse range of tasks and languages. The dissertation re-examines speech disfluency phenomena, including silent pauses, filled pauses ("um" and "uh"), and repetitions, and provides the empirical basis for future work in both theoretical and applied settings. Results from the study of silent and filled pauses indicate that a potential sociolinguistic variation can in fact be explained from the perspective of the speech planning process. The descriptive analysis of repetitions identifies a new form of repetitive phenomenon: repetitive interpolation. Both the acoustic and textual properties of repetitive interpolation are documented through rigorous quantitative analysis. The defining features of this phenomenon can further be used in designing speech-based applications such as speaker state detection. Although the goal of this descriptive analysis is not to formulate and test specific hypotheses about speech production, potential directions for future research in speech production models are proposed and evaluated. The quantitative methods employed throughout this dissertation can also be developed into interpretable features for machine learning systems that require automatic processing of spontaneous speech.

Advisor

Mark Y. Liberman

Date of degree

2020-01-01

Collection

Dissertations and Theses