The Online Adjustment Of Speaker-Specific Phonetic Beliefs In Multi-Speaker Speech Perception

Wei Lai, University of Pennsylvania


This dissertation examines how listeners' knowledge of interspeaker variability guides their generalization of perceptual learning in multi-talker listening. A series of perceptual learning experiments is conducted to evaluate whether listeners generalize what they have learned about a previous talker's production of sibilants and stop VOT to another speaker of either the same gender or a different gender. Experiments 1 and 2 find that the perceptual learning of sibilants consistently generalizes across speakers of different genders, subject to an acoustics-phonology mismatch constraint. The constraint states that perceptual learning fails to generalize if there is a mismatch between the direction of the perceptual shift intended by the raw acoustic distribution of the stimuli and the direction intended by their phonological distribution in the perceptual space. Experiment 3 reports evidence for the perceptual generalization of stop VOT across speakers of different genders. These results lend support to a cumulative update account, which holds that perceptual learning updates across speakers in such a way that previous and current perceptual learning experiences are re-integrated to form a cumulative perceptual expectation that listeners use for upcoming perception events. Building on these findings, Experiment 4 investigates the constraints of speaker identity and gender on the perceptual generalization of sibilants and stops by introducing and manipulating visual identity and voice gender cues. The results show a smaller magnitude of perceptual generalization across genders than within gender and, within gender, a smaller magnitude across speakers than within speaker. These results raise the possibility that socioindexical specificity constrains perceptual learning by modulating the magnitude of perceptual generalization across social groups rather than by blocking its occurrence.
They also suggest that listeners' knowledge of the structure in talker variability may be more fine-grained than hard-and-fast bindings to socio-demographic groups, and they lend support to a sophisticated interweaving of social information in the architecture of the phonetics-phonology mapping system.