University of Pennsylvania Working Papers in Linguistics


Michael Newman


Vowel tokens must be assigned to vowel classes before they can be plotted reliably. Consequently, investigation of variation and change in the New York City English (NYCE) low-back vowel system faces an obstacle: some words are difficult to assign to the LOT versus PALM classes, given reports of interspeaker variation between them (see, e.g., Labov, Ash and Boberg 2006) and their proximity in vowel space. This study addresses that problem by employing hierarchical cluster analysis (HCA) to propose the needed token vowel assignments. This statistical technique was applied to the low-back vowels drawn from a read-aloud task given to eleven White New Yorkers, the group for whom the greatest variation has been reported. HCA appears ideally suited to the task because it groups items by similarity defined as proximity in Euclidean distance, just as in a vowel chart. When applied to this data set, HCA provided convincing distributions of tokens for all participants. Two had intact three-vowel systems of LOT, PALM and THOUGHT (called 3-D), with all words distributed as expected from Kaye’s (2012) criteria for traditional NYCE LOT-PALM assignments; three presented weakened 3-D systems, in which some Kaye-defined PALM words had defected to LOT; and six had two-vowel systems (called MAIN), with LOT and PALM fully merged. THOUGHT remained separate for all participants. Furthermore, participants with intact 3-D, weakened 3-D, and MAIN systems tended to be ordered, respectively, from older to younger. The results therefore suggest that NYCE low-back variation is symptomatic of merger by transfer (Trudgill and Foxcroft 1978) of words from PALM to LOT. In sum, the findings provide evidence not only that this form of merger is taking place in NYCE (which lags in a larger regional pattern of low-back simplification) but also that cluster analysis can be a useful tool for determining token distribution in other cases in which variant assignment is uncertain, particularly, though potentially not only, for vowels.
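As a rough illustration of how HCA can group vowel tokens by Euclidean proximity in F1/F2 space, the following minimal sketch may be useful. It is not the procedure used in the study: the abstract does not specify a linkage method, so average linkage is assumed here, and the words, formant values (in Hz), and the function name `agglomerative_clusters` are all hypothetical and chosen only for demonstration.

```python
import math

# Hypothetical tokens: (word, F1 in Hz, F2 in Hz). These values are invented
# for illustration and are NOT data from the study.
tokens = [
    ("lot", 750, 1300), ("cot", 760, 1320), ("stop", 745, 1290),
    ("father", 740, 1150), ("spa", 735, 1140), ("calm", 745, 1160),
    ("thought", 600, 900), ("caught", 610, 920), ("law", 590, 890),
]
points = [(f1, f2) for _, f1, f2 in tokens]


def agglomerative_clusters(points, n_clusters):
    """Bottom-up clustering with average linkage on Euclidean distance.

    Assumption: average linkage is used only for this sketch.
    Returns a list of clusters, each a sorted list of indices into points.
    """
    # Start with every token in its own cluster.
    clusters = [[i] for i in range(len(points))]

    def avg_dist(a, b):
        # Mean pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_dist(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return clusters


# Cutting the hierarchy at three versus two clusters.
three = agglomerative_clusters(points, 3)
two = agglomerative_clusters(points, 2)
for label, result in (("3 clusters", three), ("2 clusters", two)):
    print(label, [[tokens[i][0] for i in c] for c in result])
```

In this toy data the three-cluster cut separates the LOT-like, PALM-like, and THOUGHT-like words, while the two-cluster cut joins the first two groups and leaves the THOUGHT-like words apart. With real data, formants would normally be normalized across speakers before clustering, and the choice of linkage and of where to cut the hierarchy would need to be justified.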