Computationally Deriving Language-Internal Factors with Bipartite Networks
Abstract
Many sociolinguistic variables are constrained by the lexical semantics of an element in the linguistic environment surrounding the variable. One such variable is the English alternative embedded passive (AEP), also known as the ‘needs washed’ construction. The AEP has been primarily attested with three matrix verbs: need, want, and like. However, recent work has attested the construction with a much wider range of matrix verbs. This raises a relatively basic set of research questions: which matrix verbs are permitted in the AEP, and what factors constrain this? These questions, however, belie two difficulties with semantic factors: the factor levels can be fuzzy and problematic, and it can be unclear what the factor levels of a semantic constraint should even be. This paper proposes that bipartite network modeling can be used to derive language-internal factors and, in doing so, address these difficulties. I illustrate this approach by linking matrix verbs in the canonical embedded passive to the participles they select. Quantitative network metrics yield a measure of verb productivity, while a community detection algorithm groups matrix verbs by the participles they select for. I show that this latter factor clusters verbs by semantic likeness. Applying these derived factors to tens of thousands of acceptability ratings, I demonstrate that computationally derived language-internal factors can make intuitive sense, correlate significantly with linguistic data, and contribute to our understanding of a linguistic phenomenon. As such, this approach has useful applications for the study of linguistic variation beyond the AEP and beyond semantic constraints.
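The abstract does not spell out the underlying pipeline, but its core steps, building a verb-participle bipartite network, reading off a degree-based productivity measure, and clustering matrix verbs with community detection, can be sketched as below. This is a minimal illustration using networkx with invented (verb, participle) co-occurrence counts; the example verbs, the counts, and the choice of a modularity-based clustering algorithm are assumptions for illustration, not the paper's actual data or implementation.

```python
import networkx as nx
from networkx.algorithms import community

# Hypothetical (matrix verb, participle, count) tuples standing in for
# corpus counts of canonical embedded passives (e.g. "needs to be washed").
# These values are invented for illustration only.
pairs = [
    ("need", "washed", 12), ("need", "fixed", 9), ("need", "done", 7),
    ("want", "washed", 5), ("want", "done", 6),
    ("like", "done", 3),
    ("require", "fixed", 4), ("require", "completed", 8),
    ("deserve", "mentioned", 2), ("deserve", "praised", 3),
]

verbs = {v for v, _, _ in pairs}
participles = {p for _, p, _ in pairs}

# Build the bipartite network: one node set for matrix verbs,
# the other for the participles they select.
B = nx.Graph()
B.add_nodes_from(verbs, bipartite=0)
B.add_nodes_from(participles, bipartite=1)
B.add_weighted_edges_from(pairs)

# A simple productivity measure: the number of distinct participles
# each matrix verb is attested with, i.e. its degree in the network.
productivity = {v: B.degree(v) for v in verbs}

# Group matrix verbs by the participles they select, here with a
# modularity-based community detection algorithm.
communities = community.greedy_modularity_communities(B, weight="weight")
verb_clusters = [sorted(c & verbs) for c in communities]

print(productivity)
print(verb_clusters)
```

Under this sketch, the degree-based productivity score and the cluster membership of each verb would then be entered as predictors when modeling the acceptability ratings.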