Complexity and the Induction of Tree Adjoining Grammars

Loading...
Thumbnail Image
Penn collection
IRCS Technical Reports Series
Degree type
Discipline
Subject
Funder
Grant number
License
Copyright date
Distributor
Related resources
Author
Clark, Robin
Contributor
Abstract

In this paper, I will develop the formal foundations of a theory of complexity that underlies theory of grammatical induction. The initial concern will be the learning theoretic foundations of linguistic locality. That is, I will develop a theory that will place bounds on the amount a learner can draw from an input text. These bounds will limit the amount of variation that could potentially be encoded within a parameter space. A fully developed form of the theory will place a tangible upper limit on what the learner can induce from the input text. The formal theory developed establishes a relationship between the complexity of descriptions and their likelihood; that is, the more complex a structure is, the less likely it is to occur. I will use this result to develop a theory of linguistic complexity. I will rely on this relationship to show that the results developed in the first part of the paper for the parameter setting model also hold for the inductive theory. The final sections of the paper turn to the formal specification of the learning model and a description of the linguistic theory that supports it. This section also describes a pair of heuristic constraints on the learner’s search for viable hypotheses. In general, the learner faces a computationally intractable problem in that there are exponentially many grammatical hypotheses for any input text. These constraints, the Adjunction Constraint and the Substitution Constraint, greatly reduce the number of hypotheses that the learner must consider. Furthermore, metrics on the complexity of the learner’s descriptions guarantee that the hypothesis space can be tractably searched for the adult grammar.

Advisor
Date Range for Data Collection (Start Date)
Date Range for Data Collection (End Date)
Digital Object Identifier
Series name and number
Publication date
1996-05-01
Volume number
Issue number
Publisher
Publisher DOI
Journal Issue
Comments
University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-96-14.
Recommended citation
Collection