## IRCS Technical Reports Series

#### Document Type

Technical Report

#### Date of this Version

May 1996

#### Abstract

Over the last ten or fifteen years there has been a shift in generative linguistics away from formalisms based on a procedural interpretation of grammars towards constraint-based formalisms—formalisms that define languages by specifying a set of constraints that characterize the set of well-formed structures analyzing the strings in the language. A natural extension of this trend is to define this set of structures model-theoretically—to define it as the set of mathematical structures that satisfy some set of logical axioms. This approach raises a number of questions about the nature of linguistic theories and the role of grammar formalisms in expressing them.

We argue here that the crux of what theories of syntax have to say about language lies in the abstract properties of the sets of structures they license. This is the level that is most directly connected to the empirical basis of these theories and it is the level at which it is possible to make meaningful comparisons between the approaches. From this point of view, grammar formalisms (or formal frameworks) are primarily means of presenting these properties. Many of the apparent distinctions between formalisms, then, may well be artifacts of their presentation rather than substantive distinctions between the properties of the structures they license. The model-theoretic approach offers a way in which to abstract away from the idiosyncrasies of these presentations.

Having said that, we must distinguish between the class of sets of structures licensed by a linguistic theory and the set of structures licensed by a specific instance of the theory, that is, by a grammar expressing that theory. Theories of syntax are not simply accounts of the structure of individual languages in isolation, but rather include assertions about the organization of the structure of human languages in general. These universal aspects of the theories present two challenges for the model-theoretic approach. First, they frequently are not properties of individual structures, but are rather properties of sets of structures. Thus, in capturing these model-theoretically one is not defining sets of structures but is rather defining classes of sets of structures; these are not first-order properties. Secondly, the universal aspects of linguistic theories are frequently not explicit, but are consequences of the nature of the formalism that embodies the theory. In capturing these one must develop an explicit axiomatic treatment of the formalism. This is both a challenge and a powerful benefit of the approach. Such re-interpretations tend to raise a variety of issues that are often overlooked in the original formalization.

In this report we examine these issues within the context of a model-theoretic reinterpretation of Generalized Phrase-Structure Grammar. While there is little current active research on GPSG, it provides an ideal laboratory for exploring these issues. First, the formalism of GPSG is expressly intended to embody a great deal of the accompanying linguistic theory. Thus it provides a variety of opportunities for examining principles expressed as restrictions on the formalism from a model-theoretic point of view. At the same time, the fact that these restrictions embody universal grammar principles provides us with a variety of opportunities to explore the way in which the linguistic theory expressed by a grammar can transcend the mathematical theory of the structures it licenses. Finally, GPSG, although defined declaratively, is a formalism with restricted generative capacity, a characteristic more typical of the earlier procedural formalisms. As such, one component of the theory it embodies is a claim about the language-theoretic complexity of natural languages. Such claims are difficult to establish for any of the constraint-based approaches to grammar. We can show, however, that the class of sets of trees that are definable within the logical language we employ in reformalizing GPSG is nearly exactly the class of sets of trees definable within the basic GPSG formalism. Thus we are able to capture the language-theoretic consequences of GPSG's restricted formalism by employing a restricted logical language.

**Date Posted:** 07 August 2006

## Comments

University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS 96-10.