Characterizing mildly context-sensitive grammar formalisms

David Jeremy Weir, University of Pennsylvania


This thesis involves the study of formal properties of grammatical formalisms that are relevant to computational linguists. The formalisms that receive the most attention share the property that they are highly restricted in their generative power. Recent research suggests that Context-Free Grammars (CFG's) lack the necessary expressive power on which to base a linguistic theory. This has led computational linguists to consider grammatical formalisms whose generative power exceeds that of CFG's, but only to a limited extent. We compare a number of formalisms on the basis of their weak generative capacity, and suggest ways in which they can be compared on the basis of their strong generative capacity. In particular, we consider properties of their structural descriptions (or tree sets) and the types of dependencies (nested, crossed, etc.) that can be exhibited by each formalism. Several formalisms that are notationally quite different (Tree Adjoining Grammars, Head Grammars, and Linear Indexed Grammars) have been shown to be weakly equivalent. We show that Combinatory Categorial Grammars are weakly equivalent to these formalisms. The class of languages generated by these formalisms can be thought of as one step up from CFG's, and we describe a number of progressions that illustrate this. The string languages generated by TAL's, HL's, CCL's and LIL's exhibit limited crossed-serial dependencies in addition to those produced by Context-Free Grammars (nested and serial dependencies). By formalizing these crossed-serial dependencies and their relationship with the nested dependencies produced by CFG's, we define an infinite progression of formalisms. Our work on structural descriptions leads us to characterize a class of formalisms called Linear Context-Free Rewriting Systems (LCFRS's), which includes a wide range of grammatical formalisms with restricted power. The systems in this class have context-free derivations, and simple composition operations that are linear and nonerasing. We prove that all members of this family generate only semilinear languages that can be recognized in polynomial time.
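As an illustration only (not taken from the thesis), the idea of a linear, nonerasing composition operation can be sketched in Python. The sketch assumes a toy system whose derivations build pairs of strings; the function names `base`, `extend`, `finish`, and `derive_copy` are hypothetical. It generates the copy language {ww}, a standard example of crossed-serial dependencies that no Context-Free Grammar generates.

```python
from typing import Tuple

# A derived object is a pair of strings (x1, x2). All names here are
# hypothetical and chosen for illustration only.
Pair = Tuple[str, str]


def base() -> Pair:
    """Base rule: derive the pair of empty strings."""
    return ("", "")


def extend(pair: Pair, symbol: str) -> Pair:
    """Composition step: append the same symbol to both components.

    Linear: x1 and x2 are each used exactly once (neither is copied).
    Nonerasing: neither x1 nor x2 is dropped from the result.
    """
    x1, x2 = pair
    return (x1 + symbol, x2 + symbol)


def finish(pair: Pair) -> str:
    """Start rule: concatenate the two components into one string."""
    x1, x2 = pair
    return x1 + x2


def derive_copy(w: str) -> str:
    """Run one derivation; each step depends only on the pair built so far."""
    pair = base()
    for symbol in w:
        pair = extend(pair, symbol)
    return finish(pair)


if __name__ == "__main__":
    # The i-th symbol of the first half is paired with the i-th symbol of
    # the second half, so the dependencies cross rather than nest.
    print(derive_copy("ab"))
```

Because each step applies a fixed operation to the result of the previous step, the derivation itself has a context-free shape even though the string language is not context-free.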

Subject Area

Computer science|Linguistics

Recommended Citation

Weir, David Jeremy, "Characterizing mildly context-sensitive grammar formalisms" (1988). Dissertations available from ProQuest. AAI8908403.