Document Type: Technical Report

Date of this Version: August 1988


University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-88-65.


In this paper, we present a parsing strategy that arose from the development of an Earley-type parsing algorithm for TAGs (Schabes and Joshi 1988) and from some recent linguistic work in TAGs (Abeillé 1988a).

In our approach, each elementary structure is systematically associated with a lexical head. These structures specify extended domains of locality (as compared to a context-free grammar) over which constraints can be stated. These constraints either hold within the elementary structure itself or specify what other structures can be composed with a given elementary structure. The 'grammar' consists of a lexicon where each lexical item is associated with a finite number of structures for which that item is the head. There are no separate grammar rules. There are, of course, 'rules' which tell us how these structures are composed. A grammar of this form will be said to be 'lexicalized'.
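As an illustration only, a 'lexicalized' grammar in this sense can be pictured as a plain mapping from lexical items to the finite set of structures they head. The sketch below is not taken from the report: the class name `ElementaryStructure`, the structure labels, the toy words, and the helper `is_lexicalized` are all hypothetical, and the tree shape of each structure is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ElementaryStructure:
    """Hypothetical stand-in for an elementary structure (tree shape omitted)."""
    name: str  # illustrative label, not a name used in the report
    head: str  # the lexical item that heads (anchors) the structure


# Toy lexicon (assumed entries): each lexical item maps to a finite tuple of
# structures for which it is the head. There are no separate grammar rules.
LEXICON: Dict[str, Tuple[ElementaryStructure, ...]] = {
    "John": (ElementaryStructure("np_proper", "John"),),
    "sleeps": (ElementaryStructure("s_intransitive", "sleeps"),),
    "saw": (
        ElementaryStructure("s_transitive", "saw"),
        ElementaryStructure("np_common", "saw"),
    ),
}


def is_lexicalized(lexicon: Dict[str, Tuple[ElementaryStructure, ...]]) -> bool:
    """Check that every entry is non-empty and headed by its own lexical item."""
    return all(
        len(structures) > 0 and all(s.head == word for s in structures)
        for word, structures in lexicon.items()
    )
```

Under these assumptions, `is_lexicalized(LEXICON)` holds, while a lexicon that files a structure under a word other than its head fails the check.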

We show that, in general, context-free grammars cannot be 'lexicalized'. We then show how a 'lexicalized' grammar naturally follows from the extended domain of locality of TAGs and examine briefly some of the linguistic implications of our approach.

A general parsing strategy for 'lexicalized' grammars is discussed. In the first stage, the parser selects a set of elementary structures associated with the lexical items in the input sentence, and in the second stage the sentence is parsed with respect to this set. The strategy is independent of the nature of the elementary structures in the underlying grammar. However, we focus our attention on TAGs. Since the set of trees selected at the end of the first stage is finite, the parser can in principle use any search strategy. Thus, in particular, a top-down strategy can be used since problems due to recursive structures are eliminated.
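The two-stage strategy can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the report's parser: the names `TOY_LEXICON`, `select_structures`, and `parse_two_stage` are hypothetical, structures are reduced to `(name, head)` pairs, and the second stage is passed in as a callable because the strategy does not depend on how the selected structures are searched.

```python
from typing import Callable, Dict, List, Sequence, Tuple, TypeVar

# Hypothetical representation: (structure name, lexical head).
Structure = Tuple[str, str]
Result = TypeVar("Result")

# Toy lexicon with assumed entries, used only for illustration.
TOY_LEXICON: Dict[str, List[Structure]] = {
    "John": [("np_proper", "John")],
    "saw": [("s_transitive", "saw"), ("np_common", "saw")],
    "Mary": [("np_proper", "Mary")],
    "quickly": [("vp_adverb", "quickly")],
}


def select_structures(
    tokens: Sequence[str], lexicon: Dict[str, List[Structure]]
) -> List[Structure]:
    """Stage 1: collect the structures headed by words of the input sentence."""
    selected: List[Structure] = []
    for token in tokens:
        for structure in lexicon.get(token, []):
            if structure not in selected:  # keep one copy per structure
                selected.append(structure)
    return selected


def parse_two_stage(
    tokens: Sequence[str],
    lexicon: Dict[str, List[Structure]],
    parse_with: Callable[[Sequence[str], List[Structure]], Result],
) -> Result:
    """Stage 2: parse the sentence with respect to the selected set only."""
    selected = select_structures(tokens, lexicon)
    return parse_with(tokens, selected)
```

For the sentence `["John", "saw", "Mary"]`, the first stage keeps the four structures headed by those words and never considers the entry for "quickly". Because the selected set is bounded by the sentence length and the finite number of structures per word, any search procedure supplied as `parse_with` works over a finite set.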

We then explain how the Earley-type parser for TAGs can be modified to take advantage of this approach.



Date Posted: 01 November 2007