Reconstructing the evolutionary history of natural languages

Penn collection
IRCS Technical Reports Series
Author
Warnow, Tandy
Taylor, Ann
Abstract

In this paper we present a new methodology for determining the evolutionary history of related languages. Our methodology uses linguistic information encoded as qualitative characters, so that prospective trees can be evaluated according to various optimization criteria, much as is done in the practice of inferring evolutionary history for biological species. By contrast with biology, however, we find that the linguistic data support evolutionary trees with extremely good compatibility scores, and that for such data it is possible to find optimal trees quickly. We have applied this method to the classification of Indo-European (IE) languages; we have been able to resolve one longstanding open problem (the Indo-Hittite hypothesis), and have indicated exactly what needs to be established in order to resolve another longstanding open problem (the Italo-Celtic hypothesis). We have also discovered rather surprising facts about the history of Germanic within this family. Thus, this method provides an ability to resolve difficult questions in historical linguistics that have proved resistant to traditional character-based methodologies and to the more recent distance-based approaches of lexicostatistics. The results of our methodology also indicate weaknesses in methods currently accepted and practiced in historical linguistics. One of our more important results is the ability to detect and handle loan words that are not distinguishable from true cognates by traditional methods. Finally, this methodology permits the linguist to develop and test assumptions about the evolutionary relevance of different linguistic characters.
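
The abstract does not spell out how a tree is scored, so the following is only a minimal sketch of one standard way to compute a compatibility score, not the authors' implementation. It assumes a rooted binary tree written as nested 2-tuples of leaf names and a character given as a dict from leaf name to state. A character with r distinct states is treated as compatible with a tree when its minimum number of state changes on that tree (computed here with Fitch's algorithm) equals r - 1. The leaf names, the states, and the function names `fitch`, `is_compatible`, and `compatibility_score` are all hypothetical placeholders, not data or terminology from the report.

```python
def fitch(tree, states):
    """Return (candidate_states, min_changes) for one character on a rooted binary tree.

    tree   -- a leaf name (str) or a 2-tuple of subtrees
    states -- dict mapping every leaf name to that leaf's character state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children can agree on a state, so no extra change is needed here.
        return common, left_changes + right_changes
    # The children disagree: at least one extra change occurs at this node.
    return left_set | right_set, left_changes + right_changes + 1


def is_compatible(tree, states):
    """A character is compatible when each state arises exactly once on the tree."""
    n_states = len(set(states.values()))
    _, changes = fitch(tree, states)
    return changes == n_states - 1


def compatibility_score(tree, characters):
    """Count how many characters are compatible with the tree."""
    return sum(is_compatible(tree, c) for c in characters)


# Hypothetical toy data: four placeholder languages and three characters.
tree = (("A", "B"), ("C", "D"))
characters = [
    {"A": 0, "B": 0, "C": 1, "D": 1},  # compatible: one change separates the two pairs
    {"A": 0, "B": 1, "C": 0, "D": 1},  # incompatible: needs two changes for two states
    {"A": 0, "B": 1, "C": 2, "D": 2},  # compatible: two changes for three states
]
print(compatibility_score(tree, characters))  # 2
```

Under this sketch, comparing candidate trees amounts to calling `compatibility_score` on each and preferring the tree with the highest count.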

Publication date
1995-06-01
Comments
University of Pennsylvania Institute for Research in Cognitive Science Technical Report No. IRCS-95-16.