Technical Report

January 1997


University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-97-01.


SABLE is a Scalable Architecture for Bilingual LExicography. It is designed to produce clean, broad-coverage translation lexicons from raw, unaligned parallel texts. Its black-box functionality makes it suitable for non-expert users. The architecture has been implemented for several language pairs and has been tested on very large and noisy input. SABLE does not rely on language-specific resources such as part-of-speech taggers, but it can take advantage of them when they are available.



Date Posted: 29 June 2007