Document Type

Working Paper

Date of this Version



We identified features that drive differences in accuracy in word sense disambiguation (WSD) by building regression models on 10,000 coarse-grained WSD instances that were labeled on Amazon Mechanical Turk (MTurk). Features predictive of accuracy include properties of the target word (word frequency, part of speech, and number of possible senses), the length of the example context, and the Turker's engagement with our task. The resulting model gives insight into which words are difficult to disambiguate. We also show that having many Turkers label the same instance provides at least a partial substitute for more expensive annotation.
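
As a rough illustration of how several Turkers' labels for one instance might be combined, the sketch below uses simple majority voting. The abstract does not say how the labels were aggregated, so majority voting is an assumption here, and the instance ids, sense labels, and function names are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most common label; ties go to the label seen first."""
    if not labels:
        raise ValueError("need at least one label")
    return Counter(labels).most_common(1)[0][0]


def aggregate(annotations):
    """Map each instance id to the majority-vote label of its annotators."""
    return {
        instance: majority_vote(labels)
        for instance, labels in annotations.items()
    }


# Hypothetical toy data: instance id -> sense labels from different Turkers.
annotations = {
    "bank_001": ["finance", "finance", "river"],
    "bass_002": ["fish", "music", "fish", "fish"],
}

if __name__ == "__main__":
    print(aggregate(annotations))
```

With more annotators per instance, a single careless label is less likely to change the aggregated sense, which is one way redundant labeling could stand in for more expensive annotation.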

Included in

Business Commons



Date Posted: 28 October 2014

