Statistical Analysis and Design of Crowdsourcing Applications

Kapelner, Adam

Files

Kapelner_upenngdas_0175C_11075.pdf (5.5 MB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Statistics

Subject

crowdsourcing
experimentation
machine learning
missing data
natural language processing
statistical methodology
Computer Sciences
Economics
Statistics and Probability

Copyright date

2015-11-16

Permalink

https://repository.upenn.edu/handle/20.500.14332/28125

Author

Kapelner, Adam

Abstract

This thesis develops methods for the analysis and design of crowdsourced experiments and crowdsourced labeling tasks. Much of this document focuses on applications including running natural field experiments, estimating the number of objects in images and collecting labels for word sense disambiguation. Observed shortcomings of the crowdsourced experiments inspired the development of methodology for running more powerful experiments via matching on-the-fly. Using the label data to estimate response functions inspired work on non-parametric function estimation using Bayesian Additive Regression Trees (BART). This work then inspired extensions to BART such as incorporation of missing data as well as a user-friendly R package.

Advisor

Abba Krieger
Ed George

Date of degree

2014-01-01

Collection

Dissertations and Theses