In this article, we develop and study methods for evaluating forecasters and forecasting questions in dynamic environments. These methods, based on item response models, are useful in situations where items vary in difficulty, and we wish to evaluate forecasters based on the difficulty of the items that they forecasted correctly. In addition, the methods are useful in situations where we need to compare forecasters who make predictions at different points in time or for different items. We first extend traditional models to handle subjective probabilities, and we then apply a specific model to geopolitical forecasts. We evaluate the model’s ability to accommodate the data, compare the model’s estimates of forecaster ability to estimates of forecaster ability based on scoring rules, and externally validate the model’s item estimates. We also highlight some shortcomings of the traditional models and discuss some further extensions. The analyses illustrate the models’ potential for widespread use in forecasting and subjective probability evaluation.
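To make the modeling idea concrete, the sketch below simulates one simple instance of an item response model for probability forecasts: stated probabilities are logit-transformed and treated as Gaussian around a mean driven by forecaster ability, item discrimination, and item difficulty. This is an illustrative assumption, not the authors' exact specification; the parameter names (`theta`, `a`, `b`) and the unit-variance Gaussian kernel are choices made here for clarity.

```python
# Hedged sketch: a logit-normal IRT-style model for probability forecasts.
# NOT the paper's exact model; parameters and noise structure are illustrative.
import math
import random

random.seed(1)

def loglik(theta, forecasts):
    """Log-likelihood (up to a constant) of ability theta.

    forecasts: list of (p, a, b) where p is the stated probability for an
    item that resolved "yes", a is the item discrimination, and b is the
    item difficulty.
    """
    ll = 0.0
    for p, a, b in forecasts:
        z = math.log(p / (1 - p))   # logit of the stated probability
        mu = a * (theta - b)        # expected logit under the model
        ll += -0.5 * (z - mu) ** 2  # Gaussian kernel, unit variance assumed
    return ll

# Simulate one forecaster with true ability 1.0 on 200 resolved items.
true_theta = 1.0
forecasts = []
for _ in range(200):
    a = random.uniform(0.5, 1.5)   # item discrimination
    b = random.uniform(-1.0, 1.0)  # item difficulty
    z = a * (true_theta - b) + random.gauss(0, 1)
    p = 1 / (1 + math.exp(-z))     # back-transform to a probability
    forecasts.append((p, a, b))

# Grid-search maximum-likelihood estimate of the forecaster's ability.
grid = [i / 100 for i in range(-300, 301)]
theta_hat = max(grid, key=lambda t: loglik(t, forecasts))
print(round(theta_hat, 2))  # should land near the true ability of 1.0
```

Under this toy setup the ability estimate reflects which items the forecaster judged well relative to their difficulty, which is the intuition behind using item response models rather than raw scoring rules when items vary in difficulty.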
© American Psychological Association, 2016. This paper is not the copy of record and may not exactly replicate the authoritative document published in the APA journal. Please do not copy or cite without the authors' permission. The final article is available, upon publication, at: http://dx.doi.org/10.1037/dec0000032
Keywords: forecasting, probability judgment, item response theory, scoring rules, continuous response model
Merkle, E. C., Steyvers, M., Mellers, B. A., & Tetlock, P. E. (2016). Item response models of probability judgments: Application to a geopolitical forecasting tournament. Decision, 3(1), 1-19. http://dx.doi.org/10.1037/dec0000032
Date Posted: 15 June 2018
This document has been peer reviewed.