Date of this Version
The abundance of alternative (non-survey) data sources, collected on a daily, hourly, and sometimes second-by-second basis, has challenged the federal statistical system to update its existing protocols for developing official statistics. Federal statistical agencies collect data primarily through survey methodologies built on frames constructed from administrative records. They compute survey weights to adjust for nonresponse and unequal sampling probabilities, impute answers for nonresponse, and report official statistics via tabulations from these surveys. The U.S. federal government has rigorously developed these methodologies since the advent of surveys, an innovation produced by the urgent desire of Congress and the President to estimate annual unemployment rates of working-age men during the Great Depression.
In the 1930s, Twitter did not exist; high-scale computing facilities were not abundant, let alone cheap; and the ease of the ether was just a storyline from the imagination of fiction writers. Today we have the technology and an abundance of data, record markers, and alternative sources, which, if curated and examined properly, can help enhance official statistics. Researchers at the Census Bureau have been experimenting with administrative records in an effort to learn how these alternative data sources can improve our understanding of official statistics. Innovative projects like these have advanced our knowledge of the limitations of survey data in estimating official statistics. This paper discusses the advances made to date in linking administrative records to survey data and summarizes the research on the impact of administrative records on official statistics.
Date Posted: 06 December 2018
This document has been peer reviewed.