Perils and Prospects of Using Aggregate Area Level Socioeconomic Information as a Proxy for Individual Level Socioeconomic Confounders in Instrumental Variables Regression

Penn collection
Statistics Papers
Subject
aggregation
causal inference
instrumental variables
proxy variables
Wald's grouping method
Statistics and Probability
Author
Hsu, Jesse Yenchih
Lorch, Scott A
Small, Dylan S
Abstract

A frequent concern when drawing statistical inferences about the causal effect of a policy or treatment from observational studies is the presence of unmeasured confounding variables. The instrumental variable method is an approach to estimating a causal relationship in the presence of such unmeasured confounders. A valid instrumental variable must be independent of the unmeasured confounding variables, so any confounding variable that is correlated with the instrument needs to be controlled for. In health services research, socioeconomic status variables are often regarded as confounders. In recent studies, distance to a specialty care center has been used as an instrument for the effect of specialty care vs. general care. Because this instrument may be correlated with socioeconomic status, socioeconomic status variables should be controlled for in the instrumental variables regression. However, health data sets often lack individual-level socioeconomic information and contain only area-average socioeconomic information from the US Census, e.g., average income or education level in a county. We study how the bias of the two-stage least squares estimates in instrumental variables regression is affected when an area-level variable is used to control for a confounding variable that may be correlated with the instrument. We propose an aggregated instrumental variables regression based on Wald's method of grouping, under the assumption that the grouping is independent of the errors. We present simulation results and an application to a study of perinatal care for premature infants.
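The contrast described in the abstract can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions, not the authors' simulation design: the data-generating process, the coefficient values, and the function names `tsls` and `simulate` are all hypothetical, and the "aggregated" estimator is taken here to mean two-stage least squares run on area means, which is one reading of Wald's grouping idea. It compares three estimates of the treatment effect: one controlling for the individual-level confounder (usually unavailable), one substituting the area average as a proxy, and one using area-level means of all variables.

```python
import numpy as np

def tsls(y, d, z, control):
    """Just-identified 2SLS; returns the estimated coefficient on treatment d."""
    ones = np.ones_like(y)
    X = np.column_stack([ones, d, control])  # regressors: intercept, treatment, control
    Z = np.column_stack([ones, z, control])  # instruments: intercept, instrument, control
    coef = np.linalg.solve(Z.T @ X, Z.T @ y)  # (Z'X)^{-1} Z'y
    return coef[1]

def simulate(n_areas=1000, n_per_area=50, beta=2.0, seed=0):
    """Hypothetical data-generating process (an assumption, not from the paper)."""
    rng = np.random.default_rng(seed)
    area = np.repeat(np.arange(n_areas), n_per_area)
    n = area.size

    # Individual socioeconomic status: area component plus individual component.
    x = rng.normal(size=n_areas)[area] + rng.normal(size=n)
    # Instrument (e.g. distance) correlated with individual SES, with its own
    # area-level and individual-level variation.
    z = 0.5 * x + rng.normal(size=n_areas)[area] + rng.normal(size=n)
    u = rng.normal(size=n)                      # unmeasured confounder
    d = z + x + u + rng.normal(size=n)          # treatment
    y = beta * d + x + u + rng.normal(size=n)   # outcome

    def area_mean(v):
        return np.bincount(area, weights=v) / n_per_area

    x_bar = area_mean(x)
    return {
        # Controls for individual SES (typically not available in health data).
        "individual_control": tsls(y, d, z, x),
        # Individual-level regression with the area average as a proxy control.
        "area_proxy": tsls(y, d, z, x_bar[area]),
        # All variables aggregated to area means before 2SLS.
        "aggregated": tsls(area_mean(y), area_mean(d), area_mean(z), x_bar),
    }

if __name__ == "__main__":
    for name, estimate in simulate().items():
        print(f"{name}: {estimate:.3f}")
```

In this assumed setup the instrument is correlated with the within-area part of socioeconomic status, which the area average cannot absorb, so the proxy-controlled estimate drifts away from the true effect, while the aggregated regression controls for the area mean exactly. Whether the same pattern holds in real data depends on the grouping being independent of the errors, as the abstract notes.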

Publication date
2012-06-01
Journal title
Health Services and Outcomes Research Methodology