Statistics Papers

Document Type

Journal Article

Publication Source

Journal of the American Statistical Association






Instrumental variables have been widely used to estimate the causal effect of an exposure on an outcome. Conventional estimation methods require complete knowledge of every instrument's validity: a valid instrument must have no direct effect on the outcome and must be unrelated to unmeasured confounders. This is often impractical, as Mendelian randomization studies illustrate: there, genetic markers serve as instruments, and complete knowledge of the instruments' validity is equivalent to complete knowledge of the functions of the genes involved.

In this paper, we propose a method for estimating causal effects when this complete knowledge is absent. We show that causal effects are identified and can be estimated as long as fewer than 50% of the instruments are invalid, without knowing which of the instruments are invalid. We also introduce conditions for identification when the 50% threshold is violated. A fast ℓ1-penalized estimation method, called sisVIVE, is introduced for estimating the causal effect without knowing which instruments are valid, with theoretical guarantees on its performance. The proposed method is demonstrated on simulated data and on a real Mendelian randomization study concerning the effect of body mass index on a health-related quality of life index. An R package, sisVIVE, is available on CRAN. Supplementary materials for this article are available online.
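The 50% condition can be made concrete with a small simulation. The sketch below is not the sisVIVE estimator and is not taken from the article; it is a simplified illustration under assumptions chosen here: seven mutually independent instruments of equal strength, three of which have a direct effect on the outcome, and a true causal effect of 2.0. All names and parameter values are hypothetical. Under these assumptions, each instrument's ratio estimate (its covariance with the outcome divided by its covariance with the exposure) targets the true effect plus a bias that is zero only for valid instruments, so the median of the ratios should stay close to the truth while fewer than half of the instruments are invalid, whereas their mean does not.

```python
import random
import statistics


def simulate(n=20000, beta=2.0, seed=0):
    """Simulate data with 7 independent instruments, 3 of them invalid.

    All parameter values are illustrative assumptions, not from the article.
    """
    rng = random.Random(seed)
    gamma = [1.0] * 7                              # effect of each instrument on the exposure
    alpha = [0.0, 0.0, 0.0, 0.0, 1.5, -2.0, 3.0]   # direct effects on the outcome (nonzero = invalid)
    Z, D, Y = [], [], []
    for _ in range(n):
        z = [rng.gauss(0, 1) for _ in gamma]
        u = rng.gauss(0, 1)                        # unmeasured confounder
        d = sum(g * zj for g, zj in zip(gamma, z)) + u + rng.gauss(0, 1)
        y = beta * d + sum(a * zj for a, zj in zip(alpha, z)) + u + rng.gauss(0, 1)
        Z.append(z)
        D.append(d)
        Y.append(y)
    return Z, D, Y


def sample_cov(x, y):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def ratio_estimates(Z, D, Y):
    """One ratio estimate per instrument: cov(Z_j, Y) / cov(Z_j, D)."""
    estimates = []
    for j in range(len(Z[0])):
        zj = [row[j] for row in Z]
        estimates.append(sample_cov(zj, Y) / sample_cov(zj, D))
    return estimates


Z, D, Y = simulate()
ratios = ratio_estimates(Z, D, Y)
print("per-instrument ratios:", [round(r, 2) for r in ratios])
print("median of ratios:", round(statistics.median(ratios), 2))
print("mean of ratios:  ", round(statistics.mean(ratios), 2))
```

Making a fourth instrument invalid in this setup (for example, a nonzero fourth entry in `alpha`) breaks the majority, and the median is then no longer guaranteed to fall among the valid ratios.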

Copyright/Permission Statement

This is an Accepted Manuscript of an article published by Taylor & Francis in Journal of the American Statistical Association.


Keywords

Body mass index, causal inference, health-related quality of life, instrumental variable, ℓ1 penalization, pleiotropy



Date Posted: 27 November 2017

This document has been peer reviewed.