Statistics Papers

Document Type

Journal Article

Date of this Version


Publication Source

Annals of Applied Statistics





Start Page


Last Page





Abstract

A sensitivity analysis in an observational study determines the magnitude of bias from nonrandom treatment assignment that would need to be present to alter the qualitative conclusions of a naïve analysis that presumes all biases were removed by matching or by other analytic adjustments. The power of a sensitivity analysis and the design sensitivity anticipate the outcome of a sensitivity analysis under an assumed model for the generation of the data. It is known that the power of a sensitivity analysis is affected by the choice of test statistic and, in particular, that a statistic with good Pitman efficiency in a randomized experiment, such as Wilcoxon’s signed rank statistic, may have low power in a sensitivity analysis and low design sensitivity when compared to other statistics. For instance, for an additive treatment effect and errors that are Normal, logistic, or t-distributed with 3 degrees of freedom, Brown’s combined quantile average test has Pitman efficiency close to that of Wilcoxon’s test but has higher power in a sensitivity analysis. A version of Noether’s test, by contrast, has poor Pitman efficiency in a randomized experiment but much higher design sensitivity, so it is vastly more powerful than Wilcoxon’s statistic in a sensitivity analysis if the sample size is sufficiently large. A new exact distribution-free test is proposed that rejects if either Brown’s test or Noether’s test rejects, after the two critical values are adjusted so that the overall level of the combined test remains at α, conventionally α = 0.05. In every sampling situation, the design sensitivity of this adaptive test equals the larger of the two design sensitivities of the component tests. The adaptive test exhibits good power in sensitivity analyses asymptotically and in simulations.
In one sampling situation (Normal errors and an additive effect that is three-quarters of the standard deviation, with 500 matched pairs), the power of Wilcoxon’s test in a sensitivity analysis was 2% and the power of the adaptive test was 87%. A study of treatments for ovarian cancer in the Medicare population is discussed in detail.
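
As a rough illustration of what a sensitivity analysis computes for one of the statistics named above, the following sketch bounds the one-sided p-value of Wilcoxon’s signed rank statistic for matched-pair differences. It is not the adaptive test proposed in the article. It assumes the commonly used sensitivity model in which, within a matched pair, the odds of treatment may differ by at most a factor Γ ≥ 1; the symbol Γ (the `gamma` argument), the function name `wilcoxon_sensitivity_pvalue`, the dropping of zero differences, and the large-sample Normal approximation are all assumptions made here for illustration and are not stated in the abstract.

```python
import math


def wilcoxon_sensitivity_pvalue(diffs, gamma=1.0):
    """Approximate upper bound on the one-sided p-value of Wilcoxon's
    signed rank statistic when within-pair treatment odds may differ
    by at most a factor gamma (gamma = 1 is a randomized experiment).

    Hypothetical helper for illustration; uses a Normal approximation.
    """
    if gamma < 1:
        raise ValueError("gamma must be at least 1")

    # Simplification: discard pairs with a zero difference.
    d = [x for x in diffs if x != 0]
    n = len(d)
    if n == 0:
        raise ValueError("need at least one nonzero difference")

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Observed statistic: sum of the ranks of the positive differences.
    t_obs = sum(r for r, x in zip(ranks, d) if x > 0)

    # Under the bias bound, each pair contributes its rank with
    # probability at most gamma / (1 + gamma).
    p_plus = gamma / (1.0 + gamma)
    mean = p_plus * sum(ranks)
    var = p_plus * (1.0 - p_plus) * sum(r * r for r in ranks)

    # Upper-tail probability of the standardized deviate.
    z = (t_obs - mean) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

Calling the function over a grid of `gamma` values and noting where the bound first exceeds α gives a sense of how much bias would be needed to alter the conclusion; at `gamma=1.0` the result reduces to the usual large-sample randomization p-value.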


Keywords

Brown's test, combined quantile averages, design sensitivity, Noether's test, observational study, randomization inference, sensitivity analysis, Wilcoxon's signed rank test



Date Posted: 27 November 2017

This document has been peer reviewed.