Statistics Papers

Document Type

Journal Article

Date of this Version

2015

Publication Source

The Annals of Statistics

Volume

43

Issue

1

Start Page

139

Last Page

176

DOI

10.1214/14-AOS1269

Abstract

We consider the testing and estimation of change-points—locations where the distribution abruptly changes—in a data sequence. A new approach, based on scan statistics utilizing graphs representing the similarity between observations, is proposed. The graph-based approach is nonparametric, and can be applied to any data set as long as an informative similarity measure on the sample space can be defined. Accurate analytic approximations to the significance of graph-based scan statistics for both the single change-point and the changed interval alternatives are provided. Simulations reveal that the new approach has better power than existing approaches when the dimension of the data is moderate to high. The new approach is illustrated on two applications: The determination of authorship of a classic novel, and the detection of change in a network over time.
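The scan described in the abstract can be illustrated with a minimal sketch. It assumes a minimum spanning tree on Euclidean distance as the similarity graph (one possible choice; the abstract only requires an informative similarity measure) and standardizes the count of edges joining observations before and after each candidate split by its mean and variance under random permutation of the sequence. The function names `mst_edges` and `scan_statistic` and the `margin` parameter are hypothetical, and the analytic significance approximations mentioned in the abstract are not implemented here.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns n-1 edges (i, j)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


def scan_statistic(edges, n, margin=2):
    """Scan all splits t (observations 0..t-1 vs t..n-1); needs n >= 4.

    Returns (t_hat, z_max), where z is the negated standardized count of
    edges that connect the two sides. Few cross edges give a large z.
    """
    m = len(edges)
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    sum_deg_sq = sum(d * d for d in degree)

    best_t, best_z = None, -math.inf
    for t in range(margin, n - margin + 1):
        r = sum(1 for i, j in edges if (i < t) != (j < t))
        # Permutation-null probabilities that one edge crosses the split (p1)
        # and that two node-disjoint edges both cross it (p2).
        p1 = 2 * t * (n - t) / (n * (n - 1))
        p2 = (4 * t * (t - 1) * (n - t) * (n - t - 1)
              / (n * (n - 1) * (n - 2) * (n - 3)))
        mean = p1 * m
        var = p2 * m + (0.5 * p1 - p2) * sum_deg_sq + (p2 - p1 * p1) * m * m
        if var <= 0:
            continue  # degenerate graph at this split; skip it
        z = -(r - mean) / math.sqrt(var)
        if z > best_z:
            best_t, best_z = t, z
    return best_t, best_z


# Synthetic example (illustrative data only): a mean shift after observation 40.
rng = random.Random(0)
data = ([[rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]
        + [[rng.gauss(6, 1) for _ in range(5)] for _ in range(40)])
t_hat, z_max = scan_statistic(mst_edges(data), len(data))
print(t_hat, round(z_max, 2))
```

Because only the graph enters `scan_statistic`, the same scan could in principle be run on any data type for which a similarity graph can be built, such as networks or text.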

Copyright/Permission Statement

The original and published work is available at: https://projecteuclid.org/euclid.aos/1416322039#abstract

Keywords

change-point, graph-based tests, nonparametrics, scan statistic, tail probability, high-dimensional data, complex data, network data, non-Euclidean data

Date Posted: 27 November 2017

This document has been peer reviewed.