Journal of Computational and Graphical Statistics
This article describes how a mixture of two densities, f0 and f1, may be decomposed into a different mixture consisting of three densities. These new densities, f+, f−, and f=, summarize differences between f0 and f1: f+ is high in areas where f1 is in excess of f0; f− is high in areas where f1 is deficient compared to f0; f= represents commonality between f1 and f0. The supports of f+ and f− are disjoint. This decomposition of the mixture of f0 and f1 is similar to the set-theoretic decomposition of the union of two sets A and B into the disjoint sets A\B, B\A, and A ∩ B. Sample points from f0 and f1 can be assigned to one of these three densities, allowing the differences between f0 and f1 to be visualized in a single plot that serves as a visual hypothesis test of whether f0 is equal to f1. We describe two such decompositions, similar to each other, and contrast their behavior under the null hypothesis f0 = f1, giving some insight into how such plots may be interpreted. We present two examples of uses of these methods: visualization of departures from independence, and visualization of a two-class classification problem. Other potential applications are discussed.
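As an illustration only, the following minimal sketch computes one natural decomposition that has the properties stated in the abstract; it is not claimed to be either of the two decompositions the article describes. It assumes an equal-weight mixture (f0 + f1)/2, takes f+ proportional to max(f1 − f0, 0), f− proportional to max(f0 − f1, 0), and f= proportional to min(f0, f1), and evaluates everything on a grid. The Gaussian example densities, the grid, and the names `normal_pdf` and `decompose` are hypothetical choices made for this sketch.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Gaussian density, used here only to build example inputs."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def _normalize(g, dx):
    """Return the mass of g (Riemann sum) and g rescaled to integrate to 1."""
    mass = g.sum() * dx
    density = g / mass if mass > 0 else np.zeros_like(g)
    return mass, density


def decompose(f0, f1, dx):
    """Split the equal-weight mixture (f0 + f1) / 2 into three components.

    Assumption of this sketch (not taken from the article):
      f+ is proportional to max(f1 - f0, 0)   (excess of f1 over f0)
      f- is proportional to max(f0 - f1, 0)   (deficiency of f1 relative to f0)
      f= is proportional to min(f0, f1)       (commonality)
    """
    m_plus, f_plus = _normalize(np.maximum(f1 - f0, 0.0), dx)
    m_minus, f_minus = _normalize(np.maximum(f0 - f1, 0.0), dx)
    m_equal, f_equal = _normalize(np.minimum(f0, f1), dx)
    return {
        "f_plus": f_plus, "f_minus": f_minus, "f_equal": f_equal,
        # Mixture weights: the excess and deficiency parts each carry
        # half of their mass, the common part carries all of its mass.
        "w_plus": 0.5 * m_plus, "w_minus": 0.5 * m_minus, "w_equal": m_equal,
    }


# Hypothetical example: two Gaussians evaluated on a fine grid.
x = np.linspace(-8.0, 10.0, 3601)
dx = x[1] - x[0]
f0 = normal_pdf(x, 0.0, 1.0)
f1 = normal_pdf(x, 1.0, 1.5)

parts = decompose(f0, f1, dx)

# Rebuild the mixture from the three weighted components.
recon = (parts["w_plus"] * parts["f_plus"]
         + parts["w_minus"] * parts["f_minus"]
         + parts["w_equal"] * parts["f_equal"])
weight_sum = parts["w_plus"] + parts["w_minus"] + parts["w_equal"]
```

Under these assumptions, `recon` matches (f0 + f1)/2 on the grid, the three weights sum to approximately 1, and f+ and f− are never both nonzero at the same grid point, mirroring the set decomposition into A\B, B\A, and A ∩ B.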
classification, data visualization, density estimation, exploratory data analysis, mixture decomposition
Gous, A., & Buja, A. (2004). Visual Comparison of Datasets Using Mixture Decompositions. Journal of Computational and Graphical Statistics, 13(1), 1-19. http://dx.doi.org/10.1198/1061860043119
Date Posted: 27 November 2017
This document has been peer reviewed.