Bayesian Nonparametric Analysis Of Spatial Variation With Discontinuities

Degree type
Doctor of Philosophy (PhD)
Graduate group
Statistics
Discipline
Subject
Areal data
Clustering
Crime
Multiresolution
Prior choice
Spatial Smoothing
Statistics and Probability
Copyright date
2021-08-31T20:20:00-07:00
Author
Balocchi, Cecilia
Abstract

Spatial data often display high levels of smoothness but can simultaneously present abrupt discontinuities, especially in urban environments. In this dissertation we adopt a Bayesian perspective to account for these two contrasting facts, using partitions of areal data, and we then focus on three challenges that arise in this setting. First, we consider the applied problem of modeling crime trends over time in Philadelphia, measured at a local neighborhood level. We find that spatially local shrinkage imposed by a conditional autoregressive (CAR) model has substantial benefits in terms of out-of-sample predictive accuracy for crime. We also detect spatial discontinuities between neighborhoods that represent barriers. Then, we extend our search for barriers by clustering areal data. We propose a model that induces smoothness within clusters but allows for discontinuities between them, by assuming a "CAR-within-clusters" structure. The first challenge introduced by spatial clustering is that the combinatorially vast space of partitions makes typical stochastic search techniques computationally prohibitive. We introduce an ensemble optimization procedure that summarizes the posterior by simultaneously targeting several high-probability partitions. We show on simulated data that our method achieves good estimation and partition selection performance. On the Philadelphia data we find that many recovered borders coincide with natural or man-made barriers. The second challenge consists in choosing a distribution over partitions: standard distributions for exchangeable partitions are not appropriate for spatial data. We review and compare the properties of distributions for partitions of areal data that have been proposed in the literature and introduce new ones that display favorable properties.
The third challenge relates to the problem of working with multiple granularities: fixing one resolution can be restrictive because different granularities can be appropriate for different parts of a city. We introduce a model that combines the Nested Dirichlet Process with the Hierarchical Dirichlet Process to allow for flexible partitions of multi-resolution data and sharing of information between the partitions at different resolutions. We demonstrate our method on synthetic data and on real data in West Philadelphia, where central and suburban areas seem to be better represented by higher and lower resolutions, respectively.
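
As a rough illustration of the "CAR-within-clusters" idea described in the abstract, the sketch below builds a precision matrix in which neighboring areas are linked only when they share a cluster label, so the prior smooths within clusters and leaves areas on opposite sides of a border independent. This is not the dissertation's specification: the Leroux-style parameterization, the function name `car_within_clusters_precision`, the parameters `rho` and `tau`, and the four-area toy map are all assumptions made for illustration.

```python
import numpy as np


def car_within_clusters_precision(adjacency, labels, rho=0.9, tau=1.0):
    """Hypothetical helper: Leroux-style CAR precision with cross-cluster links removed.

    adjacency : (n, n) symmetric 0/1 matrix of neighboring areas
    labels    : length-n array of cluster labels (the partition)
    rho       : spatial dependence in [0, 1); assumed parameterization
    tau       : overall precision scale; assumed parameterization
    """
    W = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    # Keep an edge only if both areas belong to the same cluster.
    same_cluster = labels[:, None] == labels[None, :]
    W_within = W * same_cluster
    # Diagonal matrix of within-cluster neighbor counts.
    D = np.diag(W_within.sum(axis=1))
    n = W.shape[0]
    # The (1 - rho) * I term keeps the matrix positive definite,
    # even for an area that has no neighbors inside its cluster.
    return tau * (rho * (D - W_within) + (1.0 - rho) * np.eye(n))


# Toy map: four areas on a line (0 - 1 - 2 - 3),
# with a cluster border between areas 1 and 2.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
labels = np.array([0, 0, 1, 1])

Q = car_within_clusters_precision(adjacency, labels)
cov = np.linalg.inv(Q)  # implied prior covariance between areas
```

In this toy example, areas 0 and 1 are positively correlated under the prior, while areas 1 and 2, although adjacent on the map, have zero prior covariance because the partition places a border between them.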

Advisor
Edward I. George
Shane T. Jensen
Date of degree
2020-01-01