Shape Detection by Packing Contours and Regions

Humans have an amazing ability to localize and recognize object shapes from natural images with various complexities, such as low contrast, overwhelming background clutter, large shape deformation, and significant occlusion. We typically recognize object shape as a whole: the entire geometric configuration of image tokens and the context they are in. Detecting shape as a global pattern involves two key issues: model representation and bottom-up grouping. A proper model captures long-range geometric constraints among image tokens. Contours or regions grouped bottom-up often appear as half-complete shapes that are easily recognizable. The main challenge arises from the representation gap between image and model: fragmented image structures usually do not correspond to semantically meaningful model parts. This thesis presents Contour Packing, a novel framework that detects shapes in a global and integral way, effectively bridging this representation gap. We first develop a grouping mechanism that organizes individual edges into long contours by encoding the Gestalt factors of proximity, continuity, collinearity, and closure in a graph. The contours are characterized by their topologically ordered 1D structures, in contrast to otherwise chaotic 2D image clutter. Used as integral shape matching units, they are powerful for preventing accidental alignment to isolated edges, dramatically reducing false shape detections in clutter. We then propose a set-to-set shape matching paradigm that measures and compares holistic shape configurations. Representing both the model and the image as sets of contours, we seek to pack a subset of image contours into a complete shape formed by model contours. The holistic configuration is captured by shape features with a large spatial extent and by the long-range contextual relationships among contours. The unique feature of this approach is its ability to overcome unpredictable contour fragmentation.
Computationally, set-to-set matching is a hard combinatorial problem. We propose a linear programming (LP) formulation for efficiently searching over exponentially many contour configurations. We also develop a primal-dual packing algorithm to quickly bound and prune solutions without actually running the LPs. Finally, we generalize set-to-set shape matching to more sophisticated structures arising from both the model and the image. On the model side, we enrich the representation by compactly encoding part configuration selection in a tree, making holistic matching applicable to articulated objects. On the image side, we extend contour packing to regions, which have a fundamentally different topology. Bipartite graph packing is designed to cope with this change. A semidefinite programming (SDP) formulation provides an efficient computational solution to this NP-hard problem, along with the flexibility to express various bottom-up grouping cues.
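To make the set-to-set idea concrete, the following is a minimal toy sketch and not the method of the thesis. It assumes each contour is summarized by an additive descriptor (a hypothetical histogram over shape-feature bins) and that a subset of image contours matches the model when the sum of its descriptors is close to the model descriptor in L1 distance. The function name `pack_contours` and all data are illustrative. The search is brute force over all subsets, which is exactly the exponential cost that the LP formulation and the primal-dual bounds described above are meant to avoid.

```python
from itertools import combinations


def l1(a, b):
    """L1 distance between two equal-length descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def pack_contours(image_contours, model_descriptor):
    """Brute-force set-to-set matching (toy illustration only).

    image_contours: list of additive descriptors, one per image contour
                    (assumed representation, not taken from the thesis).
    model_descriptor: descriptor of the complete model shape.
    Returns (indices of the best subset, L1 cost of that subset).
    """
    n = len(image_contours)
    dim = len(model_descriptor)
    # Start from the empty subset, whose summed descriptor is all zeros.
    best = ((), l1([0] * dim, model_descriptor))
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            # Descriptor of the whole subset: bin-wise sum over its contours.
            total = [sum(image_contours[i][d] for i in subset) for d in range(dim)]
            cost = l1(total, model_descriptor)
            if cost < best[1]:
                best = (subset, cost)
    return best


# Hypothetical data: contours 0 and 1 are two fragments of one model contour,
# contour 2 is another model part, contours 3 and 4 are background clutter.
model = [2, 1, 1, 0]
contours = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 3],
    [0, 2, 0, 0],
]

subset, cost = pack_contours(contours, model)
print(subset, cost)
```

In this toy data no single contour matches the model, but the set of contours 0, 1, and 2 reproduces the model descriptor exactly, which illustrates why matching sets rather than individual contours tolerates fragmentation. With n image contours the loop visits 2^n subsets, so it is only usable for very small n.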

Advisor

Jianbo Shi

Date of degree

2011-05-16

Collection

Dissertations and Theses