Shi, Jianbo

View all metadata

Search Results

Now showing 1 - 10 of 26

Detecting Unusual Activity in Video
(2004-06-27) Zhong, Hua; Shi, Jianbo; Visontai, Mirko
We present an unsupervised technique for detecting unusual activity in a large video set using many simple features. No complex activity models and no supervised feature selections are used. We divide the video into equal length segments and classify the extracted features into prototypes, from which a prototype–segment co-occurrence matrix is computed. Motivated by a similar problem in document-keyword analysis, we seek a correspondence relationship between prototypes and video segments which satisfies the transitive closure constraint. We show that an important sub-family of correspondence functions can be reduced to co-embedding prototypes and segments to N-D Euclidean space. We prove that an efficient, globally optimal algorithm exists for the co-embedding problem. Experiments on various real-life videos have validated our approach.
Tracking by Planning
(2011-01-01) Gong, Haifeng; Sim, Jiwoong; Likhachev, Maxim; Shi, Jianbo
We introduce a method for tracking multiple people in a cluttered street scene. We use global context to address the challenge of long occlusion by endowing each tracked object with a planning agent. This planner uses context of the street scene, people and other moving objects to reason about pedestrian intended behavior for tracking under occlusion and ambiguity. We extract short but robust trajectories called tracklets by tracking people with a simple appearance model. We formulate the tracking problem as a batch mode optimization, linking tracklets into paths, each with supporting evidence by an agent’s goal directed behavior, and image partial matching along the trajectory gap. We propose a global criteria for consistent linking of the tracklet with planning that can correct local ambiguity in linking. We test our algorithm in a challenging real world setting, where we automatically estimate scene context and intended goals, then track multiple people from a moving camera.
Silhouette-based Human Identification from Body Shape and Gait
(2002-05-20) Collins, Robert T; Gross, Ralph; Shi, Jianbo
Our goal is to establish a simple baseline method for human identification based on body shape and gait. This baseline recognition method provides a lower bound against which to evaluate more complicated procedures. We present a viewpoint dependent technique based on template matching of body silhouettes. Cyclic gait analysis is performed to extract key frames from a test sequence. These frames are compared to training frames using normalized correlation, and subject classification is performed by nearest neighbor matching among correlation scores. The approach implicitly captures biometric shape cues such as body height, width, and body-part proportions, as well as gait cues such as stride length and amount of arm swing. We evaluate the method on four databases with varying viewing angles, background conditions (indoors and outdoors), walk styles and pixels on target.
Conditional Entropies as Over-Segmentation and Under-Segmentation Metrics for Multi-Part Image Segmentation
(2011-01-01) Gong, Haifeng; Shi, Jianbo
In this paper, we define two conditional entropy measures for performance evaluation of general image segmentation. Given a segmentation label map and a ground truth label map, our measures describe their compatibility in two ways. The first one is the conditional entropy of the segmentation given the ground truth, which indicates the oversegmentation rate. The second one is that of the ground truth given the segmentation, which indicates the under-segmentation rate. The two conditional entropies indicate the trade-off between smaller and larger granularities like false positive rate and false negative rate in ROC, and precision and recall in PR curve. Our measures are easy to implement, and involve no threshold or other parameter, have very intuitive explanation and many good theoretical properties, e.g., good bounds, monotonicity, continuity. Experiments show that our measures work well on Berkeley Image Segmentation Benchmark using three segmentation algorithms, Efficient Graph- Based segmentation, Mean Shift and Normalized Cut. We also give an asymmetric similarity measure based on the two entropies and compared it with Variation of Information. The comparison revealled that our method has advantages in many situations.We also checked the coarse-to-fine compatibility of segmentation results with changing parameters and ground truths from different annotators.
Object Detection Combining Recognition and Segmentation
(2007-01-01) Wang, Liming; Shi, Jianbo; Song, Gang; Shang, I-fan
We develop an object detection method combining top-down recognition with bottom-up image segmentation. There are two main steps in this method: a hypothesis generation step and a verification step. In the top-down hypothesis generation step, we design an improved Shape Context feature, which is more robust to object deformation and background clutter. The improved Shape Context is used to generate a set of hypotheses of object locations and figureground masks, which have high recall and low precision rate. In the verification step, we first compute a set of feasible segmentations that are consistent with top-down object hypotheses, then we propose a False Positive Pruning(FPP) procedure to prune out false positives. We exploit the fact that false positive regions typically do not align with any feasible image segmentation. Experiments show that this simple framework is capable of achieving both high recall and high precision with only a few positive training examples and that this method can be generalized to many object classes.
Discrimintive Image Warping with Attribute Flow
(2011-01-01) Zhang, Weiyu; Srinivasan, Praveen; Shi, Jianbo
We address the problem of finding deformation between two images for the purpose of recognizing objects. The challenge is that discriminative features are often transformation-variant (e.g. histogram of oriented gradients, texture), while transformation-invariant features (e.g. intensity, color) are often not discriminative. We introduce the concept of attribute flow which explicitly models how image attributes vary with its deformation. We develop a non-parametric method to approximate this using histogram matching, which can be solved efficiently using linear programming. Our method produces dense correspondence between images, and utilizes discriminative, transformation-variant features for simultaneous detection and alignment. Experiments on ETHZ shape categories dataset show that we can accurately recognize highly deformable objects with few training examples.
Untangling Cycles for Contour Grouping
(2007-01-01) Zhu, Qihui; Song, Gang; Shi, Jianbo
We introduce a novel topological formulation for contour grouping. Our grouping criterion, called untangling cycles, exploits the inherent topological 1D structure of salient contours to extract them from the otherwise 2D image clutter. To define a measure for topological classification robust to clutter and broken edges, we use a graph formulation instead of the standard computational topology. The key insight is that a pronounced 1D contour should have a clear ordering of edgels, to which all graph edges adhere, and no long range entanglements persist. Finding the contour grouping by optimizing these topological criteria is challenging. We introduce a novel concept of circular embedding to encode this combinatorial task. Our solution leads to computing the dominant complex eigenvectors/ eigenvalues of the random walk matrix of the contour grouping graph. We demonstrate major improvements over state-of-the-art approaches on challenging real images.
Many-to-one Contour Matching for Describing and Discriminating Object Shape
(2010-01-01) Srinivasan, Praveen; Zhu, Qihui; Shi, Jianbo
We present an object recognition system that locates an object, identifies its parts, and segments out its contours. A key distinction of our approach is that we use long, salient, bottom-up image contours to learn object shape, and to achieve object detection with the learned shape. Most learning methods rely on one-to-one matching of contours to a model. However, bottom-up image contours often fragment unpredictably. We resolve this difficulty by using many-to-one matching of image contours to a model. To learn a descriptive object shape model, we combine bottom-up contours from a few representative images. The goal is to allow most of the contours in the training images to be many-to-one matched to the model. For detection, our challenges are inferring the object contours and part locations, in addition to object location. Because the locations of object parts and matches of contours are not annotated, they appear as latent variables during training. We use the latent SVM learning formulation to discriminatively tune the many-to-one matching score using the max-margin criterion. We evaluate on the challenging ETHZ shape categories dataset and outperform all existing methods
Discrimintive Image Warping with Attribute Flow
(2011-01-01) Shi, Jianbo; Zhang, Weiyu; Srinivasan, Praveen
We address the problem of finding deformation between two images for the purpose of recognizing objects. The challenge is that discriminative features are often transformation-variant (e.g. histogram of oriented gradients, texture), while transformation-invariant features (e.g. intensity, color) are often not discriminative. We introduce the concept of attribute flow which explicitly models how image attributes vary with its deformation. We develop a non-parametric method to approximate this using histogram matching, which can be solved efficiently using linear programming. Our method produces dense correspondence between images, and utilizes discriminative, transformation-variant features for simultaneous detection and alignment. Experiments on ETHZ shape categories dataset show that we can accurately recognize highly deformable objects with few training examples.
Recognizing objects by piecing together the Segmentation Puzzle
(2007-01-01) Cour, Timothee; Shi, Jianbo
We present an algorithm that recognizes objects of a given category using a small number of hand segmented images as references. Our method first over segments an input image into superpixels, and then finds a shortlist of optimal combinations of superpixels that best fit one of template parts, under affine transformations. Second, we develop a contextual interpretation of the parts, gluing image segments using top-down fiducial points, and checking overall shape similarity. In contrast to previous work, the search for candidate superpixel combinations is not exponential in the number of segments, and in fact leads to a very efficient detection scheme. Both the storage and the detection of templates only require space and time proportional to the length of the template boundary, allowing us to store potentially millions of templates, and to detect a template anywhere in a large image in roughly 0.01 seconds. We apply our algorithm on the Weizmann horse database, and show our method is comparable to the state of the art while offering a simpler and more efficient alternative compared to previous work.

Shi, Jianbo

Email Address

ORCID

Disciplines

Research Projects

Organizational Units

Position

Introduction

Research Interests

Filters

Author

Subject

Date

Type

Publication Type

Settings

Sort By

Results per page

Search Results

Usage statistics

Penn's Heritage