Shi, Jianbo

  • Publication
    Discriminative Image Warping with Attribute Flow
    (2011-01-01) Shi, Jianbo; Zhang, Weiyu; Srinivasan, Praveen
    We address the problem of finding deformation between two images for the purpose of recognizing objects. The challenge is that discriminative features are often transformation-variant (e.g. histogram of oriented gradients, texture), while transformation-invariant features (e.g. intensity, color) are often not discriminative. We introduce the concept of attribute flow, which explicitly models how image attributes vary with the image's deformation. We develop a non-parametric method to approximate this using histogram matching, which can be solved efficiently using linear programming. Our method produces dense correspondence between images, and utilizes discriminative, transformation-variant features for simultaneous detection and alignment. Experiments on the ETHZ shape categories dataset show that we can accurately recognize highly deformable objects with few training examples.
  • Publication
    Recognizing objects by piecing together the Segmentation Puzzle
    (2007-01-01) Cour, Timothee; Shi, Jianbo
    We present an algorithm that recognizes objects of a given category using a small number of hand-segmented images as references. Our method first over-segments an input image into superpixels, and then finds a shortlist of optimal combinations of superpixels that best fit one of the template parts, under affine transformations. Second, we develop a contextual interpretation of the parts, gluing image segments using top-down fiducial points, and checking overall shape similarity. In contrast to previous work, the search for candidate superpixel combinations is not exponential in the number of segments, and in fact leads to a very efficient detection scheme. Both the storage and the detection of templates only require space and time proportional to the length of the template boundary, allowing us to store potentially millions of templates, and to detect a template anywhere in a large image in roughly 0.01 seconds. We apply our algorithm on the Weizmann horse database, and show our method is comparable to the state of the art while offering a simpler and more efficient alternative compared to previous work.
  • Publication
    Object Recognition using Boosted Discriminants
    (2001-12-08) Mahamud, Shyjan; Hebert, Martial; Shi, Jianbo
    We approach the task of object discrimination as that of learning efficient "codes" for each object class in terms of responses to a set of chosen discriminants. We formulate this approach in an energy minimization framework. The "code" is built incrementally by successively constructing discriminants that focus on pairs of training images of objects that are currently hard to classify. The particular discriminants that we use partition the set of objects of interest into two well-separated groups. We find the optimal discriminant as well as partition by formulating an objective criterion that measures the well-separateness of the partition. We derive an iterative solution that alternates between the solutions for two generalized eigenproblems, one for the discriminant parameters and the other for the indicator variables denoting the partition. We show how the optimization can easily be biased to focus on hard-to-classify pairs, which enables us to choose new discriminants one by one in a sequential manner. We validate our approach on a challenging face discrimination task using parts as features and show that it compares favorably with the performance of an eigenspace method.
  • Publication
    Shape from Shading: Recognizing the Mountains through a Global View
    (2006-06-01) Zhu, Qihui; Shi, Jianbo
    Resolving local ambiguities is an important issue for shape from shading (SFS). Pixel ambiguities of SFS can be eliminated by propagation approaches. However, patch ambiguities still exist. Therefore, we formulate the global disambiguation problem to resolve these ambiguities. Intuitively, it can be interpreted as flipping patches and adjusting heights such that the resulting surface has no kinks. The problem is intractable because exponentially many possible configurations need to be checked. Alternatively, we solve the integrability testing problem closely related to the original one. It can be viewed as finding a surface which satisfies the global integrability constraint. To encode the constraints, we introduce a graph formulation called configuration graph. Searching the solution on this graph can be reduced to a Max-cut problem and its solution is computable using semidefinite programming (SDP) relaxation. Tests carried out on synthetic and real images show that the global disambiguation works well for complex shapes.
  • Publication
    Detecting Unusual Activity in Video
    (2004-06-27) Zhong, Hua; Shi, Jianbo; Visontai, Mirko
    We present an unsupervised technique for detecting unusual activity in a large video set using many simple features. No complex activity models and no supervised feature selections are used. We divide the video into equal length segments and classify the extracted features into prototypes, from which a prototype–segment co-occurrence matrix is computed. Motivated by a similar problem in document-keyword analysis, we seek a correspondence relationship between prototypes and video segments which satisfies the transitive closure constraint. We show that an important sub-family of correspondence functions can be reduced to co-embedding prototypes and segments to N-D Euclidean space. We prove that an efficient, globally optimal algorithm exists for the co-embedding problem. Experiments on various real-life videos have validated our approach.
  • Publication
    Spectral Segmentation with Multiscale Graph Decomposition
    (2005-06-20) Cour, Timothée; Bénézit, Florence; Shi, Jianbo
    We present a multiscale spectral image segmentation algorithm. In contrast to most multiscale image processing, this algorithm works on multiple scales of the image in parallel, without iteration, to capture both coarse and fine level details. The algorithm is computationally efficient, allowing large images to be segmented. We use the Normalized Cut graph partitioning framework of image segmentation. We construct a graph encoding pairwise pixel affinity, and partition the graph for image segmentation. We demonstrate that large image graphs can be compressed into multiple scales capturing image structure at increasingly large neighborhoods. We show that the decomposition of the image segmentation graph into different scales can be determined by ecological statistics on the image grouping cues. Our segmentation algorithm works simultaneously across the graph scales, with an inter-scale constraint to ensure communication and consistency between the segmentations at each scale. As the results show, we incorporate long-range connections with linear-time complexity, providing high-quality segmentations efficiently. Images that previously could not be processed because of their size have been accurately segmented thanks to this method.
  • Publication
    Multi-hypothesis Motion Planning for Visual Object Tracking
    (2011-01-01) Gong, Haefong; Sim, Jack; Likhachev, Maxim; Shi, Jianbo
    In this paper, we propose a long-term motion model for visual object tracking. In crowded street scenes, persistent occlusions are a frequent challenge for tracking algorithms, and a robust, long-term motion model could help in these situations. Motivated by progress in robot motion planning, we propose to construct a set of ‘plausible’ plans for each person, which are composed of multiple long-term motion prediction hypotheses that do not include redundancies, unnecessary loops or collisions with other objects. Constructing plausible plans is the key step in utilizing motion planning in object tracking, which has not been fully investigated in robot motion planning. We propose a novel method of efficiently constructing disjoint plans in different homotopy classes, based on winding numbers and winding angles of planned paths around all obstacles. As the goals can be specified by winding numbers and winding angles, we can avoid redundant plans in the same homotopy class and multiple whirls or loops around a single obstacle. We test our algorithm on a challenging, real-world dataset, and compare our algorithm with Linear Trajectory Avoidance and a simplified linear planning model. We find that our algorithm outperforms both algorithms in most sequences.
  • Publication
    Grouping Contours Via a Related Image
    (2008-01-01) Srinivasan, Praveen; Wang, Liming; Shi, Jianbo
    Contours have been established in the biological and computer vision literature as a compact yet descriptive representation of object shape. While individual contours provide structure, they lack the large spatial support of region segments (which lack internal structure). We present a method for further grouping of contours in an image using their relationship to the contours of a second, related image. Stereo, motion, and similarity all provide cues that can aid this task; contours that have similar transformations relating them to their matching contours in the second image likely belong to a single group. To find matches for contours, we rely only on shape, which applies directly to all three modalities without modification, in contrast to the specialized approaches developed for each independently. Visually salient contours are extracted in each image, along with a set of candidate transformations for aligning subsets of them. For each transformation, groups of contours with matching shape across the two images are identified to provide a context for evaluating matches of individual contour points across the images. The resulting contexts of contours are used to perform a final grouping on contours in the original image while simultaneously finding matches in the related image, again by shape matching. We demonstrate grouping results on image pairs consisting of stereo, motion, and similar images. Our method also produces qualitatively better results than a baseline method that does not use the inferred contexts.
  • Publication
    Solving Markov Random Fields with Spectral Relaxation
    (2007-01-01) Cour, Timothee; Shi, Jianbo
    Markov Random Fields (MRFs) are used in a large array of computer vision and machine learning applications. Finding the Maximum A Posteriori (MAP) solution of an MRF is in general intractable, and one has to resort to approximate solutions, such as Belief Propagation, Graph Cuts, or more recently, approaches based on quadratic programming. We propose a novel type of approximation, Spectral relaxation to Quadratic Programming (SQP). We show our method offers tighter bounds than recently published work, while at the same time being computationally efficient. We compare our method to other algorithms on random MRFs in various settings.
  • Publication
    Saliency Based Opportunistic Search for Object Part Extraction and Labeling
    (2008-01-01) Wu, Yang; Zhu, Qihui; Shi, Jianbo; Zheng, Nanning
    We study the task of object part extraction and labeling, which seeks to understand objects beyond simply identifying their bounding boxes. We start from bottom-up segmentation of images and search for correspondences between object parts in a few shape models and segments in images. Segments comprising different object parts in the image are usually not equally salient due to uneven contrast, illumination conditions, clutter, occlusion and pose changes. Moreover, object parts may have different scales and some parts are only distinctive and recognizable in a large scale. Therefore, we utilize a multi-scale shape representation of objects and their parts, figural contextual information of the whole object and semantic contextual information for parts. Instead of searching over a large segmentation space, we present a saliency based opportunistic search framework to explore bottom-up segmentation by gradually expanding and bounding the search domain. We tested our approach on a challenging statue face dataset and 3 human face datasets. Results show that our approach significantly outperforms Active Shape Models using far fewer exemplars. Our framework can be applied to other object categories.
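
Illustrative note on the "Multi-hypothesis Motion Planning for Visual Object Tracking" entry: the abstract distinguishes plans by the winding numbers and winding angles of paths around obstacles. The sketch below shows one common way to compute these quantities for a polyline path and a point obstacle. It is not taken from the paper; the function names `winding_angle` and `winding_number`, the point-obstacle model, and the waypoint representation are all assumptions made for illustration.

```python
import math


def winding_angle(path, obstacle):
    """Total signed angle (radians) swept by `path` around `obstacle`.

    Assumptions (illustrative, not from the paper): `path` is a sequence
    of (x, y) waypoints, `obstacle` is a single (x, y) point that no
    waypoint coincides with, and each segment sweeps less than pi.
    Counter-clockwise motion counts as positive.
    """
    ox, oy = obstacle
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        # Vectors from the obstacle to two consecutive waypoints.
        ax, ay = x0 - ox, y0 - oy
        bx, by = x1 - ox, y1 - oy
        # Signed angle between them from the cross and dot products.
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return total


def winding_number(closed_path, obstacle):
    """Net number of counter-clockwise loops of a closed path around `obstacle`."""
    return round(winding_angle(closed_path, obstacle) / (2 * math.pi))


# A closed diamond around the origin, traversed counter-clockwise.
loop = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 0)]

# Two open paths with the same endpoints, passing above and below the origin.
above = [(-1, 0), (0, 1), (1, 0)]
below = [(-1, 0), (0, -1), (1, 0)]
```

With these definitions, `winding_number(loop, (0, 0))` is 1 and the same loop around a point outside it, such as `(5, 5)`, gives 0. The two open paths have winding angles of about `-pi` and `+pi` around the origin; the difference of `2 * pi` reflects that they pass on opposite sides of the obstacle, which is the kind of distinction the abstract uses to avoid redundant plans.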