Lee, Daniel D

Search Results

Now showing 1 - 10 of 21
  • Publication
    Bayesian L1-Norm Sparse Learning
    (2006-05-19) Lee, Daniel D; Lin, Yuanqing
    We propose a Bayesian framework for learning the optimal regularization parameter in the L1-norm penalized least-mean-square (LMS) problem, also known as LASSO [1] or basis pursuit [2]. The setting of the regularization parameter is critical for deriving a correct solution. In most existing methods, the scalar regularization parameter is determined heuristically; in contrast, our approach infers the optimal regularization setting under a Bayesian framework. Furthermore, Bayesian inference enables an independent regularization scheme in which each coefficient (or weight) is associated with its own regularization parameter. Simulations illustrate how our method improves the discovery of sparse structure in noisy data.
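
    As a point of reference for the objective being regularized, below is a minimal iterative soft-thresholding (ISTA) sketch for the L1-penalized least-squares problem with a hand-picked scalar regularization weight `lam`. The paper's contribution, inferring that weight (one per coefficient) under a Bayesian model, is not implemented here, and all data and parameters are toy assumptions.

    ```python
    import numpy as np

    def ista_lasso(X, y, lam, n_iter=500):
        """Minimize 0.5*||y - X w||^2 + lam*||w||_1 by iterative soft-thresholding."""
        L = np.linalg.norm(X, 2) ** 2               # Lipschitz constant of the gradient
        w = np.zeros(X.shape[1])
        for _ in range(n_iter):
            z = w - X.T @ (X @ w - y) / L           # gradient step on the smooth term
            w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
        return w

    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 20))
    w_true = np.zeros(20)
    w_true[[2, 7, 11]] = [1.5, -2.0, 0.8]           # sparse ground truth
    y = X @ w_true + 0.05 * rng.standard_normal(50)
    print(np.round(ista_lasso(X, y, lam=0.5), 2))   # nonzeros near indices 2, 7, 11
    ```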
  • Publication
    Grassmann Discriminant Analysis: a Unifying View on Subspace-Based Learning
    (2008-07-06) Hamm, Jihun; Lee, Daniel D
    In this paper we propose a discriminant learning framework for problems in which data consist of linear subspaces instead of vectors. By treating subspaces as basic elements, we can make learning algorithms adapt naturally to problems with linear invariant structures. We propose a unifying view of subspace-based learning by formulating the problems on the Grassmann manifold, the set of fixed-dimensional linear subspaces of a Euclidean space. Previous methods on this problem typically adopt an inconsistent strategy: feature extraction is performed in the Euclidean space while non-Euclidean distances are used. In our approach, we treat each subspace as a point in the Grassmann space and perform feature extraction and classification in the same space. We show the feasibility of the approach using Grassmann kernel functions such as the Projection kernel and the Binet-Cauchy kernel. Experiments with real image databases show that the proposed method performs well compared with state-of-the-art algorithms.
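
    A minimal sketch of the Projection kernel named above, treating each subspace as an orthonormal basis matrix (a point on the Grassmannian); the Binet-Cauchy kernel would use det(Y1^T Y2)^2 instead. The subspaces here are random toys, and any kernel classifier could consume the resulting values.

    ```python
    import numpy as np

    def orthonormal_basis(A):
        """Orthonormal basis for the column span of A (a point on the Grassmannian)."""
        Q, _ = np.linalg.qr(A)
        return Q

    def projection_kernel(Y1, Y2):
        """k(Y1, Y2) = ||Y1^T Y2||_F^2 for orthonormal bases Y1, Y2."""
        return np.linalg.norm(Y1.T @ Y2, "fro") ** 2

    rng = np.random.default_rng(1)
    Y1 = orthonormal_basis(rng.standard_normal((10, 3)))  # 3-dim subspace of R^10
    Y2 = orthonormal_basis(rng.standard_normal((10, 3)))
    print(projection_kernel(Y1, Y1))  # equals the subspace dimension, 3.0
    print(projection_kernel(Y1, Y2))  # strictly smaller for distinct subspaces
    ```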
  • Publication
    Multiplicative Updates for Large Margin Classifiers
    (2003-08-24) Saul, Lawrence K; Sha, Fei; Lee, Daniel D
    Various problems in nonnegative quadratic programming arise in the training of large margin classifiers. We derive multiplicative updates for these problems that converge monotonically to the desired solutions for hard and soft margin classifiers. The updates differ strikingly in form from other multiplicative updates used in machine learning. In this paper, we provide complete proofs of convergence for these updates and extend previous work to incorporate sum and box constraints in addition to nonnegativity.
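
    As an illustration of the connection, a hard-margin SVM dual without a bias term is exactly a nonnegative quadratic program, min_a 0.5 a^T Q a - 1^T a with a >= 0. The toy sketch below applies a multiplicative update of the kind described above to such a dual; the sum constraint a bias term would add (which the paper also handles) is omitted for brevity, and the data are invented.

    ```python
    import numpy as np

    # toy linearly separable data
    X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
    y = np.array([1.0, 1.0, -1.0, -1.0])
    Q = np.outer(y, y) * (X @ X.T)                 # Gram matrix of the dual
    b = -np.ones(4)
    Qp, Qm = np.maximum(Q, 0), np.maximum(-Q, 0)   # split Q = Qp - Qm

    a = 0.1 * np.ones(4)                           # positive initialization
    for _ in range(500):
        a *= (-b + np.sqrt(b * b + 4 * (Qp @ a) * (Qm @ a))) / (2 * (Qp @ a))

    print("dual coefficients:", np.round(a, 3))    # mass on the margin points
    print("weight vector:", (a * y) @ X)           # separating direction
    ```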
  • Publication
    Relevant Deconvolution For Acoustic Source Estimation
    (2005-03-18) Lin, Yuanqing; Lee, Daniel D
    We describe a robust deconvolution algorithm for simultaneously estimating an acoustic source signal and convolutive filters associated with the acoustic room impulse responses from a pair of microphone signals. In contrast to conventional blind deconvolution techniques which rely upon a knowledge of the statistics of the source signal, our algorithm exploits the nonnegativity and sparsity structure of room impulse responses. The algorithm is formulated as a quadratic optimization problem with respect to both the source signal and filter coefficients, and proceeds by iteratively solving the optimization in two alternating steps. In the H-step, the nonnegative filter coefficients are optimally estimated within a Bayesian framework using a relevant set of regularization parameters. In the S-step, the source signal is estimated without any prior assumption on its statistical distribution. The resulting estimates converge to a relevant solution exhibiting appropriate sparseness in the filters. Simulation results indicate that the algorithm is able to precisely recover both the source signal and filter coefficients, even in the presence of large ambient noise.
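
    A toy sketch of the two alternating steps on a simplified two-channel problem, with scipy's NNLS solver standing in for the paper's Bayesian H-step (no per-coefficient regularization, so identifiability is weaker than in the paper); the printed joint squared residual is non-increasing because each step solves its subproblem exactly.

    ```python
    import numpy as np
    from scipy.linalg import toeplitz
    from scipy.optimize import nnls

    def conv_mat(x, n_cols):
        """M such that M @ v == np.convolve(x, v)[:len(x)] for v of length n_cols."""
        row = np.zeros(n_cols)
        row[0] = x[0]
        return toeplitz(x, row)

    rng = np.random.default_rng(2)
    N, L = 120, 6
    s = rng.standard_normal(N)                    # true source signal
    h1 = np.zeros(L); h1[[0, 2]] = [1.0, 0.4]     # sparse nonnegative filters
    h2 = np.zeros(L); h2[[1, 4]] = [0.9, 0.3]
    y1, y2 = np.convolve(s, h1)[:N], np.convolve(s, h2)[:N]

    s_est = y1.copy()                             # crude initialization
    for it in range(15):
        # H-step: nonnegative least squares for each filter, given the source
        Cs = conv_mat(s_est, L)
        h1_est, _ = nnls(Cs, y1)
        h2_est, _ = nnls(Cs, y2)
        # S-step: unconstrained least squares for the source, given both filters
        T = np.vstack([conv_mat(np.r_[h1_est, np.zeros(N - L)], N),
                       conv_mat(np.r_[h2_est, np.zeros(N - L)], N)])
        s_est = np.linalg.lstsq(T, np.r_[y1, y2], rcond=None)[0]
        print(it, np.sum((T @ s_est - np.r_[y1, y2]) ** 2))  # non-increasing
    ```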
  • Publication
    Multiplicative Updates for Nonnegative Quadratic Programming
    (2007-01-01) Lin, Yuanqing; Sha, Fei; Lee, Daniel D; Saul, Lawrence K
    Many problems in neural computation and statistical learning involve optimizations with nonnegativity constraints. In this article, we study convex problems in quadratic programming where the optimization is confined to an axis-aligned region in the nonnegative orthant. For these problems, we derive multiplicative updates that improve the value of the objective function at each iteration and converge monotonically to the global minimum. The updates have a simple closed form and do not involve any heuristics or free parameters that must be tuned to ensure convergence. Despite their simplicity, they differ strikingly in form from other multiplicative updates used in machine learning. We provide complete proofs of convergence for these updates and describe their application to problems in signal processing and pattern recognition.
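
    A sketch of a closed-form multiplicative update of this type for min_v 0.5 v^T A v + b^T v with v >= 0, splitting A into its positive and negative parts; the printed objective illustrates the monotone convergence described above. The example matrix is invented, and clipping for box constraints appears only as a comment, a simplification rather than the article's exact treatment.

    ```python
    import numpy as np

    A = np.array([[3.0, -1.0], [-1.0, 2.0]])       # positive definite
    b = np.array([-1.0, -2.0])
    Ap, Am = np.maximum(A, 0), np.maximum(-A, 0)   # split A = Ap - Am

    v = np.ones(2)                                 # positive initialization
    for it in range(50):
        v *= (-b + np.sqrt(b * b + 4 * (Ap @ v) * (Am @ v))) / (2 * (Ap @ v))
        # for a box constraint, one could additionally clip: v = np.minimum(v, upper)
        if it % 10 == 0:
            print(it, 0.5 * v @ A @ v + b @ v)     # objective is non-increasing
    print("solution:", v)                          # here the constraint is inactive,
    print("unconstrained optimum:", np.linalg.solve(A, -b))  # so these should agree
    ```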
  • Publication
    An Information Maximization Approach to Overcomplete and Recurrent Representations
    (2000-11-27) Shriki, Oren; Sompolinsky, Haim; Lee, Daniel D
    The principle of maximizing mutual information is applied to learning overcomplete and recurrent representations. The underlying model consists of a network of input units driving a larger number of output units with recurrent interactions. In the limit of zero noise, the network is deterministic, and the mutual information can be related to the entropy of the output units. Maximizing this entropy with respect to both the feedforward connections and the recurrent interactions results in simple learning rules for both sets of parameters. The conventional independent components analysis (ICA) learning algorithm can be recovered as the special case where the number of output units equals the number of input units and there are no recurrent connections. The application of these new learning rules is illustrated on a simple two-dimensional input example.
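
    A minimal natural-gradient infomax sketch for the square, no-recurrence special case mentioned above (ICA with a tanh nonlinearity, suitable for super-Gaussian sources); the overcomplete and recurrent setting that is the paper's actual subject is not covered by this toy, and the mixing matrix and step size are assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    S = rng.laplace(size=(2, 5000))            # two super-Gaussian sources
    A = np.array([[1.0, 0.6], [0.4, 1.0]])     # toy mixing matrix
    X = A @ S                                  # observed mixtures

    W = np.eye(2)                              # unmixing matrix to learn
    eta = 0.01
    for _ in range(2000):
        U = W @ X                              # current source estimates
        # natural-gradient entropy-maximization (infomax) update
        W += eta * (np.eye(2) - np.tanh(U) @ U.T / X.shape[1]) @ W

    print(np.round(W @ A, 2))  # near a scaled permutation when unmixing succeeds
    ```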
  • Publication
    Nonnegative deconvolution for time of arrival estimation
    (2004-05-17) Lin, Yuanqing; Lee, Daniel D; Saul, Lawrence K
    The interaural time difference (ITD) of arrival is a primary cue for acoustic sound source localization. Traditional estimation techniques for ITD based upon cross-correlation are related to maximum-likelihood estimation of a simple generative model. We generalize the time difference estimation into a deconvolution problem with nonnegativity constraints. The resulting nonnegative least squares optimization can be efficiently solved using a novel iterative algorithm with guaranteed global convergence properties. We illustrate the utility of this algorithm using simulations and experimental results from a robot platform.
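
    A toy sketch of delay estimation posed as nonnegative deconvolution, with scipy's NNLS solver standing in for the paper's iterative multiplicative algorithm: the columns of X are shifted copies of the reference signal, and the recovered nonnegative filter should concentrate its mass at the true delay.

    ```python
    import numpy as np
    from scipy.linalg import toeplitz
    from scipy.optimize import nnls

    rng = np.random.default_rng(5)
    N, max_delay, true_delay = 300, 20, 7
    x = rng.standard_normal(N)                        # reference microphone signal
    y = np.r_[np.zeros(true_delay), 0.8 * x[:N - true_delay]]  # delayed, attenuated copy
    y += 0.05 * rng.standard_normal(N)                # ambient noise

    row = np.zeros(max_delay + 1)
    row[0] = x[0]
    X = toeplitz(x, row)                              # columns are shifts of x
    h, _ = nnls(X, y)                                 # min ||y - X h||^2 s.t. h >= 0
    print("estimated delay:", int(np.argmax(h)))      # expect 7
    ```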
  • Publication
    Learning a Manifold-Constrained Map between Image Sets: Applications to Matching and Pose Estimation
    (2006-06-01) Ham, Jihun; Ahn, Ikkjin; Lee, Daniel D
    This paper proposes a method for matching two sets of images, given a small number of training examples, by exploiting the underlying structure of the image manifolds. A nonlinear map from one manifold to another is constructed by combining linear maps locally defined on the tangent spaces of the manifolds. This construction imposes strong constraints on the choice of the maps and makes possible good generalization of correspondences across the image sets. The map is flexible enough to approximate an arbitrary diffeomorphism between manifolds and can serve many purposes in applications. The underlying algorithm is a non-iterative, efficient procedure whose complexity depends mainly on the number of matched training examples and the dimensionality of the manifold, not on the number of samples or the dimensionality of the images. Several experiments were performed to demonstrate the potential of our method in image analysis and pose estimation. The first example demonstrates how images from a rotating camera can be mapped to the underlying pose manifold. Second, computer-generated images of articulating toy figures are matched using the underlying 4-dimensional manifold to generate image-driven animations. Finally, two sets of actual lip images during speech are matched by their appearance manifold. In all these cases, our algorithm obtains reasonable matches between thousands of high-dimensional images with a minimum of computation.
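
    A crude stand-in for the idea of stitching together locally defined linear maps between corresponded data sets: for each query, fit an affine map on its k nearest training pairs and apply it. The paper constrains its maps to the manifolds' tangent spaces and combines them with stronger guarantees; this sketch, with invented circle-shaped "manifolds", omits all of that.

    ```python
    import numpy as np

    def local_linear_map(X_train, Y_train, x_query, k=10):
        """Map x_query by an affine fit on its k nearest corresponded pairs."""
        d = np.linalg.norm(X_train - x_query, axis=1)
        idx = np.argsort(d)[:k]
        Xk = np.c_[X_train[idx], np.ones(k)]          # local affine design matrix
        W = np.linalg.lstsq(Xk, Y_train[idx], rcond=None)[0]
        return np.r_[x_query, 1.0] @ W

    # toy "manifolds": a circle mapped to a rotated, scaled circle
    t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
    X = np.c_[np.cos(t), np.sin(t)]                    # source samples
    Y = 2.0 * np.c_[np.cos(t + 0.5), np.sin(t + 0.5)]  # corresponded target samples

    q = np.array([np.cos(1.0), np.sin(1.0)])
    print(np.round(local_linear_map(X, Y, q), 3))      # ~ 2*[cos(1.5), sin(1.5)]
    ```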
  • Publication
    Online, self-supervised terrain classification via discriminatively trained submodular Markov random fields
    (2008-05-19) Vernaza, Paul; Taskar, Ben; Lee, Daniel D
    We present a novel approach to the task of autonomous terrain classification based on structured prediction. We consider the problem of learning a classifier that will accurately segment an image into "obstacle" and "ground" patches based on supervised input. Previous approaches to this problem have focused mostly on local appearance; typically, a classifier is trained and evaluated on a pixel-by-pixel basis, making an implicit assumption of independence in local pixel neighborhoods. We relax this assumption by modeling correlations between pixels in the submodular MRF framework. We show how both the learning and inference tasks can be simply and efficiently implemented: exact inference via an efficient max-flow computation, and learning via an averaged-subgradient method. Unlike most comparable MRF-based approaches, our method is suitable for real-time implementation on a robot. Experimental results demonstrate a marked increase in classification accuracy over standard methods, in addition to real-time performance.
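
    A minimal sketch of the exact-inference half of this pipeline: MAP labeling of a binary submodular MRF via a single min-cut, here with networkx standing in for a real-time max-flow solver and synthetic per-pixel scores standing in for a trained classifier (the learning step is not shown).

    ```python
    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(6)
    H, W, smooth = 8, 8, 1.5
    truth = np.zeros((H, W), int)
    truth[:, W // 2:] = 1                              # left: ground, right: obstacle
    score = truth + 0.8 * rng.standard_normal((H, W))  # noisy per-pixel scores

    G = nx.DiGraph()
    for i in range(H):
        for j in range(W):
            p = (i, j)
            G.add_edge("s", p, capacity=(score[i, j] - 1.0) ** 2)  # cost of label 1
            G.add_edge(p, "t", capacity=score[i, j] ** 2)          # cost of label 0
            for q in ((i + 1, j), (i, j + 1)):         # 4-neighbour Potts terms
                if q[0] < H and q[1] < W:
                    G.add_edge(p, q, capacity=smooth)
                    G.add_edge(q, p, capacity=smooth)

    _, (src_side, _) = nx.minimum_cut(G, "s", "t")     # exact MAP via max flow
    labels = np.array([[0 if (i, j) in src_side else 1 for j in range(W)]
                       for i in range(H)])
    print("pixel accuracy:", (labels == truth).mean())
    ```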
  • Publication
    Multiplicative Updates for Classification by Mixture Models
    (2001-12-03) Saul, Lawrence K; Lee, Daniel D
    We investigate a learning algorithm for the classification of nonnegative data by mixture models. Multiplicative update rules are derived that directly optimize the performance of these models as classifiers. The update rules have a simple closed form and an intuitive appeal. Our algorithm retains the main virtues of the Expectation-Maximization (EM) algorithm—its guarantee of monotonic improvement, and its absence of tuning parameters—with the added advantage of optimizing a discriminative objective function. The algorithm reduces as a special case to the method of generalized iterative scaling for log-linear models. The learning rate of the algorithm is controlled by the sparseness of the training data. We use the method of nonnegative matrix factorization (NMF) to discover sparse distributed representations of the data. This form of feature selection greatly accelerates learning and makes the algorithm practical on large problems. Experiments show that discriminatively trained mixture models lead to much better classification than comparably sized models trained by EM.
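
    A minimal sketch of the squared-error NMF multiplicative updates used here for feature discovery (the discriminative mixture-model updates themselves are not reproduced); the data matrix is random, and the small constant guards against division by zero.

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    V = rng.random((50, 80))                      # nonnegative data matrix
    r = 5                                         # number of NMF features
    W, Hm = rng.random((50, r)), rng.random((r, 80))

    for _ in range(200):
        Hm *= (W.T @ V) / (W.T @ W @ Hm + 1e-12)  # multiplicative update for H
        W *= (V @ Hm.T) / (W @ Hm @ Hm.T + 1e-12) # multiplicative update for W

    print("reconstruction error:", np.linalg.norm(V - W @ Hm))
    ```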