Multiplicative Updates for Classification by Mixture Models

Penn collection
Departmental Papers (CIS)
Abstract

We investigate a learning algorithm for the classification of nonnegative data by mixture models. Multiplicative update rules are derived that directly optimize the performance of these models as classifiers. The update rules have a simple closed form and an intuitive appeal. Our algorithm retains the main virtues of the Expectation-Maximization (EM) algorithm—its guarantee of monotonic improvement, and its absence of tuning parameters—with the added advantage of optimizing a discriminative objective function. The algorithm reduces as a special case to the method of generalized iterative scaling for log-linear models. The learning rate of the algorithm is controlled by the sparseness of the training data. We use the method of nonnegative matrix factorization (NMF) to discover sparse distributed representations of the data. This form of feature selection greatly accelerates learning and makes the algorithm practical on large problems. Experiments show that discriminatively trained mixture models lead to much better classification than comparably sized models trained by EM.
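
The abstract mentions nonnegative matrix factorization (NMF) as the feature-selection step. As a rough illustration of what a multiplicative update looks like, the sketch below uses the standard Lee-Seung NMF updates for squared reconstruction error. It is not the paper's discriminative update rule for mixture models, which this record does not state. The function name `nmf_multiplicative`, its parameters, and the random test matrix are all hypothetical choices made for this example.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (n x m) as W @ H with W, H >= 0.

    Uses the standard Lee-Seung multiplicative updates for the squared
    Frobenius error. Illustrative only: this is the generic NMF step,
    not the classification updates described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialization (an assumption of this sketch).
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    errors = []
    for _ in range(n_iter):
        # Each factor is rescaled elementwise by a nonnegative ratio,
        # so nonnegativity is preserved without any step-size parameter.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H) ** 2)
    return W, H, errors

# Hypothetical nonnegative data: 20 examples with 30 features each.
rng = np.random.default_rng(1)
V = rng.random((20, 30))
W, H, errors = nmf_multiplicative(V, rank=5)
```

Like EM, these updates need no learning rate, and the recorded reconstruction error does not increase from one iteration to the next.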

Date of presentation
2001-12-03
Conference name
Neural Information Processing Systems (NIPS)
Conference dates
2001-12-03 to 2001-12-08
Comments
Copyright MIT Press. Postprint version. Published in Advances in Neural Information Processing Systems 14, Volume 2, pages 897-904. Proceedings of the 15th annual Neural Information Processing Systems (NIPS) conference, held in British Columbia, Canada, from 3-8 December 2001.