Matrix Factorization Under Contamination
Discipline
Subject
robust statistics
Computer Sciences
Abstract
In the nonnegative matrix factorization problem, the user inputs a nonnegative matrix V and wants to factor it into a product WH, with both W and H nonnegative. Standard factorization techniques make an unrealistic assumption about the noise present in the data: that it is generated by an independent and identically distributed Gaussian process. Real-world datasets are unlikely to satisfy this simplistic assumption. In particular, they suffer from contamination, anomalies, and outliers that cannot be modeled by simple Gaussian distributions. In this dissertation, we discuss novel techniques for matrix factorization under contamination and non-standard noise models. These techniques can be used either as a replacement for a standard factorization algorithm or as an independent contamination detection procedure. We also prove a number of complexity bounds on the hardness of the problem.
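For readers unfamiliar with the baseline problem, the following is a minimal sketch of a standard nonnegative factorization, not the techniques developed in the dissertation. It assumes the well-known Lee-Seung multiplicative updates for the squared Frobenius loss, which is the loss that corresponds to the i.i.d. Gaussian noise assumption criticized above. The function name `nmf_multiplicative`, its parameters, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate V ~= W @ H with W, H >= 0 (hypothetical helper).

    Uses Lee-Seung multiplicative updates for the squared Frobenius
    loss ||V - WH||_F^2, i.e. the standard Gaussian-noise baseline.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic, uncontaminated data with an exact rank-3 nonnegative factorization.
data_rng = np.random.default_rng(1)
V = data_rng.random((30, 3)) @ data_rng.random((3, 20))

W, H = nmf_multiplicative(V, rank=3)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Because the squared loss penalizes large residuals heavily, a few grossly corrupted entries of V can pull W and H away from the clean structure, which is the failure mode that motivates factorization under contamination.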