Fast Linear Algorithms for Machine Learning

Lu, Yichao

Files

Lu_upenngdas_0175C_11613.pdf (696.72 KB)

Degree type

Doctor of Philosophy (PhD)

Graduate group

Applied Mathematics

Subject

canonical correlation analysis
gradient methods
large scale
linear regression
machine learning
Computer Sciences
Statistics and Probability

Copyright date

2015-07-20

Permalink

https://repository.upenn.edu/handle/20.500.14332/27867

Author

Lu, Yichao

Abstract

Linear methods such as regression, Principal Component Analysis and Canonical Correlation Analysis are well understood and widely used by the machine learning community for predictive modeling and feature generation. Broadly, all of these methods aim to capture informative subspaces of the original high-dimensional feature space. Because of their simple linear structure, they all have closed-form solutions, which makes computation and theoretical analysis straightforward for small datasets. In modern machine learning problems, however, it is common for a dataset to have millions or billions of features and samples. In these cases, computing the closed-form solution can be extremely slow, since it requires multiplying two huge matrices and computing the inverse, inverse square root, QR decomposition or Singular Value Decomposition (SVD) of huge matrices. In this thesis, we consider three fast algorithms for computing approximate solutions to regression and Canonical Correlation Analysis on huge datasets.
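
To make the contrast in the abstract concrete, the following minimal sketch compares a closed-form least-squares solve with a plain gradient method that uses only matrix-vector products. It is an illustration under stated assumptions, not one of the three algorithms from the thesis: the function names, the synthetic data, the step size and the iteration count are all hypothetical choices made for this example.

```python
import numpy as np


def ols_closed_form(X, y):
    """Closed-form least squares: solve the normal equations (X^T X) w = X^T y.

    Forming X^T X costs O(n p^2) and the solve costs O(p^3), which is what
    becomes prohibitive when n and p are in the millions.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)


def ols_gradient_descent(X, y, n_iters=500):
    """Minimise 0.5 * ||X w - y||^2 using only matrix-vector products.

    Assumption for this sketch: the step size is 1 / L, where L is the largest
    eigenvalue of X^T X. Here L is computed exactly from the spectral norm,
    which is only reasonable for a small demo; on large data it would have to
    be estimated instead.
    """
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        residual = X @ w - y          # O(n p) per iteration
        w -= step * (X.T @ residual)  # gradient of the squared loss
    return w


# Small synthetic problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
w_true = np.arange(1.0, 6.0)
y = X @ w_true + 0.01 * rng.standard_normal(200)

w_exact = ols_closed_form(X, y)
w_iter = ols_gradient_descent(X, y)
max_gap = np.max(np.abs(w_exact - w_iter))
```

On a well-conditioned problem like this one, the iterative estimate should agree closely with the closed-form solution while never forming or factorising the matrix `X.T @ X` inside the loop. That property is what makes gradient-style methods attractive at large scale.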

Advisor

Dean P. Foster

Date of degree

2015-01-01

Collection

Dissertations and Theses