Regularized learning with feature networks
In this thesis, we present Regularized Learning with Feature Networks (RLFN), an approach for regularizing the feature weights of a regression model when one has prior beliefs about which regression coefficients are similar in value. These similarities are specified as a feature network whose semantic interpretation is that each feature coefficient should both lead to accurate predictions and be close to the average of its network neighbors. That is, during learning, RLFN seeks a model with low training error whose feature weights vary smoothly over the network. RLFN induces a Gaussian prior over the feature weights and can be viewed as a soft variant of the dimensionality reduction technique Locally Linear Embedding (LLE). RLFN differs from LLE by shrinking weights towards a low-dimensional manifold rather than constraining them to lie on such a manifold.

Experiments on synthetic data show that if the feature network is sparse, RLFN copes more robustly than LLE with "bad edges" that connect weights which should not be similar. Experiments on MRI brain images show that RLFN combined with an L1 penalty yields 92% accuracy in discriminating the scans of Alzheimer's and frontotemporal lobar degeneration patients. Here, RLFN yields superior performance to the elastic net, another technique for sparse learning, which combines L1 and L2 regularization. On standard learning benchmarks such as document and handwritten digit classification, RLFN yields consistently better performance than ridge regression. On document classification, RLFN performs comparably to two semi-supervised learning methods.

Extensions to RLFN allow for the modeling of disjoint classes of features whose weights are believed to be Gaussian distributed around shared but unknown means. These extensions yield improved performance on sentiment analysis of online product reviews and on extracting mentions of genes from biomedical abstracts.
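The core idea above can be illustrated with a small sketch. Writing S for the row-normalized adjacency matrix of the feature network, the requirement that each weight be close to the average of its neighbors corresponds to a penalty of the form ||(I - S)w||^2 added to the squared training error, giving a closed-form ridge-like solution. The code below is a minimal, hypothetical illustration under these assumptions, not the dissertation's implementation; all variable names are invented for the example.

```python
import numpy as np

# Hypothetical sketch of an RLFN-style penalty: each weight is pulled
# toward the average of its neighbors in a feature network.
rng = np.random.default_rng(0)
n, d = 50, 6
X = rng.normal(size=(n, d))

# Feature network: a chain 0-1-2-3-4-5, so adjacent coefficients
# are believed to be similar in value.
A = np.zeros((d, d))
for i in range(d - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
S = A / A.sum(axis=1, keepdims=True)  # row-normalized adjacency

# True weights vary smoothly over the chain, matching the prior.
w_true = np.array([1.0, 1.1, 1.2, 1.3, 1.4, 1.5])
y = X @ w_true + 0.1 * rng.normal(size=n)

# Minimize ||y - Xw||^2 + lam * ||(I - S)w||^2 in closed form.
lam = 10.0
M = np.eye(d) - S
w_hat = np.linalg.solve(X.T @ X + lam * M.T @ M, X.T @ y)
```

Because the penalty only shrinks weights toward the network-smooth manifold rather than projecting onto it, the data can still pull individual weights away from their neighbors' average, which is what gives the method its robustness to bad edges.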
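The class-based extension mentioned above can also be sketched briefly. If features are partitioned into disjoint classes whose weights are Gaussian around a shared, unknown per-class mean, then minimizing over those means yields a within-class variance penalty. The helper below is an illustrative assumption of that form, not the dissertation's code; the function name and data are invented.

```python
import numpy as np

def class_variance_penalty(w, classes):
    """Sum over classes of squared deviations of each weight from its
    class mean; minimizing over unknown class means gives this form."""
    penalty = 0.0
    for idx in classes:
        wc = w[idx]
        penalty += float(np.sum((wc - wc.mean()) ** 2))
    return penalty

# Two hypothetical feature classes: one cluster of positive weights,
# one cluster of negative weights.
w = np.array([1.0, 1.2, 0.9, -2.0, -2.1])
classes = [np.array([0, 1, 2]), np.array([3, 4])]
penalty = class_variance_penalty(w, classes)
```

Weights within a class are shrunk toward each other, while the class means themselves are left free, which suits tasks where groups of related features (e.g., sentiment words of the same polarity) should share a common effect size.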
Sandler, S. Ted, "Regularized learning with feature networks" (2009). Dissertations available from ProQuest. AAI3414215.