Sparsity Regularization for Learning Projections and Classification
Speaker: Yongxin (Taylor) Xi
Series: Final Public Orals
Location: Engineering Quadrangle B327
Date/Time: Monday, June 27, 2011, 11:00 a.m. - 1:00 p.m.
This thesis investigates sparsity-inducing regularization in machine learning optimization problems, mainly for projection learning and classification. We derive theoretical properties for a selection of such regularized optimization problems and develop efficient algorithms to solve them. Theoretical and experimental comparisons show that the regularized approaches improve on the original methods in model sparsity, classification accuracy and, in some cases, training speed.
The thesis progresses from projection learning (Separable Principal Component Analysis, projection learning via sparse regression) to classification problems (sparsity-regularized boosting, regularized semi-supervised cluster kernel learning). Each chapter investigates a different problem and is largely independent of the others.
We begin with Separable Principal Component Analysis (SPCA), a regularized alternative to standard PCA for image dimension reduction. SPCA shares the objective of PCA but adds constraints that force the basis to be separable. When applied to a database of two- or three-dimensional images, the proposed SPCA trains faster than PCA and is more resistant to overfitting.
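To make the separability idea concrete, here is a minimal sketch in the spirit of separable bases for 2D images: each 2D basis element is an outer product of a row vector and a column vector, so one PCA-style basis is learned per image mode. This is a generic construction for illustration only, not the exact SPCA formulation defended in the thesis; the function name and interface are ours.

    # Minimal sketch of a separable 2D basis (illustrative, not the thesis's
    # exact SPCA algorithm): learn one basis per image mode, so every 2D basis
    # element is an outer product u v^T of a row vector and a column vector.
    import numpy as np

    def separable_basis(images, r, c):
        """images: array of shape (N, m, n); keep r row and c column components."""
        N, m, n = images.shape
        X = images - images.mean(axis=0)              # center the data
        row_cov = sum(x @ x.T for x in X) / N         # (m, m) row-mode covariance
        col_cov = sum(x.T @ x for x in X) / N         # (n, n) column-mode covariance
        U = np.linalg.eigh(row_cov)[1][:, -r:]        # top r row eigenvectors
        V = np.linalg.eigh(col_cov)[1][:, -c:]        # top c column eigenvectors
        codes = np.einsum('mr,Nmn,nc->Nrc', U, X, V)  # project each image: U^T X V
        return U, V, codes

Because only an m-by-r and an n-by-c factor are stored instead of one mn-by-rc matrix, both training and projection are much cheaper than for vectorized PCA, which is consistent with the speed advantage noted above.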
Next, we investigate a sparse regression framework in which a sparsity constraint selects only a few regressors for the linear regression problem. We combine the virtues of sparse regression with projection methods such as PCA and FDA to construct a discriminative projection that achieves better classification accuracy than using these projections directly.
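As a hedged illustration of the regressor-selection step (the synthetic data and parameter values are ours, and the thesis's projection construction is more involved), the L1 penalty of the Lasso drives most regression coefficients to exactly zero:

    # Sparsity-regularized linear regression: the L1 penalty zeroes out most
    # coefficients, so the surviving ones identify the selected regressors.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 50))       # 200 samples, 50 candidate regressors
    w_true = np.zeros(50)
    w_true[:3] = [2.0, -1.5, 1.0]            # only three regressors truly matter
    y = X @ w_true + 0.1 * rng.standard_normal(200)

    model = Lasso(alpha=0.1).fit(X, y)       # alpha controls the sparsity level
    print("selected regressors:", np.nonzero(model.coef_)[0])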
Boosting algorithms with L1 regularization are of interest because L1 regularization leads to sparser composite classifiers and, in the limit, to a margin-maximizing classifier. By deriving an explicit convergence bound, we reveal the relationship between the regularization parameter and the achieved margin. Since existing algorithms are computationally expensive, we introduce a new hybrid algorithm, AdaBoost+L1, that provably converges to the margin-maximizing solution.
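In one standard formulation (the notation here is ours, not necessarily the thesis's), weak classifiers h_t are combined with nonnegative weights alpha_t, the exponential loss carries an L1 penalty, and the margin is the minimum normalized functional margin over the training set:

    \min_{\alpha \ge 0} \; \sum_{i=1}^{N} \exp\!\Big( -y_i \sum_{t} \alpha_t h_t(x_i) \Big) + \lambda \|\alpha\|_1,
    \qquad
    \rho(\alpha) = \min_{i} \frac{y_i \sum_{t} \alpha_t h_t(x_i)}{\|\alpha\|_1}.

The convergence bound mentioned above ties the choice of the regularization parameter \lambda to the margin \rho achieved, and as the penalty weakens the solution approaches the margin-maximizing combination.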
Finally, we present work on the semi-supervised clustering problem, where a few pairwise constraints are given as prior knowledge to guide cluster formation. Our contribution is a kernel learning framework that incorporates an explicit constraint propagation mechanism, which we call Regularized Cluster Kernel Learning with Low Rank Propagation (CKLRP). After emphasizing the importance of constraint propagation, we show theoretically and empirically that CKLRP learns a low-rank kernel that propagates the constraints effectively.
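To give a feel for what constraint propagation through a low-rank kernel can look like (a generic sketch with invented names and update rules, not the actual CKLRP formulation), one can embed points via a low-rank factorization of a base kernel and let each pairwise constraint move the embedded points, so that nearby unconstrained points are affected as well:

    # Generic sketch of pairwise-constraint propagation in a low-rank kernel
    # (illustrative only; CKLRP's formulation and guarantees are in the thesis).
    import numpy as np

    def propagate_constraints(K, must_link, cannot_link, rank=10, step=0.5):
        """K: (N, N) base kernel; constraints are lists of index pairs (i, j)."""
        vals, vecs = np.linalg.eigh(K)
        # Low-rank embedding Z with K approximately equal to Z Z^T.
        Z = vecs[:, -rank:] * np.sqrt(np.clip(vals[-rank:], 0.0, None))
        for i, j in must_link:                # pull constrained pairs together
            mid = (Z[i] + Z[j]) / 2
            Z[i] += step * (mid - Z[i])
            Z[j] += step * (mid - Z[j])
        for i, j in cannot_link:              # push constrained pairs apart
            d = (Z[i] - Z[j]) / 2
            Z[i] += step * d
            Z[j] -= step * d
        # Re-forming the kernel from the shared low-rank embedding spreads the
        # effect of each constraint to all points close to it in that space.
        return Z @ Z.T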