Scalable Multiclass High-Dimensional Linear Discriminant Analysis via the Randomized Kaczmarz Method
Jocelyn Chi - University of Minnesota. Host: Xiang Ji
Gibson Hall 126A, 3:00 PM
Fisher's linear discriminant analysis (LDA) is a foundational dimension-reduction method for classification that has been useful in a wide range of applications. The goal is to identify an optimal subspace onto which to project the observations, one that simultaneously maximizes between-group variation and minimizes within-group differences. The solution is straightforward when the number of observations exceeds the number of features, but difficulties arise in the high-dimensional setting, where there are more features than observations. Many works have proposed solutions for the high-dimensional setting, and these frequently involve additional assumptions or tuning parameters. We propose a fast and simple iterative algorithm for high-dimensional multiclass LDA on large data that is free from these additional requirements and comes with some guarantees. We demonstrate our algorithm on real data and highlight some results.
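As background for the title, the randomized Kaczmarz method is an iterative solver for linear systems that touches one row per step. A minimal sketch of the classical iteration with Strohmer-Vershynin row sampling for a consistent system Ax = b follows; this illustrates the building block only, not the speaker's LDA algorithm, whose details are not given in the abstract:

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=4000, seed=0):
    """Solve a consistent system Ax = b by randomized Kaczmarz.

    Each step samples a row i with probability proportional to ||a_i||^2
    and projects the iterate onto the hyperplane {x : <a_i, x> = b_i}.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms = np.sum(A**2, axis=1)          # squared row norms ||a_i||^2
    probs = row_norms / row_norms.sum()        # sampling distribution
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        # Orthogonal projection onto the i-th hyperplane
        x = x + (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x
```

Because each update reads only one row of A, the per-iteration cost is O(n) regardless of the number of rows, which is what makes Kaczmarz-style methods attractive for large-scale problems like the one in the talk.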
December 5 (Friday) - Applied and Computational Mathematics