Education
Course Highlight
Machine Learning
STATS 306B: Unsupervised learning
(topics)
Unsupervised vs supervised learning
Clustering methods
K-means clustering, K-medoids clustering; choosing the number of clusters. The gap statistic. Silhouette statistic. Prediction strength
Agglomerative hierarchical clustering. Application to DNA microarrays
Vector quantization, tree-structured VQ
Hybrid clustering
Gaussian mixtures; the EM algorithm. Model-based clustering
Unsupervised problem cast as a supervised problem
Self-organizing maps
Principal components; principal surfaces
Factor analysis. Independent component analysis
Multidimensional scaling, ISOMAP, locally linear embedding
STATS 315A: Supervised learning
(topics)
Gaussian discriminant analysis
Naive Bayes
Support vector machines
Model selection and feature selection
Least angle regression and the Lasso
SVM path algorithms
Cross-validation, bootstrap
Basis expansions and regularization
Fitting curves to data
Generalized additive models
STATS 315B: Tree-based learning methods and ensemble methods
(topics)
Classification & regression trees (CART)
Multivariate adaptive regression splines (MARS)
Bagging
Boosting and additive trees (MART)
Neural networks
Prototype & near-neighbor methods
STATS 315C: Learning from matrix-valued data
(topics)
Biplots and heatmaps
ANOVA models, Rasch models, correspondence analysis
Clustering, biclustering, spectral clustering
SVD, non-negative matrix factorization, and generalizations
PageRank, TrustRank and generalizations
Prediction on graphs
Tensor methods for three-way data
Matrix resampling and downsampling
Random matrix theory and Tracy-Widom laws
Graph-based algorithms
CS 221: Artificial intelligence
CS 369M: Algorithms for modern massive data set analysis
(topics)
Randomized algorithms for matrix problems
Data analysis and machine learning uses of matrix computations
Algorithmic approaches to graph partitioning problems
Novel data-motivated matrix factorizations
Relationship to numerical, statistical, large-scale computational issues
Other coursework
EE 364A: Convex optimization
STATS 305: Intro to statistical modeling
STATS 306A: Discrete data modeling
(topics)
Discrete distributions: Bernoulli, Binomial, Poisson, Multinomial
Related continuous distributions: Beta, Dirichlet
Chi-square tests
Logistic regression
Log-linear models for contingency tables
Generalized linear models
Bradley-Terry and related models
Rasch and related models
Predicting ordered and unordered categorical values
STATS 324: Multivariate analysis
STATS 362: Monte Carlo sampling
STATS 352: Spatial statistics
Statistical theory (STATS 300A, B, C); Probability theory (STATS 310A, B, C)