\documentclass[11pt]{article} \setlength{\oddsidemargin}{0.0truein} \setlength{\evensidemargin}{0.0truein} \setlength{\textwidth}{6.5truein} \setlength{\topmargin}{0.0truein} \setlength{\textheight}{9.0truein} \setlength{\headsep}{0.0truein} \setlength{\headheight}{0.0truein} \setlength{\topskip}{10.0pt} \setlength{\parskip}{5mm} \usepackage{url} \usepackage{amsmath} \usepackage{amssymb} \pagestyle{empty} \begin{document} \begin{center} \textbf{\Large{\textsc{STANFORD UNIVERSITY}}}\\[5pt] \textbf{\Large{\textsc{DEPARTMENT OF STATISTICS}}}\\[5pt] \Large{\textsc{DEPARTMENTAL SEMINAR}} \end{center} % In the following statements, replace "Time of talk", % "Weekday", and "Date of talk". An example is provided. % If you are not sure about this, just skip this part. \begin{center} Time of talk, Weekday, Date of talk\\ 4:15 p.m., Tuesday, March 11, 2008\\ Sequoia Hall Room 200\\ (Cookies at 3:45 in 1st Floor Lounge) \end{center} % In the following statements, replace "Name of the speaker" with your % name, "Department Affiliation" with your department affiliation, and %"University Affiliation" with your university affiliation. \begin{center} \textsl{Yoram Singer} \\ Google \end{center} % In the following statements, replace "Title of the talk" % with your title of the talk. \begin{center} \subsection*{Efficient Projections Algorithms onto the $\ell_1$ Ball for Learning Sparse Representations from High Dimension Data} \end{center} % In the following statements, replace "Abstract of the talk" % with your abstract. \noindent We describe efficient algorithms for projecting a vector onto the $\ell_1$-ball. We present two projection methods. The first method performs exact projection in $O(n)$ time, where $n$ is the dimension of the space. The second method works on vectors $k$ of whose elements are perturbed outside the $\ell_1$-ball, projecting in $O(k\log(n))$ time. This setting is especially useful for online learning in sparse feature spaces, such as text categorization applications. We demonstrate the merits and effectiveness of our algorithms in numerous batch and online statistical learning tasks. We show that variants of gradient projection methods augmented with our efficient projection procedures outperform interior point methods, which are considered state-of-the-art optimization techniques. For least squares problems we show that the algorithm is competitive with a coordinate descent algorithm that was tailored to the problem. We also show that in online settings gradient updates with $\ell_1$ projections outperform the entropic descent and exponentiated gradient algorithms while obtaining models with high degrees of sparsity. To conclude, we present general analysis for online convex programming with strongly convex functions and show that all algorithms we discuss attain logarithmic regret. \end{document}