T. Hastie, R. Tibshirani and J. Friedman (2001) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer. 533+xvi pages. $79.95.

In the full springtime of a field, it is rare that some of its best gardeners take the time to give us a report on what's blooming. Statistical learning is a grafting of offshoots of artificial intelligence, now called machine learning, onto the statistical technology of classification, prediction, and forecasting. The authors aim to synthesize the efforts of these two communities with a view to predicting what types of growth are likely to survive, perhaps with a bit of weeding along the way.

This book is important to the psychometric community for many reasons. The contemporary classification literature owes much to the algorithmic approaches to data analysis that flourished here in the sixties and seventies, as well as to the data analysis movement of that period, in which some valuable lateral thinking by pioneers like Doug Carroll, Jan de Leeuw, Joseph Kruskal, Roger Shepard and Forrest Young opened up approaches not based on classical probability theory and mathematical statistics. And we may add to this the central role of classification, prediction, and what is now called data mining in practice and in research in the educational and behavioral sciences. Perhaps it's time for us to return to these issues with new enthusiasm and insights, and this book might just be what we need. The book builds in many ways on Brian Ripley's Pattern Recognition and Neural Networks (Ripley, 1996), but the unhappy omission of the Leiden group's Gifi volume (Gifi, 1990) from the bibliography suggests that the statistical learning community has something to gain, too.

The initial chapters, 1 to 6, contain an overview of statistical material on linear methods for regression and classification.
Supervised learning is defined as the use of training samples to develop promising models, followed by the assessment of their performance on validation and test data. Nearest neighbor nonparametric approaches are also presented so that these older tools can be compared in terms of the inevitable tradeoff between bias and sampling variance. Basis expansions of functions, regularization or smoothing, and kernel methods are also introduced to support the more extensive use of functional or nonparametric methods in current research.

The next two chapters are pivotal; they deal with methods for comparing model performance and assessing model dimensionality, and with important algorithms such as EM and MCMC. The authors are right, too, to stress how important model interpretability is to our client communities, something that is a plus for tree-based approaches (Breiman, Friedman, Olshen, and Stone, 1984) but a problem for local or kernel procedures.

Chapters 9 through 13 deal with the main business of the book. Additive (Hastie and Tibshirani, 1990) and tree models come first, along with bump hunting and multivariate adaptive regression splines, perhaps because the authors themselves are leading contributors in these areas. Boosting methods for enhancing tree-based classification are currently generating a lot of excitement, and there are some important insights into how boosting works, which is essentially by summing a sequence of models for residuals. Neural networks are considered next, and linked to projection pursuit models. Support vector machines, nonparametric discriminant analysis, and prototype and nearest neighbor classification methods follow. The final chapter reviews unsupervised learning methods such as cluster analysis, self-organizing maps and variants of principal components analysis, where there is no correct classification or explicit outcome variable to guide model construction.
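
For readers who would like to see the remark about boosting made concrete, the idea of "summing a sequence of models for residuals" can be sketched in a few lines. This is only an illustration under stated assumptions, not code from the book: it uses squared-error regression on a single predictor rather than classification, the weak learner is a one-split "stump", and the names fit_stump, boost, predict and the shrinkage parameter are hypothetical choices made for this sketch.

```python
from typing import List, Sequence, Tuple

# A stump is (threshold, value_if_x_le_threshold, value_if_x_gt_threshold).
Stump = Tuple[float, float, float]
# A model is (baseline prediction, shrinkage factor, list of stumps).
Model = Tuple[float, float, List[Stump]]


def fit_stump(x: Sequence[float], r: Sequence[float]) -> Stump:
    """Least-squares one-split fit of targets r on a single predictor x.

    Assumes x contains at least two distinct values.
    """
    best_sse, best_stump = None, None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0  # candidate split halfway between neighbours
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - lm) ** 2 for ri in left) + sum((ri - rm) ** 2 for ri in right)
        if best_sse is None or sse < best_sse:
            best_sse, best_stump = sse, (t, lm, rm)
    return best_stump


def predict_stump(s: Stump, xi: float) -> float:
    t, lm, rm = s
    return lm if xi <= t else rm


def boost(x: Sequence[float], y: Sequence[float],
          n_rounds: int = 20, shrinkage: float = 0.5) -> Model:
    """Each round fits a stump to the current residuals and adds it to the sum."""
    base = sum(y) / len(y)  # start from the mean of the outcome
    fitted = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        s = fit_stump(x, residuals)
        stumps.append(s)
        fitted = [fi + shrinkage * predict_stump(s, xi) for fi, xi in zip(fitted, x)]
    return base, shrinkage, stumps


def predict(model: Model, xi: float) -> float:
    base, shrinkage, stumps = model
    return base + shrinkage * sum(predict_stump(s, xi) for s in stumps)


def mse(model: Model, x: Sequence[float], y: Sequence[float]) -> float:
    return sum((yi - predict(model, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
```

On a toy training set such as x = 0, 1, ..., 9 with y = x squared, the training error of the summed model falls as rounds are added. That is training error only; as the review notes above, performance should be judged on validation and test data.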

Well-chosen data sets of a serious size and clear applied significance, drawn from problems such as character recognition, prostate cancer forecasting, spam detection and microarray analysis, are used in illustrations. The book pioneers the use of color graphics in textbook publishing, and some of the displays are stunning. These authors write well, too.

What do we need to know to profit from this book? A fair amount, in my estimation. It is a must for those already working in some of the areas mentioned above, but some previous exposure to concepts such as trees and additive models is also nearly essential, since topics are often mentioned and used in earlier chapters well before they are defined and taken up in detail. This rather casual attitude towards organization is apt to make the book a tough read for total newcomers and for students. These are not complaints, however, since the immediacy of the treatment and the excitement that comes with timeliness more than compensate for the sacrifice of the polish and exposition that we expect in texts on more mature areas. This is a landmark volume, and this reviewer rates it as a Best Buy.

References

Breiman, L., Friedman, J., Olshen, R. and Stone, C. (1984) Classification and Regression Trees. Belmont, Calif.: Wadsworth.

Gifi, A. (1990) Nonlinear Multivariate Analysis. New York: Wiley.

Hastie, T. and Tibshirani, R. (1990) Generalized Additive Models. London: Chapman and Hall.

Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge: Cambridge University Press.