Faculty Reflections
Tom Cover


Probability theory has always been sexy and exciting.
The excitement comes from the intangibility of the subject of
probability itself. There seems to be nothing physical there. On the
other hand, some very concrete and deterministic theorems come out of
probability theory.
For example, the average of a large number of independent,
identically distributed random variables has a deterministic limit
equal to the expectation. This is the strong law of large numbers.
There are some other deterministic gems in probability theory--the
central limit theorem, the law of the iterated logarithm and the
ergodic theorem. Can you imagine the excitement of seeing a formula
like

coming out of thin air?
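The strong law of large numbers mentioned above can be watched numerically. The following is a minimal sketch, assuming rolls of a fair six-sided die (expectation 3.5) as the random variables; the die, the seed, and the name `running_means` are illustrative choices, not taken from the essay.

```python
import random

def running_means(n_draws, seed=0):
    """Return the running averages of n_draws rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = []
    for n in range(1, n_draws + 1):
        total += rng.randint(1, 6)  # one roll, uniform on 1..6
        means.append(total / n)     # average of the first n rolls
    return means

means = running_means(100_000)
# The running average should settle near the expectation 3.5.
for n in (10, 1_000, 100_000):
    print(n, round(means[n - 1], 3))
```

The early averages wander, while the later ones stay close to 3.5, which is the deterministic limit the theorem promises.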
I was led to probability by my interest in gambling and poker. As a
graduate student, I worked out the optimal raising and calling
strategy in simple forms of poker. I also worked out the optimal
doubling strategy in backgammon. Having these aces up my sleeve gave
me the confidence to play as though I were the best. And confidence
leads to good play. I was quite successful in poker as a graduate
student. Later I heard about Thorp's then-unpublished work on beating
blackjack by conditioning play on the distribution of the remaining
cards in the deck. Another graduate student and I made a good deal of
money following this strategy.
The power of probability comes up even at the simplest levels. Take,
for example, the Monty Hall paradox. Gold lies behind one of three
curtains. After you pick a curtain, the game host pulls aside one of
the other two curtains, one without gold behind it, and you are allowed
to switch to the remaining curtain. Surprisingly, you are always better off switching. This
illustrates the principle of restricted choice. Also, Bertrand's
paradox and a number of bar bets show the counterintuitive nature of
probabilistic results even at the most primitive level. It is hard to
believe that a subject which becomes counterintuitive so early will
fail to be interesting.
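The Monty Hall claim above can be checked by exact enumeration. This is a sketch under the usual assumptions, which the essay does not spell out: the gold is equally likely to be behind each curtain, the first pick is independent of it, and the host chooses at random when two curtains without gold are available. The function name `win_probabilities` is illustrative.

```python
from fractions import Fraction
from itertools import product

CURTAINS = (0, 1, 2)

def win_probabilities():
    """Return (P(win by staying), P(win by switching)) as exact fractions."""
    stay = Fraction(0)
    switch = Fraction(0)
    for gold, pick in product(CURTAINS, repeat=2):
        weight = Fraction(1, 9)  # gold and first pick uniform and independent
        # The host may only open a curtain that is neither picked nor golden.
        openable = [c for c in CURTAINS if c != pick and c != gold]
        for opened in openable:
            branch = weight / len(openable)  # host chooses uniformly
            # Exactly one curtain is left to switch to.
            (other,) = [c for c in CURTAINS if c not in (pick, opened)]
            if pick == gold:
                stay += branch
            if other == gold:
                switch += branch
    return stay, switch

stay, switch = win_probabilities()
print(stay, switch)  # prints: 1/3 2/3
```

Switching wins exactly when the first pick was wrong, which happens two times out of three.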
What I'm working on now is mostly in the realm of information theory.
That's another intangible concept. What is the information in an
English sentence? Or in 100 flips of a bent coin? Everyone agrees
that the quantity known as the Shannon entropy is the number of bits
of randomness in these various examples. I have spent the last ten or
fifteen years trying to find out what the subject of information
really is. To do this, I'm trying to identify the extreme points of
the theory and unify them. In the process one can say much about some
of the old questions like the second law of thermodynamics. Why does
entropy increase and why does there seem to be an arrow of time?
Apparently, it is not true that entropy increases for every Markov
chain. Nonetheless, in the physical world the second law of
thermodynamics says that entropy always increases. Is our Markovian
universe then necessarily restricted to a certain family of what we
might call ``physical'' Markov processes? I think not. I think something
else is going on. But what?
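Two of the statements above are easy to make concrete. The sketch below first computes the Shannon entropy of 100 flips of a bent coin, assuming independent flips and a hypothetical bias of 0.9 toward heads. It then shows a two-state Markov chain whose entropy goes down in one step. The transition matrix, the starting distribution, and the function names are all illustrative assumptions.

```python
import math

def entropy_bits(dist):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def step(dist, transition):
    """Push a distribution one step through a Markov transition matrix."""
    n = len(dist)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]

# Bent coin: hypothetical probability of heads, flips assumed independent.
p_heads = 0.9
bits_100 = 100 * entropy_bits([p_heads, 1 - p_heads])
print(round(bits_100, 1))  # about 46.9 bits, well under 100

# A chain that drifts toward state 0 from either state (made-up numbers).
transition = [[0.9, 0.1],
              [0.8, 0.2]]
before = [0.5, 0.5]                # uniform start: 1 bit of entropy
after = step(before, transition)   # [0.85, 0.15]
print(entropy_bits(before), entropy_bits(after))
```

Starting from the uniform distribution, this chain moves toward a lopsided one, so its entropy falls. Chains whose stationary distribution is uniform do not behave this way.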
Last year I had a chance with Professor Keller in the math department
to create an undergraduate course called ``Mathematics and Sports.''
Here we tried to identify new strategies and new diagnostic statistics
for all of the existing sports. Since sports statistics is one of my
hobbies, I thought this would be an easy job. It wasn't.
Nonetheless, I anticipate teaching a course like this again in two or
three years.
Many years ago Herbert Robbins, one of the great creative minds in the
field of mathematical statistics, was able to show that one can do
better solving several independent statistical problems together
rather than separately. This is a crazy idea because the problems are
independent. Nonetheless, he was right. We tried to apply the same
point of view to portfolio theory to argue that in the presence of a
continuum of clever portfolio investment strategies, you can do as
well as if you had known ahead of time which of the strategies was
best. In particular, you can asymptotically outperform the best
stock.
So the gist of all this is that there are a lot of simple and shocking
statements--some at an elementary level, some at an advanced
level--that come out of probability and statistics. It's the
existence of these as yet unfound statements that drives my interest
in the subject.
David Donoho - Statistics: The Best-Kept Secret


When I make new acquaintances and say that I am a statistician, I
sometimes observe surprise on their faces as though the existence of a
field called "statistics" is totally new to them. It is
certainly true that statistics has low visibility in the popular
press.
Actually, statisticians are too busy having fun to worry much about
publicizing themselves or what they do. In no other field can you be
(a) part mathematician, (b) part computer hacker, (c) part scientist,
and (d) part ethical conscience of the whole world.
Roles (a)-(d) (and a few others!) are so fulfilling that Public
Relations seems pretty uninteresting by comparison. As a result,
statistics is a near-invisible profession, in the popular mind, and
perhaps also in the undergraduate mind.
This trend is only going to intensify. Computerization,
telecommunications, terabyte-capacity personal computer data storage,
refinements of scientific instruments: these are massive forces in the
world at large which are creating massive new databases, and new
types of data analysis problems. Soon statisticians will be so much in
demand that just deciding which projects are most interesting will
take up a lot of their time. Already, a statistician has difficult
choices: should he/she analyze satellite remote sensing data about
rain-forest depletion? or time series of global warming? or decode the
human genome? or make sharp, informative real-time images of the
beating heart? or make seismic images of the Earth with a view to
understanding earthquakes and volcanoes? At some point, an essential
singularity is going to occur: statisticians' work time will get
completely swallowed up by all the interesting projects, no one will
have time to do even minimal PR, and the profession will disappear
from public view entirely.
Of course, "disappearing from public view" doesn't mean
that our profession has no chance for glory and prestige. My
undergraduate thesis adviser, John Tukey, received the Presidential
Medal of Science; but I don't think of him as having engaged in PR.
Our relative lack of visibility signifies, to me, that our work
doesn't fit in with the pre-packaged, short attention span of the
popular and political "culture" of the United States of
today. The problems we are working on are a bit too complex to
generate good "sound bites" for TV. When I was an
undergraduate I found the TV "culture" of American society
unsatisfying, and statistics became attractive because of its
opposition to the laxity of thought encouraged by the TV
"culture."
Our concerns, though they change with the changing demands of
science and medicine, are at some level durable. They have too much
integrity to align with prevailing fashion. When I was choosing a
career, some of my classmates thought that by studying the Law they'd
find careers cleaning up governmental problems like Watergate;
instead, some of them ended up doing the paperwork behind junk bond
offerings. My first job as a statistician involved work in oil
exploration; that whole industry collapsed in 1984. Happily, it turns
out that the work I was doing had applications in communications and
in extragalactic astronomy as well. Moreover, some of the
mathematical questions I started thinking about because of things I
had learned in oil exploration prompted me to solve some problems in
what is basically pure mathematics: inequalities in Fourier
Analysis. Industries can boom and bust, but statistics will always be
there, and statistical work will have applications beyond the industry
or scientific field it was developed for.
Because statistics is relatively hidden from public view,
statisticians feel close to one another in a way I don't observe in
other professions. Statisticians really aren't competing against one
another, because they address the problems posed by other disciplines,
and there are so many problems to work on that there's plenty of room
for everyone. As a result, statisticians are genuinely friendly and
welcoming people. I have friends in France, Israel, China, Australia,
and Germany, all of whom are statisticians.
I focused here on why you might choose statistics as a career
over, say, Mathematics, Computer Science, or some cognate field. I
believe that if you like roles (a)-(d) mentioned in paragraph two of
this article, you'll prosper in statistics in a way you wouldn't
prosper in those other fields. I haven't said much about what I do
these days for research, but if you visit Sequoia Hall and come talk
to me, I'll be happy to chat with you about that.
Brad Efron


From the time I was a little boy until my senior year in college I
wanted to be a mathematician. Then I learned that I really wanted to
be a 19th century mathematician, the kind who does a little theory, a
lot of computation, and some consulting with real scientists. The
field of statistics has allowed me to do all three things, in whatever
proportions I desired.
Here is an example of three faces of statistics, done in the early
1940's. The naturalist Corbet had spent two years trapping
butterflies in Malaya. At the end of that time he constructed a table
to show how many times he had trapped the various butterfly species.
(See Table 1.) For example,
118 species were so rare that Corbet had trapped only one specimen of
each, 74 species had been trapped twice each, etc.
Table 1: Corbet's data on how often species of butterflies were trapped
| Frequency | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
| Species | 118 | 74 | 44 | 24 | 29 | 22 | 20 | 19 | 20 | 15 | 12 | 14 | 6 | 12 | 6 |
Corbet returned to England with his table, and asked
R. A. Fisher, the greatest of all statisticians, how many new
species he would see if he returned to Malaya for another two years of
trapping. This question seems impossible to answer, since it refers
to a column of Corbet's table that doesn't exist, the ``0'' column.
Fisher provided an interesting answer to the question, which was later
improved on by I. J. Good and Alan Turing, of Turing machine fame.
The Fisher-Good-Turing answer was this: you can expect to trap
$118 - 74 + 44 - 24 + \cdots - 12 + 6 = 75$ new species, plus or minus
a standard error, in two years of additional trapping.
[If you know something about the Poisson distribution you can derive
this formula. Here are some hints.
- Assume that there are $N$ species of butterflies
altogether, and that the $i$-th one will be trapped a Poisson
number of times in two years, with Poisson parameter $\lambda_i$.
- Notice that the probability that species $i$ will not be
trapped in the first two years, but will be trapped in the second two
years, is $e^{-\lambda_i}(1 - e^{-\lambda_i})$.]
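The alternating sum can be checked directly against Table 1. The reasoning behind it: under the Poisson assumption in the hints, a species with rate $\lambda$ is missed in the first two years and caught in the second two with probability $e^{-\lambda}(1 - e^{-\lambda})$. Expanding $1 - e^{-\lambda}$ as a power series turns this into P(trapped once) - P(trapped twice) + P(trapped three times) - ..., and summing over species gives the alternating sum of the table counts.

The sketch below computes that sum. It also computes one Poisson-based accuracy figure, the square root of the total of the counts. That figure is an assumption made here for illustration; the standard error value the essay originally quoted is not recoverable from the text. The function names are hypothetical.

```python
import math

# Table 1: species_counts[x - 1] is the number of species trapped exactly x times.
species_counts = [118, 74, 44, 24, 29, 22, 20, 19, 20, 15, 12, 14, 6, 12, 6]

def expected_new_species(counts):
    """Alternating sum n1 - n2 + n3 - ... of the frequency counts."""
    return sum(n if x % 2 == 1 else -n for x, n in enumerate(counts, start=1))

def poisson_standard_error(counts):
    """Assumed accuracy measure: if the counts are independent Poisson
    variables, the variance of the alternating sum is estimated by their total."""
    return math.sqrt(sum(counts))

estimate = expected_new_species(species_counts)
accuracy = poisson_standard_error(species_counts)
print(estimate, round(accuracy, 1))  # prints: 75 20.9
```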

The ``plus or minus'' quantity attached to the estimate 75 is a
``standard error,'' an estimate of
accuracy for the number 75. My main interest in the past several
years has been in developing computer-based methods for obtaining
standard errors (and other measures of accuracy) in very complicated
situations. Classically, standard errors are calculated from formulas that
quickly grow more intricate and less useful as the estimator of interest
gets less like a simple mean. The bootstrap uses the computer to give
a numerical standard error without any formula at all.
Table 2: Data on blood cholesterol decrease
| Man | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Decrease | 13.75 | 39.5 | -21.0 | 56.75 | 10.75 | 3.25 | 80.0 | 41.75 | 32.5 |
Here is a simple but real example.
Nine men who were participating in a medical experiment recorded the
following decreases in their blood cholesterol levels. (See
Table 2.) These numbers have
mean $28.58 \pm 10.13$, where
the standard error 10.13 is calculated from the time-honored formula
$\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n(n-1)) \right]^{1/2}$,
with $x_1, \ldots, x_n$ the $n = 9$ decreases and $\bar{x}$ their mean. But
what if we are interested in an estimate other than the mean, for
example the median, equal to 32.50 for this data set? There is no
standard error formula for the median. The bootstrap estimates the
standard error of the median by repeatedly drawing ``bootstrap
samples'' from the original data, reevaluating the median for each
bootstrap sample, and estimating the standard error of the original
median by the observed variability in the bootstrap medians. A
bootstrap sample is a sample of the original size, 9 in this case,
drawn randomly but with replacement from the original data set. For
the cholesterol data, 400 bootstrap samples yielded 400 bootstrap
medians, giving a standard error estimate of 13.59. Notice that the median seems to be a worse
estimate than the mean in this case, in the sense of having a bigger
standard error.
Try this bootstrap calculation yourself!
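The bootstrap calculation described above can be sketched in a few lines. This is a minimal version using only the Python standard library, with the nine decreases from Table 2 and 400 bootstrap samples as in the text. The seed and the function names are illustrative assumptions. Because the bootstrap samples are random, the result will be in the neighborhood of the 13.59 quoted above rather than equal to it.

```python
import random
import statistics

# Table 2: decreases in blood cholesterol for the nine men.
decreases = [13.75, 39.5, -21.0, 56.75, 10.75, 3.25, 80.0, 41.75, 32.5]

def formula_se_of_mean(data):
    """Classical standard error of the mean: sample sd divided by sqrt(n)."""
    return statistics.stdev(data) / len(data) ** 0.5

def bootstrap_se(data, estimator, n_boot=400, seed=0):
    """Bootstrap standard error of `estimator` applied to `data`."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(data)
    # Each bootstrap sample has the original size, drawn with replacement.
    replicates = [estimator(rng.choices(data, k=n)) for _ in range(n_boot)]
    # Variability of the recomputed estimates is the standard error estimate.
    return statistics.stdev(replicates)

se_mean_formula = formula_se_of_mean(decreases)
se_mean_boot = bootstrap_se(decreases, statistics.mean)
se_median_boot = bootstrap_se(decreases, statistics.median)

print(round(statistics.mean(decreases), 2), round(se_mean_formula, 2))
print(round(se_mean_boot, 2), round(se_median_boot, 2))
```

Passing a different `estimator`, such as a trimmed mean, changes nothing else in the procedure. That is the point of the method.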
A great deal of theoretical work, by many statisticians, has gone into
showing that the bootstrap algorithm works. In situations where there
exists a formula, like the case of the mean, the algorithm
produces nearly the same number as the formula. Whether or not a
formula exists, the algorithm produces a number that has excellent
theoretical properties as an accuracy estimate. Finally, and most
importantly, the bootstrap's good theoretical properties carry over
into real statistical practice.
My current research interests center on computer-based statistical
methodology. I am trying to find computer algorithms that automate
the often intricate methods of classical statistical inference. The
goal is both a more flexible theory, and a better understanding of the
classical methods.
Confessions of a Scientific Dilettante (name withheld on request)


I can remember the difficulty I had choosing an undergraduate
major and the absolute horror of selecting a single subject to study
in graduate school. Although I still have occasional misgivings
about the inevitable degree of specialization demanded of contemporary
scientists, I have had very few serious regrets about my choice of a
career as an academic statistician. The principal reason is that I
can still dabble in a number of scientific subjects at varying degrees
of seriousness and spend a substantial part of my time thinking
about interesting mathematical problems. My work is largely
theoretical, and therefore I spend more effort thinking about
mathematics and computing than about applied science. Yet I
have spent a great deal of time in recent years pondering the proper
formulation of problems of sequential clinical trials and of
change-point problems in quality control and epidemiology. I also
like to watch from a distance the progress of stochastic models
in finance, change-point-like problems in genetics, and some
stochastic models in physics and physical chemistry. Statistical
consulting presents another opportunity for intellectual variety and
perhaps financial remuneration, which I used to find necessary each
year at income tax time. The variety of problems is enormous,
and for each there are different approaches to a solution. The only
limits are my own speed at absorbing new ideas (usually not as fast as
I would like) and the twenty-four hours in each day.
An area of recurring interest for me is Sequential Analysis. The
foundation of this subject was laid by Abraham Wald during World War
II, when it was applied by the U.S. Military to problems of sampling
inspection. Today the applications motivating the greatest activity
are clinical trials involving human subjects, where ethical
considerations require careful monitoring of data to ensure that
information about treatment effectiveness or unfavorable side effects
is discovered as soon as is reasonably possible. A subject which is
quite different conceptually but involves similar mathematical ideas
is Stochastic Control Theory, which may be applied to guidance systems
for ships or rockets or to the analysis of investment portfolios.
Of the problems I have worked on seriously, I believe the class of
Change-Point problems offers some of the greatest variety and
challenges. Originally these problems arose in quality control, where
one observes the output of a manufacturing process the quality of
which is subject to random variability and can be "in
control" or "out of control." The state of the process
cannot be observed directly, but must be inferred from observations on
the quality of the output of the process. A change-point detection
scheme is one which observes the output sequentially and occasionally
signals that a process is out of control. The procedure must be
constrained to signal very rarely when the process is not out of
control, and subject to this constraint to detect as quickly as
possible a true lack of control. Similar problems involving
sequential detection of change-points arise in monitoring public
health records for the onset of an epidemic or an increase in the rate
of occurrence of congenital abnormalities. Change-point problems also
arise in retrospective analysis of a variety of processes evolving
over time, e.g., in econometric time series. They lead naturally to
novel probability problems involving distribution of maxima of random
processes and random fields; and they involve interesting questions
close to the foundations of statistical inference: questions of
ancillarity and of the relations between Bayesian and likelihood
methods.
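A sequential detection scheme of the kind described above can be illustrated with a one-sided CUSUM rule. The essay does not name a specific procedure, so CUSUM is an assumption made here for illustration. The data, the target mean, the slack, and the threshold are all made-up values.

```python
def cusum_alarm_time(observations, target_mean, slack, threshold):
    """Return the 1-based time of the first alarm, or None if none is raised.

    The statistic accumulates evidence of an upward shift in the mean and
    resets to zero whenever the evidence turns negative.
    """
    statistic = 0.0
    for t, x in enumerate(observations, start=1):
        statistic = max(0.0, statistic + (x - target_mean - slack))
        if statistic > threshold:
            return t  # signal that the process looks out of control
    return None

# Hypothetical quality measurements: in control near 0, then shifted near 1.
in_control = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.0]
out_of_control = [1.1, 0.9, 1.2, 1.0, 0.8, 1.1]

alarm = cusum_alarm_time(in_control + out_of_control,
                         target_mean=0.0, slack=0.5, threshold=2.0)
false_alarm = cusum_alarm_time(in_control,
                               target_mean=0.0, slack=0.5, threshold=2.0)
print(alarm, false_alarm)  # prints: 12 None
```

Raising `threshold` makes false signals rarer at the cost of slower detection, which is the trade-off the paragraph above describes.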
I am also interested in Probability Theory, especially random
walk, renewal theory, and Brownian motion, which play an important
role in both sequential analysis and change-point problems. I find the
probability theory arising in the context of these statistical
problems to be particularly interesting, and on numerous occasions
have succumbed to the fascination of probability theory without regard
to its relation to statistics.