John Tukey (1915-2000)

John Tukey was an only child, born and raised in New Bedford, Massachusetts. His parents were both high school teachers, and he was mostly taught at home. They focused his education on encouraging curiosity and helping him find his own answers to his questions. Tukey kept this habit of searching for answers throughout his life, and it served him well in the developments he made in statistics.

Tukey studied chemistry at Brown University for both his undergraduate and master's degrees. He then moved to Princeton in 1937 to continue his studies in chemistry, but later changed to mathematics. He graduated with a doctorate in pure mathematics in 1939. A Princeton faculty member named Sam Wilks encouraged him and other graduate students and young faculty to explore mathematical statistics. Tukey wrote his first paper in mathematical statistics in 1938 and wrote solely about mathematical statistics starting in 1944. In the 1960s, he began working on the question of how to determine whether two variables are associated and whether their association shows causation.

Tukey popularized exploratory data analysis, the practice of looking at data summaries, especially graphical summaries, before conducting any inference. It is used to verify the assumptions underlying an analysis and to identify unexpected values that may affect procedures. Exploratory data analysis also helps researchers draw quick, informal conclusions. Tukey also created versions of graphics that would be less misleading than commonly used statistical graphics. For example, he proposed adjusting histograms to plot the square root of frequency, and he created the box-and-whisker plot and popularized a version of the stem-and-leaf plot.
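As a small illustration of the box-and-whisker idea, the sketch below computes the numbers such a plot displays. It rests on assumptions that are not in the text above: the function name `boxplot_summary` and the sample data are made up, the quartiles come from Python's `statistics.quantiles` with the "inclusive" method (one of several conventions, and not necessarily the hinges Tukey used), and outliers are flagged with the common 1.5 x IQR fence rule.

```python
import statistics


def boxplot_summary(values, k=1.5):
    """Return the numbers a box-and-whisker plot displays (illustrative sketch)."""
    data = sorted(values)
    # Quartile conventions differ between textbooks; "inclusive" is one choice.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    # Whiskers extend to the most extreme observations still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical data: eight ordinary values and one unexpected value.
sample = [2, 3, 4, 5, 6, 7, 8, 9, 40]
summary = boxplot_summary(sample)
print(summary)
```

In this made-up sample the value 40 falls outside the upper fence, so it would be drawn as a separate point rather than stretching the whisker, which is the kind of unexpected value exploratory data analysis is meant to surface.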

Tukey and others at Princeton developed a test for robustness, that is, how well a statistical test works if its conditions are not met perfectly. His work on robustness helped researchers know how far from ideal conditions their research could be before losing meaning. His conclusions about robustness relied on Monte Carlo simulation using computers. Likely because of this work with computers, he also coined the term "bit" for binary digit and the term "software," derived from the prevalent term "hardware."
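The idea of checking robustness by Monte Carlo simulation can be sketched in a few lines. This is not a reconstruction of the Princeton work. It is a minimal example under stated assumptions: it examines a one-sample t-test with samples of size 10, uses a tabulated critical value (2.262 for 9 degrees of freedom at the two-sided 5% level), and compares normal data, where the test's conditions hold, with exponential data, where they do not. The function name `rejection_rate` and all parameter values are hypothetical.

```python
import math
import random

# Assumed constant: two-sided 5% critical value of Student's t with 9 degrees of freedom.
T_CRIT_DF9 = 2.262


def rejection_rate(draw, true_mean, n=10, reps=20000, seed=0):
    """Estimate how often a nominal 5% t-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rejections = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        mean = sum(sample) / n
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        t = (mean - true_mean) / math.sqrt(variance / n)
        if abs(t) > T_CRIT_DF9:
            rejections += 1
    return rejections / reps


# Conditions met: normal data with true mean 0.
normal_rate = rejection_rate(lambda rng: rng.gauss(0.0, 1.0), true_mean=0.0)
# Conditions violated: strongly skewed exponential data with true mean 1.
skewed_rate = rejection_rate(lambda rng: rng.expovariate(1.0), true_mean=1.0)

print(f"normal data:      {normal_rate:.3f}")
print(f"exponential data: {skewed_rate:.3f}")
```

If the test were perfectly robust, both rates would sit near the nominal 0.05. With normal data the estimate should land close to that value, while the skewed data should push the false-rejection rate noticeably higher, which is the sort of gap a robustness study quantifies.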

Another of Tukey's major contributions to statistics is a method of pairwise comparisons to follow an ANOVA test in which the null hypothesis is rejected. He knew that a small p-value in the ANOVA test was evidence that the means of the different levels of a categorical variable were not all the same, but that it did not show which levels had different means. He constructed the pairwise comparison test to identify which levels have means that differ.
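A rough sketch of how such pairwise comparisons can be carried out is given below, in the form usually taught as Tukey's honestly significant difference procedure. The assumptions are labeled here and in the comments: the groups are equal in size, the data are made up, the function name `tukey_pairwise` is hypothetical, and the critical value 3.77 is the tabulated 5% studentized-range value for 3 groups and 12 error degrees of freedom rather than something the code computes.

```python
import math
from itertools import combinations

# Assumed constant: tabulated 5% studentized-range critical value
# for k = 3 groups and 12 error degrees of freedom.
Q_CRIT = 3.77


def tukey_pairwise(groups, q_crit):
    """Compare every pair of group means (equal group sizes assumed)."""
    sizes = {len(values) for values in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch only handles equal group sizes")
    n = sizes.pop()
    k = len(groups)
    means = {name: sum(values) / n for name, values in groups.items()}
    # Pooled within-group variance (the ANOVA mean square error).
    sse = sum((x - means[name]) ** 2 for name, values in groups.items() for x in values)
    mse = sse / (n * k - k)
    standard_error = math.sqrt(mse / n)
    results = {}
    for a, b in combinations(groups, 2):
        q = abs(means[a] - means[b]) / standard_error
        results[(a, b)] = (q, q > q_crit)  # (statistic, flagged as different?)
    return results


# Hypothetical measurements for three levels of a categorical variable.
data = {
    "A": [10, 11, 9, 10, 10],
    "B": [10, 12, 11, 10, 12],
    "C": [15, 16, 14, 15, 15],
}
comparisons = tukey_pairwise(data, Q_CRIT)
for pair, (q, differs) in comparisons.items():
    print(pair, round(q, 2), "different" if differs else "not shown to differ")
```

With these made-up numbers, level C is flagged as different from both A and B, while A and B are not distinguished from each other. That is the extra detail an overall ANOVA p-value does not provide.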

Since Tukey lived relatively recently, we have more insight into his personality than into those of some other famous contributors. He liked to challenge commonly accepted ideas. For example, he proposed a different style of tally marks: the first four counts are dots placed at the corners of a square, the next four are lines connecting the dots to make the square, and the last two are the diagonals, completing a set of 10. He argued that this adjustment would enable someone to catch mistakes immediately. As a lecturer, he was comfortable with silence, waiting for the audience to participate before continuing (see "First Impressions" in McCullagh, 2003, pp. 541-542).

Figure: Tally marks proposed by Tukey

He said, "It is better to have an approximate answer to the right question than an exact answer to the wrong one" (Salsburg, 2001, p. 231).