Karl Pearson (1857-1936)

As a young man, Karl Pearson studied many subjects, including astronomy, meteorology, physics, mechanics, and biology, as well as German literature and history. He became a socialist and rejected Christianity in his twenties. His first choice of college was Trinity College at Cambridge, but he failed the entrance exam. He was accepted at his second choice, King's College, and was offered a fellowship as he began his studies there. After completing his degree, he left England to study political science in Germany in the 1870s. Because of his admiration for Karl Marx, Pearson changed the spelling of his first name from Carl to Karl. He contributed to discussions of socialism, and Vladimir Lenin wrote highly of him. By the time he returned to England, he had written two books on political science. He created an unchaperoned Young Men's and Women's Discussion Club in which everyone had an equal voice, an extremely unusual idea for the time.

He began teaching at University College London (UCL) and then served for three years as a Gresham professor while still teaching at UCL. He delivered 12 lectures per year, which were open to the public, and he published his first year of lectures as a book called The Grammar of Science. This book has since been translated into many languages and can be understood by readers with no training in statistics. During these lectures, he discussed the histogram, coining the term to reflect its use as a time-diagram. He also discussed what the word cause means, which was important for later discussions about what conclusions can appropriately be drawn from statistical inference.

Before his time, many scientists believed that all natural data were normally distributed. Pearson recognized that other distributions often described data better. He developed a new family of distributions, called the skew distributions. Each member was specified by four parameters: the mean, standard deviation, skewness (a measure of asymmetry), and kurtosis (a measure of how far rare measurements generally are from the mean). He incorrectly believed that this family of skew distributions described all possible probability distributions and that, given enough data, you could find the true parameters of the distribution, not just estimates. He argued that the normal distribution was the member of this family whose skewness and excess kurtosis (kurtosis measured relative to the normal curve) were both $0$.
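
The symmetry and kurtosis parameters described above correspond to what are now usually called skewness and excess kurtosis, and both can be computed from the central moments of a dataset. The following is a minimal sketch, not Pearson's own procedure: the function name `moment_summary` and the two small datasets are made up for illustration, and the divide-by-n (population) form of the moments is an assumed convention.

```python
import math


def moment_summary(data):
    """Return (mean, standard deviation, skewness, excess kurtosis).

    Assumption: population (divide-by-n) moments are used throughout.
    """
    n = len(data)
    mean = sum(data) / n
    # Central moments of order 2, 3, and 4.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    sd = math.sqrt(m2)
    # Skewness is 0 for perfectly symmetric data.
    skewness = m3 / m2 ** 1.5
    # Subtracting 3 makes the normal distribution's value equal to 0.
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, sd, skewness, excess_kurtosis


# Hypothetical datasets, chosen only to contrast symmetric and skewed data.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(moment_summary(symmetric))
print(moment_summary(right_skewed))
```

The symmetric dataset gives a skewness of zero, while the second dataset, with one large value far above the rest, gives a positive skewness.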

Along with Walter Weldon (1860-1906) and Francis Galton (1822-1911), he established the journal Biometrika, which was funded by Galton. Its purpose was to collect data to prove Darwin's Theory of Evolution. Even though it would be impossible to see enough generations to observe the emergence of a new species, they hoped to at least see a change in the distribution of characteristics of a species. Although the collected data did little to prove Darwin's theory, researchers continued to submit datasets to the journal to be analyzed and published. The people who submitted data generally collected only the most accessible data, which we now know is not necessarily representative of the entire population. However, Pearson assumed that as long as there was enough data, the sample would accurately represent the entire population. By 1911, both Weldon and Galton had died, leaving Pearson the sole editor of the journal, and as such, he allowed only what he chose to be published, including much of his own work. This fed his dispute with R.A. Fisher (1890-1962). After Pearson died, his son Egon Pearson took over as editor, and the focus of the journal shifted to theoretical mathematics, where it remains today.

In his lifetime, Pearson founded or took over four laboratories: the Drapers' Biometric Laboratory, funded by the Worshipful Drapers' Company; an Astronomical Laboratory, funded by the same company; the Galton Eugenics Laboratory; and the Anthropometric Laboratory. Some of his major contributions to statistics include developing the methods for finding the standard error of an estimator and the correlation coefficient (the product moment correlation coefficient) commonly used today. He also found the exact chi-square distribution and created the chi-square goodness of fit test. He later extended this to construct the chi-square test of independence. He turned down both a knighthood and the Royal Statistical Society Guy Medal, believing that all medals and honours should be given to young men, because they encourage them when they begin to doubt whether their work is of any value (Heyde & Seneta, 2001, p. 255).
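
Two of the contributions named above, the product moment correlation coefficient and the chi-square goodness of fit statistic, can be illustrated in a few lines. This is a rough sketch using the standard textbook formulas; the function names `pearson_r` and `chi_square_statistic` and the example numbers are hypothetical and chosen only for illustration. It computes the test statistic only; a full goodness of fit test would go on to compare that statistic with a chi-square distribution with the appropriate degrees of freedom.

```python
import math


def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: y is an exact multiple of x, so the correlation is 1.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))

# Hypothetical counts from 60 rolls of a die, compared with the
# 10 rolls per face expected if the die were fair.
observed_counts = [8, 12, 9, 11, 10, 10]
expected_counts = [10, 10, 10, 10, 10, 10]
print(chi_square_statistic(observed_counts, expected_counts))
```

A small statistic, as in the die example, indicates that the observed counts are close to the expected ones; larger values indicate a poorer fit.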