Francis Galton (1822-1911)

A first cousin of Charles Darwin, Francis Galton was related to many English intellectuals, because families in his social circle often intermarried. He began studying medicine but switched to mathematics, continuing until his father died in 1844. Left with a fortune and no need to work, he gave up his studies to enjoy himself, but soon found this lifestyle unfulfilling and began to travel and record his observations. He wrote about topics such as weather patterns and his belief in eminence, a vaguely defined positive characteristic that he believed was inherited. Lacking a clear definition of eminence, he set out to understand and substantiate the idea with physical evidence.

He created the Galton Biometrical Laboratory to collect human measurements such as nose length, limb lengths, and features of fingerprints. Based on these measurements, he concluded that human measurements are normally distributed, and he even claimed that personality characteristics follow the same distribution. His studies led him to support a new field called eugenics, whose central idea is to improve the human gene pool through selective breeding; he advocated allowing only select people to have children. Many of Galton's philosophical ideas have since been shown to be wrong. For example, success depends not only on inherited talent or aptitude, but also on effort and chance.

In his research, Galton often recruited families so that he could look for associations between the measurements of family members. From these family data, he recognized a phenomenon now known as regression toward the mean, exemplified by the heights of fathers and sons. He noticed that tall fathers tended to have tall sons who were not quite as tall as their fathers, and short fathers tended to have short sons who were not quite as short. In other words, an extreme observation in one generation tended to be somewhat less extreme in the next. To quantify the strength of a linear relationship between two numeric variables, Galton developed the concept of the correlation coefficient. His work was later refined by Karl Pearson.
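
Regression toward the mean can be illustrated with a small simulation. The sketch below is not based on Galton's data: the mean height (175 cm), standard deviation (7 cm), and father-son correlation (0.5) are all assumed values chosen for illustration, and the helper names `simulate_heights` and `pearson_r` are hypothetical. It computes the correlation coefficient from the simulated pairs and compares the average height of unusually tall fathers with that of their sons.

```python
import random
import statistics


def simulate_heights(n=5000, mean=175.0, sd=7.0, r=0.5, seed=0):
    """Simulate (father, son) heights in cm with population correlation r.

    Every parameter value here is an illustrative assumption.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z_father = rng.gauss(0.0, 1.0)
        # The son's standardized height shares a fraction r with the
        # father's; the rest is independent noise with matching variance.
        z_son = r * z_father + (1.0 - r * r) ** 0.5 * rng.gauss(0.0, 1.0)
        pairs.append((mean + sd * z_father, mean + sd * z_son))
    return pairs


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


pairs = simulate_heights()
fathers = [f for f, _ in pairs]
sons = [s for _, s in pairs]
r_hat = pearson_r(fathers, sons)

# Select fathers more than 7 cm (one assumed standard deviation) above average.
overall = statistics.fmean(fathers)
tall = [(f, s) for f, s in pairs if f > overall + 7.0]
tall_fathers_mean = statistics.fmean(f for f, _ in tall)
tall_sons_mean = statistics.fmean(s for _, s in tall)

print(f"estimated correlation: {r_hat:.2f}")
print(f"mean height of tall fathers: {tall_fathers_mean:.1f} cm")
print(f"mean height of their sons:   {tall_sons_mean:.1f} cm")
```

Under these assumptions, the sons of tall fathers are still taller than average on the whole, but their mean lies closer to the population mean than their fathers' does, which is the pattern described above.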

Interestingly, while the mean is now used more often than the median to summarize a single variable in a dataset, Galton advocated using the median rather than the mean, and the median absolute error rather than the standard error.
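
One practical difference between these summaries is how they react to extreme values. The sketch below uses made-up heights, and it makes two assumptions about terminology: it takes "median absolute error" to mean the median of the absolute deviations from the median, and it uses the sample standard deviation as the mean-based measure of spread. The function name `median_abs_deviation` is a hypothetical helper, not a standard-library call.

```python
import statistics


def median_abs_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)


# Illustrative heights in cm; the second list replaces 180 with an
# implausible 250 to mimic a single recording error.
heights = [170, 172, 174, 175, 176, 178, 180]
with_outlier = [170, 172, 174, 175, 176, 178, 250]

for label, data in (("clean", heights), ("with outlier", with_outlier)):
    print(
        label,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "stdev:", round(statistics.stdev(data), 1),
        "median abs. deviation:", median_abs_deviation(data),
    )
```

With these numbers, the single extreme value moves the mean from 175 to 185 and inflates the standard deviation, while the median stays at 175 and the median absolute deviation stays at 3.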