William Sealy Gosset (1876-1937)
![William Sealy Gosset](https://upload.wikimedia.org/wikipedia/commons/thumb/4/42/William_Sealy_Gosset.jpg/256px-William_Sealy_Gosset.jpg)
William Gosset graduated from Oxford at the age of 23 with a degree in chemistry and mathematics. At a time when companies had only recently begun hiring mathematicians, he was immediately hired by the Guinness Brewery. His job was to ensure consistency in beer brewing. The yeast used for fermentation was cultured in jars, and the density of yeast organisms varied from one jar of fluid to the next. Accuracy was important to ensure complete fermentation without the bitter taste that comes from too much yeast. Because yeast is a living organism, not only did the number of cells in each sample vary by chance, but the number of cells in the jar itself changed over time. Through his investigation, he found that the number of cells could be modeled using the Poisson distribution. This was a remarkable accomplishment because few naturally occurring data sets that follow the Poisson distribution had been found since the distribution itself was discovered.
The prevalent idea at the time was that a researcher needed a very large sample to get a good estimate of a population parameter. However, Gosset realized that in many cases a large sample was not practical. He analytically derived a distribution that describes the behavior of sample means when the sample size is very small. He painstakingly checked his work by drawing 750 random samples of size four from the heights and left middle finger lengths of 3000 criminals. He used a chi-square goodness-of-fit test to compare the distribution he had derived analytically with the distribution of sample means he had collected. He found that the two distributions were similar enough to conclude that the sample means followed the distribution he had identified, which came to be called the t-distribution. His calculations assumed that the data came from a normally distributed population, but later research showed that the result holds approximately for populations that depart from normality, meaning that the significance test based on the t-distribution is robust.
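The sampling check described above can be imitated in a few lines of Python. This is only a rough sketch under stated assumptions, not a reconstruction of Gosset's actual procedure: the population is synthetic (3000 normally distributed values with a hypothetical mean and standard deviation standing in for the height data), the 750 samples are formed by shuffling the population and splitting it into groups of four, the statistic is the modern t statistic rather than the quantity tabulated in the 1908 paper, and the function names and bin edges are illustrative choices.

```python
import math
import random
import statistics


def t_cdf_3df(t):
    """CDF of Student's t-distribution with 3 degrees of freedom (closed form)."""
    u = t / math.sqrt(3)
    return 0.5 + (math.atan(u) + u / (1 + u * u)) / math.pi


def simulate(pop_size=3000, n=4, seed=0):
    rng = random.Random(seed)
    # Synthetic stand-in for the height data (hypothetical mean and SD, in cm).
    population = [rng.gauss(166.0, 6.5) for _ in range(pop_size)]
    mu = statistics.fmean(population)

    # Shuffle, then split into consecutive groups of n: 3000 / 4 = 750 samples.
    rng.shuffle(population)
    t_values = []
    for i in range(0, pop_size, n):
        sample = population[i:i + n]
        se = statistics.stdev(sample) / math.sqrt(n)  # estimated standard error
        t_values.append((statistics.fmean(sample) - mu) / se)
    return t_values


def chi_square_gof(t_values, edges=(-3, -2, -1, 0, 1, 2, 3)):
    # Bin boundaries, with open-ended tails on both sides.
    bounds = [-math.inf, *edges, math.inf]
    cdf = [0.0] + [t_cdf_3df(e) for e in edges] + [1.0]
    observed, expected = [], []
    for lo, hi, p_lo, p_hi in zip(bounds, bounds[1:], cdf, cdf[1:]):
        observed.append(sum(lo < t <= hi for t in t_values))
        expected.append(len(t_values) * (p_hi - p_lo))
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, observed, expected


t_values = simulate()
chi2, observed, expected = chi_square_gof(t_values)
print(f"chi-square statistic: {chi2:.2f} on {len(observed) - 1} degrees of freedom")
```

With eight bins the statistic has 7 degrees of freedom, so a value below the 5% critical value of roughly 14.07 would give no evidence that the simulated statistics depart from the t-distribution with 3 degrees of freedom.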
Gosset wanted to publish his findings, but Guinness had a strict no-publishing policy resulting from a previous employee spilling trade secrets. Karl Pearson was eager to publish Gosset's findings in his journal, Biometrika, so they decided to publish them under the pseudonym "Student". The paper that discussed his t-distribution (often known as Student's t-distribution) was called "The Probable Error of a Mean" and was published in 1908. He later published more work under the same name. The Guinness company eventually found out about his publishing, but not for many years, and possibly not until his sudden death. They could not complain about him wasting time while getting paid, since so much of his work benefited the company and was done outside of work hours.
Gosset became a middleman between R. A. Fisher and Pearson, who had a history of not getting along. He began his friendship with Fisher while Fisher was studying at Cambridge. Fisher wrote a paper with the same findings as Gosset's 1908 work, likely found independently of Gosset, so his tutor introduced them. Gosset found a small error in Fisher's paper, and Fisher responded with two pages of complicated mathematics, including a proof using multidimensional geometry. Gosset often complained that he did not understand what Fisher wrote, but he remained lifelong friends with Fisher.