Page:Biometrika - Volume 6, Issue 1.djvu/13

Now 50 to 1 corresponds to three times the probable error in the normal curve and for most purposes would be considered significant; for this reason I have only tabled my curves for values of $$n$$ not greater than $$10$$, but have given the $$n=9$$ and $$n=10$$ tables to one further place of decimals. They can be used as foundations for finding values for larger samples.
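The correspondence between odds and multiples of the probable error can be checked numerically. The sketch below assumes the odds are read one-sidedly, that is, as the chance of falling below three probable errors against the chance of exceeding it; that reading is an assumption, and the result should be taken only as a rough agreement with the round figure of 50 to 1.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probable error: the deviation that is exceeded in absolute value
# exactly half the time, i.e. the upper quartile of the normal curve.
probable_error = std_normal.inv_cdf(0.75)

# One-sided probability of lying below three probable errors
# (assumed reading of the odds quoted in the text).
p_below = std_normal.cdf(3 * probable_error)
odds = p_below / (1 - p_below)

print(f"probable error = {probable_error:.4f} standard deviations")
print(f"odds of lying below 3 probable errors = {odds:.1f} to 1")
```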

The table for $$n=2$$ can be readily constructed by looking out $$\theta=\textrm{tan}^{-1}z$$ in Chambers’ Tables and then $$.5+\theta/\pi$$ gives the corresponding value.

Similarly $$\frac{1}{2}\textrm{sin}\theta+.5$$ gives the values when $$n=3$$.
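The two constructions above can be sketched in a few lines, with the standard arctangent routine in place of the printed tables. The function names `prob_n2` and `prob_n3` are illustrative only, and the sketch assumes that $$\theta=\textrm{tan}^{-1}z$$ is meant in the $$n=3$$ case as well.

```python
import math

def prob_n2(z):
    """Tabled value for n = 2: 0.5 + theta/pi, with theta = arctan(z)."""
    theta = math.atan(z)
    return 0.5 + theta / math.pi

def prob_n3(z):
    """Tabled value for n = 3: 0.5*sin(theta) + 0.5, with theta = arctan(z).

    Assumes theta is defined as in the n = 2 case.
    """
    theta = math.atan(z)
    return 0.5 * math.sin(theta) + 0.5

# Print a short table for a few illustrative values of z.
for z in (0.1, 0.5, 1.0, 2.0, 3.0):
    print(f"z = {z:3.1f}   n=2: {prob_n2(z):.4f}   n=3: {prob_n3(z):.4f}")
```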

There are two points of interest in the $$n=2$$ curve. Here $$s$$ is equal to half the distance between the two observations. $$\textrm{tan}^{-1}\frac{s}{s}=\frac{\pi}{4}$$, so that between $$+s$$ and $$-s$$ lies $$2\times\frac{\pi}{4}\times\frac{1}{\pi}$$ or half the probability, i.e. if two observations have been made and we have no other information, it is an even chance that the mean of the (normal) population will lie between them. On the other hand the second moment coefficient is

$$\frac{1}{\pi}\int_{-\infty}^{+\infty}\frac{z^2\,dz}{1+z^2}=\frac{1}{\pi}\Big[z-\textrm{tan}^{-1}z\Big]_{-\infty}^{+\infty}=\infty,$$

or the standard deviation is infinite while the probable error is finite.
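Both points about the $$n=2$$ curve lend themselves to a quick numerical check. The sketch below draws pairs of observations from a normal population and counts how often the population mean falls between them. It then evaluates the second moment over the range $$-A$$ to $$+A$$, assuming the density $$\frac{1}{\pi(1+z^2)}$$ implied by the $$.5+\theta/\pi$$ rule. The population parameters, trial count and seed are arbitrary choices.

```python
import math
import random

def mean_between_fraction(trials=100_000, mu=0.0, sigma=1.0, seed=1):
    """Fraction of random pairs whose two values straddle the population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = rng.gauss(mu, sigma)
        b = rng.gauss(mu, sigma)
        if min(a, b) < mu < max(a, b):
            hits += 1
    return hits / trials

def truncated_second_moment(A):
    """(1/pi) * integral of z^2/(1+z^2) from -A to +A, in closed form."""
    return (2.0 / math.pi) * (A - math.atan(A))

print("fraction of pairs straddling the mean:", mean_between_fraction())
for A in (10, 100, 1000):
    print(f"second moment truncated at +/-{A}: {truncated_second_moment(A):.1f}")
```

The truncated moment grows roughly in proportion to $$A$$, which is one way of seeing that the full integral does not converge.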

Before I had succeeded in solving my problem analytically, I had endeavoured to do so empirically. The material used was a correlation table containing the height and left middle finger measurements of 3000 criminals, from a paper by W. R. Macdonell (Biometrika, Vol. p. 219). The measurements were written out on 3000 pieces of cardboard, which were then very thoroughly shuffled and drawn at random. As each card was drawn its numbers were written down in a book which thus contains the measurements of 3000 criminals in a random order. Finally each consecutive set of 4 was taken as a sample—750 in all—and the mean, standard deviation, and correlation of each sample determined. The difference between the mean of each sample and the mean of the population was then divided by the standard deviation of the sample, giving us the $$z$$ of.
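The card-drawing procedure can be imitated in outline as follows. This is a sketch under stated assumptions, not a reconstruction of the original experiment: the 3000 values are synthetic normal draws with an invented mean and spread standing in for the criminal measurements, only one measurement is handled at a time, and the sample standard deviation is taken with divisor $$n$$ (an assumption here). The function name `sampling_experiment` is illustrative.

```python
import random
import statistics

def sampling_experiment(population, sample_size=4, seed=0):
    """Shuffle the values, split them into consecutive samples and
    return a list of (s, z) pairs, one per sample."""
    rng = random.Random(seed)
    cards = list(population)
    rng.shuffle(cards)                      # the "very thorough" shuffle
    pop_mean = statistics.fmean(cards)
    results = []
    for start in range(0, len(cards) - sample_size + 1, sample_size):
        sample = cards[start:start + sample_size]
        mean = statistics.fmean(sample)
        s = statistics.pstdev(sample)       # divisor n (assumed convention)
        if s == 0:
            continue                        # z is undefined; skipped in this sketch
        results.append((s, (mean - pop_mean) / s))
    return results

# Synthetic stand-in for one column of measurements (invented parameters).
rng = random.Random(42)
heights = [rng.gauss(166.0, 6.5) for _ in range(3000)]

results = sampling_experiment(heights)
print("number of samples:", len(results))
print("first three (s, z) pairs:", results[:3])
```

With measurements recorded in coarse groups a sample may have all four values equal, so that $$s=0$$; the sketch simply skips such samples, which is a simplification.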

This provides us with two sets of 750 standard deviations and two sets of 750 $$z$$’s on which to test the theoretical results arrived at. The height and left middle finger correlation table was chosen because the distribution of both was approximately normal and the correlation was fairly high. Both frequency curves, however, deviate slightly from normality, the constants being for height $$\beta_1=.0026$$, $$\beta_2=3.175$$, and for left middle finger lengths $$\beta_1=.0030$$, $$\beta_2=3.140$$, and in consequence there is a tendency for a certain number of larger standard deviations to occur than if the distributions were normal. This, however, appears to make very little difference to the distribution of $$z$$.
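The constants $$\beta_1$$ and $$\beta_2$$ quoted above are moment ratios: $$\beta_1=\mu_3^2/\mu_2^3$$ and $$\beta_2=\mu_4/\mu_2^2$$, which equal $$0$$ and $$3$$ for a normal curve. A minimal sketch of their computation from raw values follows. The function name is illustrative, and no grouping corrections are applied, whereas the published constants may have been computed with such corrections.

```python
def beta_coefficients(values):
    """Return (beta1, beta2) from central moments taken with divisor n."""
    n = len(values)
    mean = sum(values) / n
    mu2 = sum((x - mean) ** 2 for x in values) / n
    mu3 = sum((x - mean) ** 3 for x in values) / n
    mu4 = sum((x - mean) ** 4 for x in values) / n
    beta1 = mu3 ** 2 / mu2 ** 3   # squared skewness; 0 for a symmetric curve
    beta2 = mu4 / mu2 ** 2        # kurtosis; 3 for a normal curve
    return beta1, beta2

# Small symmetric example: beta1 is zero, beta2 is well below 3.
b1, b2 = beta_coefficients([1, 2, 3, 4, 5])
print(b1, b2)
```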