Page:Biometrika - Volume 6, Issue 1.djvu/14

Another thing which interferes with the comparison is the comparatively large groups in which the observations occur. The heights are arranged in 1 inch groups, the standard deviation being only 2.54 inches: while the finger lengths were originally grouped in millimetres, but unfortunately I did not at the time see the importance of having a smaller unit, and condensed them into two millimetre groups, in terms of which the standard deviation is 2.74.

Several curious results follow from taking samples of 4 from material disposed in such wide groups. The following points may be noticed:

(1) The means only occur as multiples of .25.

(2) The standard deviations occur as the square roots of the following types of numbers $$n$$, $$n+.19$$, $$n+.25$$, $$n+.50$$, $$n+.69$$, $$2n+.75$$.

(3) A standard deviation belonging to one of these groups can only be associated with a mean of a particular kind; thus a standard deviation of $$\sqrt{2}$$ can only occur if the mean differs by a whole number from the group we take as origin, while $$\sqrt{1.69}$$ will only occur when the mean is at $$n\pm.25$$.

(4) All the four individuals of the sample will occasionally come from the same group, giving a zero value for the standard deviation. Now this leads to an infinite value of $$z$$ and is clearly due to too wide a grouping, for although two men may have the same height when measured by inches, yet the finer the measurements the more seldom will they be identical, till finally the chance that four men will have exactly the same height is infinitely small. If we had smaller grouping, the values of the standard deviation now recorded as zero might be expected to increase, and a similar consideration will show that the smaller values of the standard deviation would also be likely to increase, such as .436, when 3 fall in one group and 1 in an adjacent group, or .50 when 2 fall in each of two adjacent groups. On the other hand, when the individuals of the sample lie far apart, the argument of Sheppard’s correction will apply, the real value of the standard deviation being more likely to be smaller than that found, owing to the frequency in any group being greater on the side nearer the mode.
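Points (1) to (3) can be checked by brute force. The following is a minimal sketch, not part of the original analysis: it enumerates every sample of 4 drawn from a small run of whole-number groups and tabulates, in exact fractions, which fractional parts of the variance occur with each kind of mean. The divisor 4 for the variance is an assumption inferred from the values quoted above, and the range of 9 groups and all names are illustrative choices.

```python
import math
from collections import defaultdict
from fractions import Fraction
from itertools import combinations_with_replacement


def sample_stats(sample):
    """Return (mean, variance) as exact fractions, dividing by n (assumed)."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    variance = Fraction(sum(x * x for x in sample), n) - mean * mean
    return mean, variance


# Illustrative range of group centres, measured in whole grouping units.
GROUPS = range(9)

# Fractional part of the mean -> set of fractional parts of the variance.
var_fracs_by_mean_frac = defaultdict(set)
# Whole-number parts that accompany a variance ending in .75.
int_parts_with_three_quarters = set()

for sample in combinations_with_replacement(GROUPS, 4):
    mean, variance = sample_stats(sample)
    var_fracs_by_mean_frac[mean % 1].add(variance % 1)
    if variance % 1 == Fraction(3, 4):
        int_parts_with_three_quarters.add(math.floor(variance))

for mean_frac in sorted(var_fracs_by_mean_frac):
    fracs = sorted(var_fracs_by_mean_frac[mean_frac])
    print(f"mean = n + {float(mean_frac):.2f}: variance fractions",
          [float(f) for f in fracs])
print("whole-number parts seen with .75:",
      sorted(int_parts_with_three_quarters))
```

In exact terms the fractions .19 and .69 are 3/16 and 11/16. A sample with three observations in one group and one in the adjacent group therefore has standard deviation $$\sqrt{.1875}\approx.433$$; the figure .436 in point (4) appears to be the square root of the rounded value .19.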

These two effects of grouping will tend to neutralise each other in their effect on the mean value of the standard deviation, but both will increase the variability.

Accordingly we find that the mean value of the standard deviation is quite close to that calculated, while in each case the variability is sensibly greater. The fit of the curve for the standard deviation is not good, both for this reason and because the frequency is not evenly distributed owing to effects (2) and (3) of grouping. On the other hand, the fit of the curve giving the frequency of $$z$$ is very good, and as that is the only practical point the comparison may be considered satisfactory.
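The effect of grouping on the standard deviations of samples of 4 can also be explored by simulation. The sketch below is only an illustration under stated assumptions: it draws from a normal population with standard deviation 2.54 (the figure given above for heights) rather than from the measured material, rounds each value to the nearest whole unit to imitate 1 inch groups, and uses an arbitrary seed and number of samples. It compares the mean and the variability of the sample standard deviation before and after grouping, and counts the samples in which all four values coincide.

```python
import math
import random

SIGMA = 2.54        # population SD in inches, taken from the text
SAMPLE_SIZE = 4
N_SAMPLES = 20_000  # assumption: chosen only to make zero SDs visible
SEED = 1908         # assumption: arbitrary seed for repeatability


def sd(sample):
    """Standard deviation of a sample with divisor n (assumed convention)."""
    m = sum(sample) / len(sample)
    return math.sqrt(sum((x - m) ** 2 for x in sample) / len(sample))


def summarize(samples):
    """Mean SD, SD of the SDs, and the number of samples with identical values."""
    sds = [sd(s) for s in samples]
    mean_sd = sum(sds) / len(sds)
    sd_of_sd = math.sqrt(sum((s - mean_sd) ** 2 for s in sds) / len(sds))
    zeros = sum(1 for s in samples if len(set(s)) == 1)
    return {"mean_sd": mean_sd, "sd_of_sd": sd_of_sd, "zeros": zeros}


rng = random.Random(SEED)
raw_samples = [[rng.gauss(0.0, SIGMA) for _ in range(SAMPLE_SIZE)]
               for _ in range(N_SAMPLES)]
# Grouping: record each value by the whole-inch group it falls in.
grouped_samples = [[round(x) for x in s] for s in raw_samples]

raw_summary = summarize(raw_samples)
grouped_summary = summarize(grouped_samples)
print("ungrouped:", raw_summary)
print("grouped:  ", grouped_summary)
```

Every sample counted under `zeros` has a standard deviation of zero and hence an infinite $$z$$; such samples are to be expected only in the grouped data.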

The following are the figures for height:—