Page:Popular Science Monthly Volume 66.djvu/379

per cent. in history; 34.9 per cent. are given a grade below 50 in mathematics and only 19.1 per cent. in English. It is obvious that such grades should be standardized. It may be remarked incidentally that it is easy to select examiners by a competitive examination. If twenty candidates grade the same sets of papers, those whose grades are nearest the average of all the grades are likely to be the most competent examiners.
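The competitive selection of examiners described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_examiners` and the use of the mean absolute deviation as the measure of nearness are choices made here, since the text does not say how nearness to the average is to be measured.

```python
from statistics import mean


def rank_examiners(grades):
    """Order candidate examiners from nearest to farthest from the consensus.

    `grades` maps each candidate's name to the grades that candidate gave
    to the same papers, listed in the same order for every candidate.
    Nearness is measured by mean absolute deviation (an assumption).
    """
    names = list(grades)
    n_papers = len(grades[names[0]])

    # The consensus grade of each paper is the average over all candidates.
    consensus = [
        mean(grades[name][i] for name in names) for i in range(n_papers)
    ]

    # Average distance of each candidate's grades from the consensus.
    deviation = {
        name: mean(abs(g - c) for g, c in zip(grades[name], consensus))
        for name in names
    }

    return sorted(names, key=lambda name: deviation[name])


# Hypothetical grades from three candidates on the same three papers.
example = {
    "a": [50, 60, 70],
    "b": [55, 65, 75],
    "c": [80, 90, 100],
}
ranking = rank_examiners(example)
```

With these made-up figures candidate "b" lies nearest the average of all the grades and would be preferred; a different measure of nearness, such as the squared deviation, could alter the order in closer cases.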

In these cases, and in all grades with which I am acquainted, there is a tendency to grade students above the average. Professor Pearson finds that in estimating the health of English boys, teachers place twice as many above 'normally healthy' as below, and he seems to regard it as gratifying that English boys should be more than normally healthy. We look on our own students as better than the average and in any case give them the benefit of the doubt. We call things 'fair' that are only average, and then the word 'fair' comes to mean average. Then we assign the grade 'fair' to students who are below the average, and a 'fair' student comes to mean a poor student. In assigning grades such words should be avoided; we should learn to think in terms of the average and probable error.

If grades are given on a centile system, the grade should mean the position of the man in his group; thus 60 should mean that in the long run it is more likely than anything else that there would be forty men better and fifty-nine not so good. The average probable error should be determined and a probable error should be attached to the grades; thus the grade 60 ± 10 means that the chances are even that there are between thirty and fifty men in the group who are better. The probable error becomes smaller as we depart from the average man; I estimate on the basis of a few experiments that it is over 10 in the middle of the scale. If this proves to be correct on the basis of more extended data, it is needless to grade more closely than on a scale of 10, though the first decimal would have some meaning when the grades are combined. If a hundred men are divided into ten groups of 10 each, the men in the middle groups will differ less from each other than those towards the ends, and if we wish to let the groups represent approximately equal ranges of merit, we can, as explained above, make five groups, A, B, C, D and F, putting 40 men in C, 20 men in each of B and D and 10 in each of A and F.
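The arithmetic of the centile grade and of the five letter groups may be made concrete with a short Python sketch. The function names `interpret_centile` and `letter_groups` are hypothetical, and the general rule that a grade g in a group of 100 leaves 100 - g men better and g - 1 not so good is an extrapolation from the single example of the grade 60 given in the text.

```python
# Percentage of the group placed in each letter grade, best first,
# following the 10-20-40-20-10 division described in the text.
SHARES = (("A", 10), ("B", 20), ("C", 40), ("D", 20), ("F", 10))


def interpret_centile(grade, probable_error, group_size=100):
    """Read a centile grade as a position in the group.

    Returns (men_better, men_not_so_good, (low, high)), where the chances
    are taken to be even that the number of men better lies between
    `low` and `high`.
    """
    better = group_size - grade
    not_so_good = grade - 1
    low = max(0, better - probable_error)
    high = min(group_size - 1, better + probable_error)
    return better, not_so_good, (low, high)


def letter_groups(ranked):
    """Split a list ranked from best to worst into the five letter groups."""
    n = len(ranked)
    groups = {}
    start = 0
    cumulative = 0
    for letter, percent in SHARES:
        cumulative += percent
        end = round(n * cumulative / 100)  # cut point for this letter
        groups[letter] = ranked[start:end]
        start = end
    return groups


# The grade 60 ± 10 in a group of one hundred men.
reading = interpret_centile(60, 10)

# One hundred men, numbered by rank from 1 (best) to 100.
groups = letter_groups(list(range(1, 101)))
sizes = {letter: len(members) for letter, members in groups.items()}
```

Here `reading` gives forty men better, fifty-nine not so good, and an even-chance range of thirty to fifty men better, while `sizes` shows 10, 20, 40, 20 and 10 men in A, B, C, D and F. The sketch applies the same probable error at every point of the scale, although the text holds that it becomes smaller away from the average.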

The determination of the validity of the grades given to college students and their standardization appear to me to be important because I regard it as desirable that students should be credited for the work they do rather than for the number of hours that they attend courses. By our present method a student who fails gets no credit at all, while a student who is nearly as bad (and perhaps worse)