
articles on science and social science. For instance, with respect to the Wikipedia article on Neuronia, Reviewer 1 commented approvingly on the presence of images ("It includes more photographs. These photographs help to understand the role of neuron"). This factor helped to distinguish two articles otherwise judged to be of broadly commensurate quality. In general, though, imagery was not frequently referenced in comparative comments, perhaps because images had been dislocated from the flow of the text.

It is worth noting that a general audience (rather than academic reviewers, who may well be more used to engaging with largely text-based material) might expect high-quality imagery from an encyclopaedia, online or otherwise, and might make overall judgments of quality based on layout and the quality of the imagery included. The methodology of this study (removing images from their original location and using academic reviewers) meant that the focus was very strongly on the words used, and so may not have fully captured this area of judgment.

The greatest difficulty in anonymising articles involved the removal of information that may be considered integral to the value of the article, such as the Wikipedia article tree, or the name of the author of some articles in other encyclopaedias. The removal of such information was clearly essential in order to achieve the goal of blind reviewing, but it could be argued that information about authorship might, to some extent, compensate for lack of references (although there is no inherent reason why named authorship need preclude the use of references).

The review process appears to have been productive and appropriate. The criteria contained within the feedback tool (whose development is described in Section 3) provided an appropriate range of distinct perspectives on articles and, for the most part, elicited rich and insightful judgments and comments from reviewers. However, in developing this tool further, we would recommend a further period of trialling of criteria, especially around concepts such as completeness, conciseness and coherence, which sometimes seemed to generate slightly contradictory comments from some reviewers.

There is, of course, a fundamental problem in trying to reconcile the provision of clear and consistent criteria so that a wide range of reviewers can be seen to be making comparable judgments, with the need (especially when it comes to asking for qualitative judgments) to capture the language and criteria that academic experts might otherwise have used, if simply asked to discuss the strengths and weaknesses of articles as they perceived them. It is certainly only through such an approach that it would be possible to carry out any systematic form of quantifiable content analysis of experts' qualitative judgments. As it was, in analysing the qualitative aspect of reviewers' judgments, analysts had to make their own judgments to some extent about whether, for instance, a reviewer talked about enjoyment of an article because that term had been put before them in the feedback tool or because they had actually enjoyed reading the article.

This said, we felt that the qualitative comments provided considerable insight into the kinds of criteria and standards for making judgments about online encyclopaedia content that different academics use, and it may have been the case that we would have had considerably more difficulty in generating the quality of thinking and comment that we did receive without the framework that was provided. It is fair to say, though, that this pilot has