Wikimedia and the new collaborative digital archives

Posted on July 25, 2011 by textmessageguest

For today’s post we are thrilled to open our blog space to NARA’s Wikipedian-in-Residence, Dominic McDevitt-Parks.

Everyone knows about Wikipedia (though there is certainly a lot of room for clarification of how it works in practice and why it is valuable for public history), so for this first post, I want to spotlight Wikisource, a lesser-known sister project of Wikipedia, which is a multilingual digital document repository where volunteers collaboratively transcribe, proofread, translate, and arrange primary source material.

Writing about transcription can be a bit dry, so I want to tell the story from the perspective of one of Wikisource’s newest documents. In 1829, the Cherokee Nation was at a crossroads. Settlers from Georgia had been steadily encroaching on their lands for years, despite past cessions, and the state government was becoming more strident in its demands for their removal from their traditional lands. The Cherokee, as a people and as a political society, were also in the midst of a great transformation. In under 10 years, the Cherokee Nation had produced its first written laws, a Supreme Court, a new capital, a printing office and newspaper, and, in 1827, a constitution establishing themselves as a republic and asserting their claim to the land. Despite their progress, the pressure did not abate. In 1828, gold was discovered on Cherokee land, and the Georgia Legislature passed a law that banned Cherokee self-government and extended state jurisdiction over all Cherokee citizens.

And so, in December 1829, the same month President Andrew Jackson proposed the Indian Removal Act to Congress, the leaders of the Cherokee Nation wrote this memorial letter to Congress. It is a vigorous defense of Cherokee independence and sovereignty over their land. “Permit us to ask,” they write, “what better right can a people have to a country, than the right of inheritance and immemorial peaceable possession?” While there is a clear native authorial voice, the letter has some elements that may surprise most readers, as the writers appeal not just to the humanity of their adversaries, but also to the concepts of rights, property, and other legal principles. It also puts the situation for the Cherokees in stark terms: “Their existence and future happiness are at stake—divest them of their liberty and country, and you sink them in degredation,[sic] and put a check, if not a final stop, to their present progress in the arts of civilized life, and in the knowledge of the Christian religion.”

Congress passed the Indian Removal Act a few months later in 1830, and the Cherokee Nation took to the courts to plead their case. By the end of the decade, however, the disastrous Cherokee removal, part of the wider Trail of Tears, was complete.

This memorial letter has another history, though—its history as a physical document. In that capacity, it has often been at the forefront of technological development. In the early 1800s, Sequoyah, an illiterate Cherokee, independently developed the Cherokee writing system using only an English Bible he couldn’t read for guidance. The ingenious script is actually a syllabary, rather than an alphabet, consisting of 86 familiar-yet-unfamiliar-looking characters.

The system proved so popular that in a few short years, the Cherokee were largely literate in their own native language, and in 1825, the Cherokee Nation adopted it as their national writing system. In 1828, the Cherokee Nation took the revolutionary step of ordering a printing press with a custom-made typeface in the Cherokee syllabary. In the new printing office in New Echota, the press became an important tool in promoting Cherokee identity and agitating against removal, acting as the official mouthpiece of the Cherokee Nation leadership. It published the first Native American newspaper, the Cherokee Phoenix, and the first Native American book, Constitution of the Cherokee Nation.

It was also, because of the Cherokee script, likely used to print this memorial letter to Congress in 1829, and certainly used to publish the reprint of the letter in the Cherokee Phoenix. While most of the signatures are written in Cherokee and would take an expert to decipher, one that is clearly written in English is “Eli Hicks.” Hicks, the brother-in-law of Chief John Ross, would take over editorship of the paper in 1832, when the first editor fell out of favor with the Cherokee leadership for his willingness to go along with a voluntary removal. The printing press was apparently so potent a tool that it lasted only a few years before it was forcibly confiscated and possibly destroyed by the Georgia authorities with the help of pro-removal Cherokees.

Submitted to Congress, the document entered the congressional record and eventually the National Archives’ holdings. Almost two centuries after its publication, this memorial letter was selected for inclusion in the National Archives’ Electronic Access Project (EAP). In 1997, NARA’s 10-year strategic plan placed a heavy emphasis on public access, especially through the web, and it began to implement the recommendations of a previous study related to electronic access. The NARA Archival Information Locator (NAIL) was developed as the first integrated online catalog, and more than 100,000 exemplary documents were selected for digitization by subject-area specialists on the basis of criteria like “document[ing] the rights of American citizens” and “hav[ing] broad geographic, chronological, cultural, and topical appeal.” The “Memorial of the Cherokees” was digitized alongside Civil War battle maps, an original manuscript of Washington’s farewell address, the Tuskegee Syphilis Study, Vietnam War records, and the photos of Dorothea Lange, Ansel Adams, and Mathew Brady.

But this first foray into digitization was limited in scope, both in terms of the number and type of documents selected and the amount of access it provided. The EAP images (and they are only images, not sound or video) produced in the late 1990s are often small in size when compared to today’s standards, and when they were put online, they were scaled down even further.

Today, the National Archives is taking further steps to realize its goal fully, expressed at the outset of the EAP, to “bring records that had been available only to people who physically visited the National Archives to millions of people worldwide in libraries, in schools, and in their homes.” The EAP images, and other NARA documents, weren’t ever brought to anyone; they sit in a catalog waiting to be found and accessed. Outreach on blogs, Flickr, and other social media outlets has begun to change this state of affairs. And recently, the “Memorial of the Cherokees” became the first document selected for collaborative transcription by Wikisource as part of the National Archives’ new cooperative effort with Wikipedia (and its sister projects).

Wikisource, run on the same wiki platform as Wikipedia, has a number of features that make it a great tool for transcribing documents. By the time it is completed, each document’s transcription will have been proofread by one human and then validated by another. Transcriptions take place on a per-page basis, and each page is transcribed or viewed side-by-side with the corresponding image of that page. Each page’s current status is indicated by a color code at the top, and the pages of a document are coordinated on index pages. You can see this process in action at the new Wikisource page for the Memorial of the Cherokees. Its English text was transcribed by an editor, and (as of writing) the page is marked with a yellow bar indicating that it still needs validation by an independent proofreader. And, excitingly, you can see that an editor has already begun the much more laborious task of transcribing the Cherokee text (you may need to download the font to view Cherokee script).
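For readers who think in code, the two-person workflow described above can be modeled in a few lines of Python. This is only an illustrative sketch, not Wikisource's actual software: the names Status, Page, and index_summary are hypothetical, and the red and green colors are assumptions, since the only color this post mentions is the yellow bar for a page awaiting validation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    """Per-page status, each paired with a color code."""
    NOT_PROOFREAD = "red"    # assumed color, not stated in the post
    PROOFREAD = "yellow"     # per the post: yellow means awaiting validation
    VALIDATED = "green"      # assumed color, not stated in the post


@dataclass
class Page:
    """One page of a document, transcribed next to its scanned image."""
    number: int
    text: str = ""
    status: Status = Status.NOT_PROOFREAD
    proofread_by: Optional[str] = None

    def proofread(self, editor: str, text: str) -> None:
        """First pass: one editor transcribes and proofreads the page."""
        self.text = text
        self.proofread_by = editor
        self.status = Status.PROOFREAD

    def validate(self, editor: str) -> None:
        """Second pass: a different editor confirms the transcription."""
        if self.status is not Status.PROOFREAD:
            raise ValueError("page must be proofread before validation")
        if editor == self.proofread_by:
            raise ValueError("validation needs a second, independent editor")
        self.status = Status.VALIDATED


def index_summary(pages: List[Page]) -> Dict[Status, int]:
    """Count pages per status, roughly what an index page lets you see."""
    counts = {status: 0 for status in Status}
    for page in pages:
        counts[page.status] += 1
    return counts
```

In this sketch, a page that editor_a has proofread stays yellow until some other editor calls validate. If editor_a tries to validate their own work, the call raises an error, which mirrors the rule that the validator must be independent of the proofreader.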

We have many thousands of such digitized documents at our disposal, and there is already a steadily growing list of completed transcriptions. We even have plans for including links to these transcriptions from our online catalog so searchers can find them. Wikisource’s greatest asset is its small but dedicated community of volunteers. They have, for the past several years, built a project that is both technologically and socially ideal to house, transcribe, and view documents on the web in a collaborative environment. As part of the National Archives’ efforts to increase the accessibility and visibility of its holdings, we are offering our high-quality scans of documents to Wikisource for transcription. Wikisource needs more than just documents, though: it needs love. And this is where we each come in. Wikisource is only as good as we make it, all of us together. So if you would like to see it succeed with National Archives documents, please volunteer!

References:

Carlin, John W. “NARA’s Electronic Access Project.” Prologue 29, no. 3 (1997): 188-189.

Cushman, E. “The Cherokee Syllabary from Script to Print.” Ethnohistory 57, no. 4 (2010): 625.

Rozema, Vicki. Cherokee Voices: Early Accounts of Cherokee Life in the East. John F. Blair, Publisher, 2002.