User:LarryGilbert/Draft guidelines for audio transcription

These are guidelines for formatting transcripts of audio or video recordings. Some come from professional experience in the field, but all of them are here mainly to promote consistency across transcripts on the site.

Each category of guidelines is marked with its importance; "recommended" is stronger than "suggested." No guideline is "required": if you have a valid reason to bend or ignore any of them, you should feel free to do so (such is the tradition on Wikisource).

General
Importance: Suggested


 * Use the spelling appropriate to the dialect being spoken, e.g. "authorize" when the speaker is an American in the United States, "authorise" when the speaker is a Briton in Great Britain.

Time markers
Importance: Recommended


 * Until more specific markup or templates are developed, use sidenotes to indicate points in time as read from your audio or video player. Do this for every minute and half-minute with the right sidenote template, marking the time as "0m00s", "0m30s", "1m00s", etc.  (Suggestion: put the time mark immediately before the first word spoken at or immediately after that point in time.)


 * If the first audible portion of a recording starts more than a few seconds in, the first time mark may indicate the precise starting time to the second (e.g. "0m06s").


 * If there is an unusually long silence that extends past a minute or half-minute mark, the skipped time mark(s) may be replaced by a single mark indicating the exact second at which the dialog begins again. (It is also suggested that a non-verbal note be made about the silence.)
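
As an illustration of the time-mark format described above, here is a minimal Python sketch. It is not part of any Wikisource tooling; the function names `format_time_mark` and `half_minute_marks` are hypothetical, and the sketch assumes times are given as whole seconds read from your player. It only produces the text of each mark ("0m00s", "0m30s", and so on); placing the marks in sidenotes is still done by hand.

```python
from typing import List


def format_time_mark(total_seconds: int) -> str:
    """Format whole seconds as a time mark, e.g. 90 -> '1m30s'."""
    if total_seconds < 0:
        raise ValueError("total_seconds must be non-negative")
    minutes, seconds = divmod(total_seconds, 60)
    # Seconds are always two digits, matching "0m06s" in the guidelines.
    return f"{minutes}m{seconds:02d}s"


def half_minute_marks(start_seconds: int, end_seconds: int) -> List[str]:
    """List the marks for a recording whose first audible portion begins
    at start_seconds and which ends at end_seconds (both hypothetical
    inputs, in whole seconds).

    The first mark is the exact starting second; the rest fall on every
    minute and half-minute boundary after it.
    """
    marks = [format_time_mark(start_seconds)]
    # First half-minute boundary strictly after the starting point.
    next_mark = (start_seconds // 30 + 1) * 30
    while next_mark <= end_seconds:
        marks.append(format_time_mark(next_mark))
        next_mark += 30
    return marks


if __name__ == "__main__":
    # A recording that becomes audible at 6 seconds and runs to 1m35s.
    print(half_minute_marks(6, 95))
```

For a recording that starts at the very beginning, `half_minute_marks(0, 60)` gives the plain sequence of marks at zero, thirty, and sixty seconds. The sketch does not handle the long-silence case; there you would drop the skipped marks by hand and insert one exact mark where the dialog resumes.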

Names

 * Whenever a new person speaks:
 * Introduce their dialog with their name followed by a colon, in bold.
 * Use their full name the first time and their surname in the rest of the transcript (unless their full name is required to distinguish them from another speaker with a similar name).
 * If you do not know the person's full name, use as much of it as you know.
 * If you do not know the person's name at all, you may use a generic term or name of a role that distinguishes them from other speakers. If they have no known role, "Male" or "Female" is acceptable. If you find yourself having to re-use such a generic identifier, distinguish them with numbers (e.g. "Host 1", "Female 2", etc.). Use "(unidentified)" in the first instance ("Male 1 (unidentified)").
 * When a person's name is spoken for the first time and you are not 100% certain of its spelling, put [phonetic] after the name. This is unnecessary when the name is repeated later in the transcript.

Non-word utterances, stammers, etc.
Importance: Suggested


 * "Um" and "uh" should be included if they are plainly audible. Set these off as any interjection would be: "Wow, that's, uh, that caught me off balance." If they are only barely audible, they may be omitted.
 * Use an em dash if a speaker stammers or changes thoughts in mid-sentence. ("I don—don't think that's—I'm not really sure that's the case.")
 * You may use [phonetic] if a speaker stammers and utters a sound that would not make sense as a word by itself. ("I think eh [phonetic]—the education system needs help.")
 * Use [inaudible] when a speaker mumbles and it is impossible to determine with certainty what is being said.

Non-verbal information
Importance: Suggested


 * It is not necessary to include non-verbal information (sounds, gestures, tones of voice, etc.) unless its inclusion is critical to the context of the events in the material being transcribed.
 * For example, the sound of something knocked over does not in itself need to be noted, but if a speaker calls attention to it ("Careful, there!"), a note is appropriate ([Sound of chair being knocked over.]).
 * At the same time, do not draw inappropriate conclusions about what the sound signifies if it is not absolutely clear in the source. (In the above example, this may be more appropriate: [Sound of object being knocked over.])


 * If charts, graphs, or other visuals are shown in a video, include a description similar to what would be written to describe the visual for a sight-impaired person. Example: [A visual is shown of a man with the device hanging around his neck, projecting information onto the T-shirt of another man he is greeting.]


 * If the non-verbal information is directly related to a speaker, it should be placed in-line with the dialog for that person, e.g.: "I've come here today, [clears throat] excuse me, to discuss a very important matter." (Note: In this case, if the speaker did not say "excuse me," the [clears throat] note would be optional.)


 * If there is an occurrence not associated with an individual, the note should be on a line by itself, formatted like a sentence (with a capital letter and a period).