Index talk:A letter to the Rev. Richard Farmer.djvu

Words repeated between pages
''Taking this discussion here to keep it in one place and for the benefit of future editors. I will notify the below named editor on their discussion page shortly.''

cf. this.

This work has dangling words at the end of a large number of pages that are repeated at the start of the next (including opening quotation marks etc.). The dangling and repeated words sometimes occur right-aligned immediately following what is the natural last word on the first page, and sometimes after various pagination marks and even footnotes. This is done consistently enough that it is obviously a matter of practice, and not a printer's mistake. Prosfilaes even goes so far as to say it is "standard in the time period" (which I cannot now verify, but have no reason to doubt).

Based on their changes and edit summaries I understand Prosfilaes's position to be (do please correct me if I have it wrong!) that since this is some sort of standard practice that contemporary readers would understand and ignore, we should alter the text to reflect what the printer intended rather than what they actually printed.

However, I feel that by deleting what is manifestly on the printed page we adulterate the work, both for the hypothetical contemporary readers and for modern ones. The contemporary reader would, if Prosfilaes's assertion that this was standard for the period is correct, expect the duplication to be there; while a modern reader would be deceived that this practice did not exist at the time. For anyone familiar with textual studies in the area of, say, Shakespeare (which this is an extended excursion on), there are literally thousands upon thousands of pages written, and countless hours of scholarship invested, on the minutest difference between printings, and between individual copies in a single printing, of a work (here, the First Folio and other editions of the plays). In fact, the very work in question here touches on Malone's collations of various editions of the plays in order to arrive at "" to the text as it previously had stood.

Thus I disagree with Prosfilaes's position, as I have understood it. Further, I feel it is in conflict with the various policy guidance I read before setting out on this project, that tended to emphasise that the Wikisource text should reflect the scanned source, warts and all, barring only details of visual presentation, including that of using rather than correct obvious spelling errors in the original.

@Prosfilaes: Could you elaborate on your reasoning, and perhaps point me at relevant policy guidance that supports your position? --Xover (talk) 10:36, 27 September 2015 (UTC)


 * Printers, not editors, printed the last word outside the text stream to aid in binding the book. It never appears in any version of that edition except for a facsimile reprint, which thankfully we can now do by exact copy instead of retypesetting the original. At least signature marks (like "A2" at the bottom of the page) and the more advanced page numbers are useful in locating stuff in the book, but even then we don't just dump them into the text stream. This is specifically the last word on the page; in any environment that doesn't exactly reproduce the original pages, it goes.
 * There is a way to store this information in the footer part of the page, where it's not duplicated on the page. I don't have time to look it up right now, but the WS:Scriptorium can help you, or I'll look it up if I can find time later today. I'm personally of the opinion that we should be reproducing the content of the original, and let anyone who cares about the long-s, etc., look at the original, but the tools do try and cut the difference and let both of us coexist on the same wiki.
 * You can go to Google Books and put in anything and a date ending in, I think, around 1800 and you'll see that this was standard. As I've said, I have never seen anyone copy it into a modern copy, and especially not just dump it into the text stream.--Prosfilaes (talk) 16:53, 27 September 2015 (UTC)
 * Just a quick reply on one single point: disregard the specific technical mechanism I have chosen as I'm new to Wikisource (I spend most of my time on enwiki) and not familiar with the toolset and terminology yet. I have so far deliberately ignored the headers and footers—because I don't quite understand them yet, or how they're used, or what sorts of stuff can be done with them—with the intent to go back and make the necessary tweaks once I've done the grunt work on the text itself (but before doing whatever magic makes this work visible in mainspace). My argument above is about general principle, not specific implementation. If marking pages as "proofread" here carries the additional connotation that the page is now in its final form, never again to be touched, then I've misunderstood and should probably have been marking them as "proofread needed". --Xover (talk) 17:28, 27 September 2015 (UTC)
 * The header and the footer are artefacts of the publishing process, not of the author's intent, so when we transclude the pages to the main namespace we drop these components from our presentation of the work. Many components come through the publisher's work: chapter titles, page numbering, line and end-of-page hyphenation, word continuation, printer's marks, etc. Similarly, we want to convert page footnotes to chapter endnotes, which is why we put smallrefs in the footer. Our local means of replicating the publisher's components will show through in the Page: namespace; however, when we are talking about our 21st-century medium of the web, with its screen widths, devices, ... we have to ditch those components for our rendering. As such, the advice that you have been given looks to our rendering of the work. The reason we take that care and diligence in the Page: namespace is that, if we ever backfill the updated text to the image, that work is already done. -- billinghurst  sDrewth  00:33, 28 September 2015 (UTC)
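
For illustration only, here is a minimal Python sketch of dropping a repeated end-of-page word (a catchword) when page texts are joined. It assumes the catchword sits alone on the last line of the page text, which, as noted above, is not always true in this work (it sometimes follows pagination marks or footnotes). The function name `strip_catchword` is hypothetical and is not part of any Wikisource tool.

```python
def strip_catchword(page_text: str, next_page_text: str) -> str:
    """Return page_text without its trailing catchword, if it has one.

    Assumption: a catchword is a single token alone on the last line
    that is identical to the first token of the following page
    (including any opening quotation mark attached to it).
    """
    lines = page_text.rstrip().split("\n")
    next_tokens = next_page_text.split()
    if not next_tokens:
        return page_text

    last_line_tokens = lines[-1].split()
    if len(last_line_tokens) == 1 and last_line_tokens[0] == next_tokens[0]:
        # Drop the catchword line and keep everything before it.
        return "\n".join(lines[:-1]).rstrip() + "\n"
    return page_text


page = "the text as it\npreviously had stood.\nThus\n"
following = "Thus I disagree with the position."
cleaned = strip_catchword(page, following)
```

A page whose last line is ordinary text is returned unchanged, so the sketch errs on the side of leaving the transcription alone.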

The ct-ligatures
Back again after all my available wiki-time got sucked into a black hole on another project. Pinging you both here rather than going to a user talk page because I imagine the discussion may be of use, or at least interest, for any future editors on this work.

Anyways, just before I disappeared in a puff of dust, Billinghurst (SDrewthbot) removed all the ct templates from this work (I didn't check whether it was a run over everything that transcluded that template). I find that a bit concerning as it actually destroys information (albeit information that I gather you two personally don't value). The template's page refers to a community consensus to deprecate it, but gives no links to an RfC or similar, and my search through the Scriptorium only turned up the opposite consensus: keep it for those that want it as it does no harm.

This raises two issues for me: 1) I'm struggling to find the sorts of framework for decisions that I'm used to (spoiled by?) on enwiki, that gives mostly clear policy guidance for questions like that raised in the thread above, and structured processes by which community consensus can be sought, fought, and subsequently documented; and 2) I believe the destruction of information (I use the term in a, uhm, "academic" sense, not to be dramatic) is inherently harmful, and absent evidence of equal or greater harm by preserving it (in the form of the template) I find it hard to quietly accept its removal.

As for 1); while my impression is that you two seem to be very active and central in the Wikisource community, and thus presumably have your "finger on the pulse" of the project, so to speak, the lack of a link to the specific community discussion that established the consensus you reference makes it hard to distinguish this from your own personal preference. That's not, I hasten to add, in any way intended to suggest that this is just your personal preference at work, but for me, as a new contributor to this project, it takes an active effort to recall that these acts are not as unilateral as they appear. This is similar to the principles at issue in the thread above: is the purpose, the goal, of Wikisource to reproduce the originals as faithfully as possible, or is it to modernise them as much as possible (the two are mutually exclusive; opposite extremes of the same scale). The practice is of course somewhere in between those two extremes, but the reigning principle is critical guidance for all the edge cases such as the repeated words at issue there. Absent a clear policy guidance on this (the purpose), finding the best compromise in implementation (stuffing the pagination marks in the footer vs. deleting them) is needlessly hard and subject to incidental factors (the venue, the debating efficacy of the participants, local optimums, etc.).

Regarding 2); I fail to see how keeping the template can do any harm, given it had already been modified to return just the plain Latin letters "ct". For those that care about the ligatures, the template could instead be modified to spit out the markup needed (just ) to enable native ligatures (and set the font to one that includes them), styled using the ligature facilities in the CSS3 Fonts module (see e.g. here). Disabling (or enabling, depending on the default) could then be done simply in a Widget, or even in common.css using an !important rule. Without the template in the pages this becomes impossible because we no longer have the information about where the ligatures were (and we certainly don't want to enable ligatures everywhere, only where they actually occur in the original work).

Anyways, I realise you're both busy and that getting through my wall of text here is a bit of a chore (sorry about that), and that the questions I raise are a bit more involved to address than something more concrete would be; so absent a conclusion to that I plan to get back to proofreading this work and including the ct template. I figure worst case it can be removed afterwards by bot and then at least its positions will be marked in the page history should it be needed at some point in the future, and meanwhile it costs little except my labor in adding it. Just so I don't give the impression I'm just obstinately insisting on it regardless of your guidance. --Xover (talk) 21:06, 27 October 2015 (UTC)
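
To make the idea in 2) concrete, here is a rough Python sketch of the two renderings such a template could produce. The class name `ws-lig` and the function `render_ct` are hypothetical; `font-variant-ligatures` is a property from the CSS Fonts module, but whether a given font exposes its ct ligature as a discretionary or a historical ligature varies by font, so the value shown is an assumption.

```python
import re

# Matches a bare ct template call such as {{ct}} or {{ ct }}.
CT_TEMPLATE = re.compile(r"\{\{\s*ct\s*\}\}")

# Hypothetical stylesheet rule: enable ligatures only on marked spans.
LIGATURE_CSS = ".ws-lig { font-variant-ligatures: discretionary-ligatures; }"


def render_ct(wikitext: str, preserve: bool) -> str:
    """Expand the ct template either to marked-up or to plain letters."""
    if preserve:
        # Keep a record of where the ligature was, and let CSS style it.
        return CT_TEMPLATE.sub('<span class="ws-lig">ct</span>', wikitext)
    # Plain rendering: just the two letters, as the template does now.
    return CT_TEMPLATE.sub("ct", wikitext)


marked = render_ct("respe{{ct}}", preserve=True)
plain = render_ct("respe{{ct}} and fa{{ ct }}", preserve=False)
```

Either way, the position of the ligature stays recorded in the wikitext, which is the information at issue here.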


 * Anything that has to process the base wiki text has a harder time if ct is replaced with the ct template. You draw a false dichotomy; nobody is trying to reproduce this work as faithfully as possible (and any electronic edition is miles from that), nor are we trying to produce an edition modernized as much as possible. We are generally trying to produce a fairly standard level of edition, where the typographical and typesetting information is deleted and the textual information preserved.


 * I am unconvinced that preserving the ct-ligature in any format where you don't reproduce the original font is meaningful information, any more than a list of people in New York whose first names are Donald. It is purely a font distinction, and far less interesting than the deleted hyphenation information, which has a place in the study of linguistics, and no more interesting than all the ligatures you ignore--sh, st, si, fi, etc. You and other people are focusing on what's unusual to you at the cost of any consistent level of encoding. In saying you only want to enable ligatures where they exist in the original text, you are producing a typographic monstrosity that does not fairly represent the original text. Enable ligatures that are appropriate for the font you are using, not the font that someone else used.--Prosfilaes (talk) 01:39, 28 October 2015 (UTC)


 * Add it if you want, though note that the next time I run my bot through for the ligature/... templates clean-up that it will be replaced. It is simply a typographic construct that reflects the period of the printing, not something that the author wrote for their work. The community has discussed this type of matter on multiple occasions that it presents no real value, and ultimately can create issues; and we come back to the same measure of "keep it simple" as per Style guide. — billinghurst  sDrewth  10:51, 28 October 2015 (UTC)
 * Also see Typographic_ligature — billinghurst  sDrewth  10:58, 28 October 2015 (UTC)

Hmm. If the entire theory of harm here distills down to bots having to deal with one more template then I remain unconvinced. Automated agents will have to deal with all sorts of funky stuff in any case, and something like will make so close to zero difference as to be effectively irrelevant.

As to your second point, just calling something a false dichotomy without actually supporting the assertion does not make it so. What you describe as the third alternative is really just an arbitrary point on the scale between the two extremes, and that is exactly the problem I'm trying to get across here. So long as there is not a guiding principle that points towards one of the extremes as the most desirable, but in practice unattainable, goal, all decisions in borderline cases will inherently be arbitrary. If the specific question comes up somewhere like this index talk page (just as an example), then it might be decided by the factor that you two are veterans on Wikisource and I'm a newbie, so maybe I don't want to be the brash and difficult newcomer that argues incessantly and so just backs down and drops the matter. The matter is settled not on its merits, but based on the edit counts of those involved. There are any number of more and less likely circumstances that will lead to individual issues going astray when there are no well-documented governing principles to evaluate an issue against. The "third option" above boils down to "I know it when I see it." and will vary wildly from person to person. Case in point, you and I appear to disagree by a significant margin on where to draw the line between replicating the source and optimising for the medium.

A list of people in New York whose first name is Donald has any number of uses. It would be entirely useless to me, right now, but I would be glad it existed in case I had a use for it at some point in the future. That "purely a font distinction" isn't interesting to you does not ipso facto mean that it isn't useful or interesting in general. And far from ignoring other ligatures I would like to preserve them all to the extent possible, it's just that the particular work we're discussing right now happens to just use the ct-ligature (and long s). I would also like to preserve hyphenation information and line breaks, both of which should be technically quite easy to do in general (I'm not sufficiently familiar with the software environment on Wikisource to be categoric about this as it pertains here). And as it happens, quite a lot of fonts support the ligatures that are relevant to this discussion, so you only have to enable them and you get them without *shock* *gasp* switching to a different font for just that ligature.

Note that at the same time I would like to produce a nice modern presentation that actually takes advantage of all the conveniences and improvements that half a millennium of advancements in type production has brought us, and that hides the obscure nitpicky details of several hundred years old printing technology for those that do not specifically care about them. Because the really nice thing about the tools we have now is that in many cases (far from all, but in many) we have the technical capability to do both at the same time.

I fear you may be mistaken. In this particular case, Edmond Malone saw to essentially every detail of the printing, as he did for all his works, including the 5 editions of The Life of Johnson that he saw through the press on behalf of James Boswell Sr. We can't know, of course, whether he specifically engaged himself in the use of specific ligatures, but we know he was quite intimately involved (selecting paper, approving plates, etc.) in the process (which went on for years with some of his works); and we also know he was quite fussy about related issues such as bindings and cover designs (the quartos of Elizabethan and Jacobean plays he collected he had rebound in specially designed covers). The scanned page you see here is not how a printer's master proof looked prior to printing; it's how the finished work looked when sold to its readers. It is still, of course, a product of the printing process; but it is no less interesting as it carries information about the printing process.

You refer to previous discussions, but do not link to them, and to vague potential "issues", but do not specify them. You cite Style guide, which says things like "Page layout should mimic the original page layout …" and "Special characters such as accents and ligatures should be used wherever they appear in the original document, if reasonably easy to accomplish. This can be achieved by using … typography templates …" (that is, the exact opposite of the stance you intended to support with this reference). Don't get me wrong, I'm still operating on the assumption that you (both), as much more experienced editors on this project, have some sort of foundation for your claims that's more solid than "I don't like it"; but so long as neither of you actually point me at these foundations and articulate your arguments it's impossible for me to distinguish the two.

Take your warning "Add it if you want, though note that the next time I run my bot through for the ligature/... templates clean-up that it will be replaced." as an example. What if I (and I stress that this is of course just a hypothetical for the purpose of this discussion) were to throw together a bot that looked for diffs that removed a template and immediately reinserted it? The ability to write the necessary code to access the MediaWiki API does not in itself bestow validity to one's position. It can effectively "win" an argument by steamrollering the other side, but I presume it safe to assume you do not consider that any more of a valid tactic than I do.

I consider this work to have a long way to go before it's done. I've still not nearly figured out the tools, the style guide, and the practices on Wikisource, so I'm calculating with having to go over it again to fix several issues (and I hope there's some kind of peer review process here I can avail myself of when it's ready). Among them the changes related to headers and footers you made to the three pages you've validated, and the use of CSS in the header (in preference to inline in the page) that I've seen mentioned somewhere. When I say I intend to add the relevant ligature template to the remaining pages as I proofread them it's in that context: it's the most correct option based on my current understanding. But then I am in parallel pestering you two with this discussion in order to evolve my understanding. If you convince me your position is the correct one (either by persuasive argument, or by reference to policy, or to community consensus) I'll be happy to, and fully intend to, go back and fix whatever I have messed up in my ignorance (in this case, the ligature template). --Xover (talk) 18:54, 29 October 2015 (UTC)
 * If you want the discussions, they are usually within the archives of WS:S, though some will be within the archives of the talk pages, and the outcome of those is reflected in the style guide itself. You are correct that I haven't linked to the discussions (time poor, knowledge rich), and many of those discussions are quite old. The guidance on accents, ligatures, ... relates to the available and expected characters, not to stylistic ligatures which aren't actual characters. The only "stylistic" one that we sometimes maintain is the "long ess" character, and that is allowed, though not encouraged. Re replacement by bot, I run the bot through typographic templates which are there for ease of addition, not permanent display, hence it is a maintenance task in line with displaying the character component. With regard to other changes, to which you referred during validation, I would have to look; it is not something that I remember. P.S. On a talk page for a work, I will cut to the chase and address the matters at hand in the time that I have available. If you want the philosophical and argumentative/debated, then let us take the conversation to WS:S where the community can see, discuss, clarify, and maybe update the guidance. -- billinghurst  sDrewth  23:12, 29 October 2015 (UTC)
 * Fair enough. Thanks for your reply. --Xover (talk) 10:35, 31 October 2015 (UTC)
 * It is not good to drink no water, but to drink too much water is a quicker way to death. In that case, like many others, it is not that either extreme is desirable but unachievable; rather that neither extreme is good. The fact that you believe that the ct ligature is the only one found in this document, when the sh ligature is quite obvious, is part of my problem.
 * As per billinghurst, the theory is better discussed on WS:S.--Prosfilaes (talk) 02:27, 31 October 2015 (UTC)
 * If I missed any ligatures (or other features that might reasonably be preserved), I would very much appreciate a heads-up. No sooner had I posted that than I found an instance of ffi, and it is entirely possible that I've overlooked others. --Xover (talk) 10:35, 31 October 2015 (UTC)

Multi-page footnotes
Incidentally, if anyone has any idea how to deal with the insanely humongous footnote that begins on page 31 and continues on for 4 sodding pages(!), I'd be eternally grateful. My only idea so far is to concatenate it onto the first page, and that just goes against the grain of every instinct for me. The alternative I can see is to not use ref-tags for it, but that has its own problems (so not really considering that an option). --Xover (talk) 20:24, 29 October 2015 (UTC)
 * Aha! But of course this is actually a solved problem already! Apparently, you can use  and   for this. Lovely wonderful clever coders here: I was expecting this to be a problem with no good solution and only distasteful workarounds. I'll have to track down Tpt or whoever added this for profligate thanks! --Xover (talk) 10:19, 1 November 2015 (UTC)
 * It was ThomasV, back in about 2009; I hope that you found Help:Footnotes and endnotes. Four pages is at the top end; however, there have been worse. -- billinghurst  sDrewth  10:27, 1 November 2015 (UTC)