Wikisource:Scan Lab/Archives/2022-12

File:Oration Washington Ker.pdf

 * This section was archived on a request by: TE(æ)A,ea. (talk) 01:11, 7 December 2022 (UTC)

Please split the pages (all except first and last). PDF or DJVU is fine. TE(æ)A,ea. (talk) 20:43, 30 November 2022 (UTC)
 * File:Oration Delivered on the Centennial Day of Washingtons Initiation into Masonry (1852).djvu. Please check quality, add info template, export to Commons, and add any local speedy templates as appropriate. I had to split the pages manually (i.e., a Photoshop job) so my fat fingers may have messed up page order or similar. --Xover (talk) 09:21, 1 December 2022 (UTC)
 * Oh, and I just quickly eyeballed the title for the file name, so if you need it changed it's best to have it renamed here before transwiki to Commons. Believe it or not given how slow we sometimes are here, but Commons' backlog on admin tasks is just ridiculous. Xover (talk) 09:24, 1 December 2022 (UTC)
 * Xover: Thank you. Please add an apostrophe in “Washingtons” (“Washington’s”), but the title is fine otherwise. The pages are all correctly ordered, as well. I can’t export the file, but I do have rename rights at Commons, if you ever need such help. TE(æ)A,ea. (talk) 16:13, 1 December 2022 (UTC)
 * @TE(æ)A,ea.: Done: File:Oration Delivered on the Centennial Day of Washington's Initiation into Masonry (1852).djvu. You can't export it right now because it's missing the info and license templates (it's a sanity check so people don't export files with bad licensing or garbage info to Commons), but beyond such checks everyone should have access to the "Export to Commons" function (next to the Read, Edit, View History... tabs). I can't remember if there was a preference toggle for this, but I think it should be there by default. (Special:Import is restricted to accounts with the relevant permissions, but the "Export to Commons" thing is a separate function.) But in any case, if you run into trouble using it just drop a note here and I can do it. Xover (talk) 06:39, 2 December 2022 (UTC)
 * Xover: Thank you! I was able to export it, and it has now been proofread and transcluded. Would you mind deleting the original file? TE(æ)A,ea. (talk) 01:11, 7 December 2022 (UTC)
 * ✅ Xover (talk) 06:50, 7 December 2022 (UTC)

Atalanta Running

 * This section was archived on a request by: --Xover (talk) 10:21, 28 December 2022 (UTC)

Would someone kindly upload https://archive.org/details/mellon48atalanta to Commons? I tried using the IA upload tool but it got stuck somewhere.

I'm not 100% on the bibliographic information, but the Yale University Library is apparently satisfied to categorize it under "Alchemy--Early works to 1800", so it seems safe. As far as I can tell the licenses should be something like PD-old-100 plus PD-US-unpublished. Shells-shells (talk) 04:12, 26 October 2022 (UTC)


 * @Shells-shells: In light of Index:Atalanta running (ia mellon48atalanta).djvu, is this request still current? Xover (talk) 07:19, 1 December 2022 (UTC)
 * @Xover ✅ with IA-Upload. Sorry I forgot to update this request; feel free to archive. Shells-shells (talk) 08:13, 1 December 2022 (UTC)

TASJ
Could someone please download this and upload it as File:TASJ-1-1-2.pdf? It was published in 1884, and is thus PD in the US, but authors’ lives and whatnot make this a local upload. TE(æ)A,ea. (talk) 01:11, 7 December 2022 (UTC)


 * @TE(æ)A,ea.: Is it important that it be PDF (vs. DjVu)? And would File:Transactions of the Asiatic Society of Japan, vol. 1-2 (1874).djvu be an acceptable name? Xover (talk) 18:46, 11 December 2022 (UTC)
 * Xover: The extra 1 is for series 1 (of five, currently). The request for PDF (and naming scheme) was because I was thinking of uploading the other volumes from Google Books myself (with the same scheme), and also because this volume needs some pages reordered (if you wouldn’t mind working on that after you get the file here). TE(æ)A,ea. (talk) 21:31, 11 December 2022 (UTC)
 * @TE(æ)A,ea.: Happy to work on it, but I simply don't have sane tools for working with PDFs so it would have to be DjVu in that case. As for file name, it's just a suggestion based on the principle that the file name should be descriptive. But how about File:Transactions of the Asiatic Society of Japan, series 1, vol. 1-2 (1964).djvu (to leave room for having both the 1874 original publication as well as this 1964 reprint, should that ever be relevant)? I have a DjVu ready and can upload it sometime late tomorrow if that's ok. Xover (talk) 22:57, 11 December 2022 (UTC)
 * Xover: If you’re better with DJVU, then that’s fine with me. “TASJ” is fine as a shortening. I don’t think 1964 should be used vs. 1874 because (1) the 1964 reprint is an exact reprint, without editing, and (2) there are actual volumes of TASJ first published in 1964. I just think that 1874 is clearer in this case. TE(æ)A,ea. (talk) 02:21, 12 December 2022 (UTC)
 * @TE(æ)A,ea.: Uploaded to File:TASJ-1-1-2.djvu (minimal quality control, bare bones info template). I uploaded it to Commons because their cutoff for just assuming PD even absent firm author info is pub. +120 years (so anything published before 1901 or thereabouts, currently). Xover (talk) 07:29, 12 December 2022 (UTC)
 * Xover: Thank you! I marked a few DJVU pages for deletion there, if you don’t mind deleting those pages and readjusting the pagelist. TE(æ)A,ea. (talk) 19:51, 12 December 2022 (UTC)
 * @TE(æ)A,ea.: ✅ From some other copies it looks like there is a fold-out map or illustration at /23 with a subsequent blank sheet, so I left those in place. Xover (talk) 08:16, 16 December 2022 (UTC)
 * Xover: I just got a scan of the images—there are four fold-out plates in a row. I don’t think it’s necessary to add those pages back, though, so I thank you for your work with this index. TE(æ)A,ea. (talk) 18:14, 16 December 2022 (UTC)
 * This section was archived on a request by: --Xover (talk) 17:13, 31 March 2023 (UTC)
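
The Commons cutoff mentioned in this thread (publication plus 120 years) comes down to simple arithmetic. The following is a minimal sketch only: the function name and the strict "more than 120 years" comparison are assumptions made for illustration, not a quotation of Commons policy.

```python
def assumed_pd_by_age(publication_year: int, current_year: int,
                      cutoff_years: int = 120) -> bool:
    """Return True if the work was published more than `cutoff_years` ago.

    The strict comparison is an assumption for this sketch; the thread only
    says "pub. +120 years ... or thereabouts".
    """
    return current_year - publication_year > cutoff_years


# The two years discussed in the thread, evaluated as of 2022.
print(assumed_pd_by_age(1874, 2022))  # original publication year
print(assumed_pd_by_age(1964, 2022))  # reprint year
```

Under these assumptions the 1874 original passes the age test and the 1964 reprint date would not.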

Index:Thuvia, Maid of Mars.djvu
Is it possible to replace this DjVu from Google with this full-color version from IA? The publisher is different, but they appear to use the same plates for the text. Alternatively, the IA text can be uploaded as a separate file and the text transferred, which is probably the better way. Languageseeker (talk) 15:45, 20 December 2022 (UTC)


 * @Languageseeker as a new index Index:Thuvia, Maid of Mars - Burroughs - 1920, Grosset and Dunlap.djvu as it's a different edition. I don't really see the point of transferring the proofread text, though, especially as the Grosset & Dunlap edition has rather fewer plates than the existing McClurg one. If you want to use the plates from the Grosset edition at the IA, they should be extracted from the upstream files at the IA anyway, never from the PDF/DJVU. Inductiveload— talk/contribs 16:41, 29 December 2022 (UTC)


 * @Inductiveload I didn't notice the differing number of plates. I think I'll use the available plates from the Grosset & Dunlap edition to replace the google ones in the McClurg scan and leave the Grosset edition for another occasion. Languageseeker (talk) 17:31, 29 December 2022 (UTC)
 * This section was archived on a request by: --Xover (talk) 20:15, 31 March 2023 (UTC)

Big Sur
At Hathi: https://babel.hathitrust.org/cgi/pt?id=uc1.31822033766825 It says copyright 1962 but it just went up at Librivox and I could access it. So, let's download it? --RaboKarbakian (talk) 16:16, 29 December 2022 (UTC)


 * @RaboKarbakian indeed this appears not to have had its copyright renewed when it should have been (around 1962 + 28 = 1990): Index:Big Sur - Kerouac - 1963.djvu. Inductiveload— talk/contribs 17:29, 29 December 2022 (UTC)


 * Inductiveload That was quick! I was coming here to discuss getting the pdf so I can remove the watermarks from it. Maybe we can still do that?--RaboKarbakian (talk) 17:50, 29 December 2022 (UTC)
 * @RaboKarbakian the watermarks are baked into the images. The PDF from Hathi is just a big, ordered, collection of JPG and PNGs and has watermarks in the images too. They're not causing a problem that justifies the effort of removing them page-by-page as far as I can tell. I can't download the whole book as a PDF, but I can provide the downloaded images from HT if that's what you're after? (edit now it's uploaded: in a ZIP here) Inductiveload— talk/contribs 18:02, 29 December 2022 (UTC)


 * Inductiveload HA! They aren't either what you said. You reminded me of mogrify....do you have evince installed? Or more importantly xpdf?  pdfimages takes Hathi pdf apart.  There is a flag to retain the page numbers and that should be toggled.  I have scripts that make running it easier.  Upload that original Hathi pdf and let me try getting rid of the watermarks before you dismiss me....--RaboKarbakian (talk) 21:22, 29 December 2022 (UTC)
 * @RaboKarbakian I can't upload the PDF because, like I told you, I can't download the PDF of the entire book as a non-institutional member; all I can get is page-by-page. However, you are right that the JPX in the page-wise PDF doesn't have the watermark burned in. I do not have tools to download those PDFs, as they're not available on the HT Data API as far as I know, and I also don't have robust tools to process those PDFs, so all I can offer you is the images, and you can feel free to edit them to remove the watermarks if you like. Inductiveload— talk/contribs 22:00, 29 December 2022 (UTC)
 * Inductiveload you downloaded all 200 images individually? You should download as pdfs individually.  Had I known you were manually downloading things, I would have done this myself.  Forgive me.  I will download the 200 pdf and get rid of the watermarks.  Shouldn't take more than a day, depending on the internet.--RaboKarbakian (talk) 23:33, 29 December 2022 (UTC)
 * I use the Data API; of course I don't download anything manually, that would be complete madness! Batch downloading the book takes roughly 2.5 seconds per page plus a few more minutes for conversion and upload, but it's hands-off once I put the details in. Inductiveload— talk/contribs 23:38, 29 December 2022 (UTC)
 * Inductiveload or Xover https://drive.google.com/file/d/1isPrbKNqCcpNZ4wLgyBklDzZ9lcCtm53/view?usp=share_link is a xz file with the mixture of .jp2 and .pbm and https://drive.google.com/file/d/17whtK6g4Q0Tt58R2FsCFfJ389zZcUYWY/view?usp=sharing a zip archive of the watermarks that were stripped from the pdf. If you could use these files for the DJVU, that would be nice.
 * Inductiveload that link to the API was very interesting. I can see how you were confused about the watermarks being embedded or not.--RaboKarbakian (talk) 18:32, 30 December 2022 (UTC)
 * @RaboKarbakian: Is there anything left to do in this thread? Or can I just mark it resolved so it gets archived? Xover (talk) 07:37, 4 April 2023 (UTC)
 * Xover: My plan is to debug my djvu making script (one pesky line) and upload the djvu without the watermarks over the current djvu. This debugging is scheduled for later, however.  So, I will leave it to your good, and better than my, judgement about keeping or closing.  Scan lab should do what is best for the Scan lab.--RaboKarbakian (talk) 15:05, 4 April 2023 (UTC)
 * @RaboKarbakian: Well, the question was more "Do you need anything else from the Scan Lab in regards of this?", because if you don't then I'll close it out to get it off the books. If you do need something more then you'll have to specify. And I'm asking because it looks like the discussion kinda petered out without resolution, so it's kinda hard for me to tell (I haven't been keeping up the last few months due to IRL being recalcitrant). Did you want me to grab the above xz archive and build a DjVu of it? Xover (talk) 15:17, 4 April 2023 (UTC)
 * Xover: That would be great! This goal is "no watermarks". "My script working", a good goal, but a different goal for sure. Knowing your speed and dependability, I have closed this, being pretty sure it will be done before the software archives this. Also, I am sorry about this world that causes productive people to become recalcitrant.  I am sorry if I was a member of that part of this world.--RaboKarbakian (talk) 15:27, 4 April 2023 (UTC)
 * @RaboKarbakian: File:Big Sur (1963).djvu Xover (talk) 16:38, 4 April 2023 (UTC)
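
The sorting step described in this thread (keeping the page images and setting the watermarks aside) could be sketched as below. This is a minimal sketch, assuming the pattern RaboKarbakian reports for Hathi PDFs only: page content arrives as .jp2 or .pbm, and watermark images arrive as .ppm. The function name and the sample filenames are hypothetical, and the sketch does not itself run any PDF extraction tool.

```python
from pathlib import PurePath

# Assumption taken from the thread (Hathi PDFs only): page content is
# .jp2 or .pbm, and watermark images are .ppm.
PAGE_SUFFIXES = {".jp2", ".pbm"}
WATERMARK_SUFFIXES = {".ppm"}


def split_extracted_images(filenames):
    """Partition extracted image filenames into pages, watermarks, unknown."""
    pages, watermarks, unknown = [], [], []
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        if suffix in PAGE_SUFFIXES:
            pages.append(name)
        elif suffix in WATERMARK_SUFFIXES:
            watermarks.append(name)
        else:
            # Anything unexpected is kept separate for manual review.
            unknown.append(name)
    return sorted(pages), sorted(watermarks), sorted(unknown)


# Hypothetical filenames, for illustration only.
extracted = [
    "page-001-000.jp2",
    "page-001-001.ppm",
    "page-002-000.pbm",
    "page-002-001.ppm",
]
pages, watermarks, unknown = split_extracted_images(extracted)
print(pages)
print(watermarks)
print(unknown)
```

Keeping an explicit "unknown" bucket is a deliberate choice here: files from sources other than Hathi may not follow the same pattern, so they are left for a human to check rather than silently dropped.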

So, we are not to have a copy of this without watermarks? Just wondering, because removing watermarks and ugly first pages, etc., is something that I thought was preferred, even to the point of breaking upload software. If it is that you need a zipfile, you can ask.--RaboKarbakian (talk) 21:48, 9 February 2023 (UTC)
 * @Inductiveload: If you are using the HT Data API you should be able to request images without watermarks. Just add  to your querystring args when fetching  . Which image format are you downloading? The API documents ,   or   (which is either TIFF or JPEG 2000). The documentation mentions: "The   format requires higher authorization." I am assuming that means not all API keys are equal. The documentation also states: "A watermarked derivative is the default   resource in either   or   format derived from   or   archival images." This makes me think   either defaults to no watermark or there is no way to get watermarks for that format (which makes sense). It also makes me think you are not downloading   images (perhaps because you are not allowed to). —Uzume (talk) 08:24, 10 February 2023 (UTC)
 * Inductiveload and Uzume: Not being allowed to download makes sense. If that is the case, the alternative is downloading each page as pdf (like I have been doing, limited by being an easily distracted human) so local software can disassemble and reassemble the desired pieces.  I actually prefer the individual pdf pages for this because there is more control over the page numbers with the software; well, that might be with the "my software" part of the chain though.  About raw images from Hathi.  When I take them apart, so far, I never get any tif.  I get jp2, pbm and ppm and so far, all of the ppm are the spam.  The pbm can be either "bitonal" as they call it or "Grayscale" as everyone calls it.  The jp2 are either "Grayscale" or "sRGB". Information that I found helpful for the decision making for how to process the images for ocr reading using the gutenberg recipe.--RaboKarbakian (talk) 16:21, 10 February 2023 (UTC)
 * Inductiveload I had one other problem. My IM does not like 5000+px × 5000+px pbm.  My two choices to handle this would be to script it for my favorite gui-based image manipulating program, which has no problem with these images or find out if there is a setting for IM for cache size or buffering vent or send /tmp file to someplace a lot bigger.  The Railroad Gazette pbm are an example of this.--RaboKarbakian (talk) 17:24, 10 February 2023 (UTC)
 * @RaboKarbakian: If you are pulling PDFs apart and getting bitonal pbm/ppm with color jp2 images that is likely because the PDFs were constructed with image segmentation and mixed raster content for better compression (DjVu also uses such). The bitonals are used as layer masks to construct the output on top of a color background. —Uzume (talk) 19:14, 10 February 2023 (UTC)
 * Uzume: From what I have seen, these "bitonals" are simply scans of all text that were reduced to two colors. I have seen the commands for masking, etc., and I have seen that used in disassemblies from places other than Hathi, but (so far) all of Hathi's pdfs break down without masks or anything fancy.  I would never suggest software to disassemble pdfs and create djvu for any place other than Hathi, whose pdfs disassemble beautifully into working pieces.  sRGB/Grayscale jp2.  Grayscale/1 bit pbm.--RaboKarbakian (talk) 00:06, 11 February 2023 (UTC)
 * I am not sure I would say never about any other place, but I agree that not all PDFs are created equal, nor do most even use mixed raster content image segmentation (but most DjVus do, although DjVus use different image compression algorithms). —Uzume (talk) 16:45, 11 February 2023 (UTC)
 * This section was archived on a request by: RaboKarbakian (talk) 15:27, 4 April 2023 (UTC)
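
The mixed raster content layering that Uzume describes in this thread (a bitonal mask selecting between a foreground and a colour background) can be illustrated with a toy example. This is a sketch of the general idea only, using nested lists as stand-in images; the function name and pixel values are made up for illustration, and, per RaboKarbakian's reply, Hathi PDFs may not be built this way at all.

```python
def composite_mrc(mask, foreground, background):
    """Combine layers pixel by pixel.

    Where the bitonal mask is 1 the foreground pixel is used (e.g. text ink);
    where it is 0 the background pixel shows through (e.g. paper colour).
    All three inputs are equally sized lists of rows.
    """
    return [
        [fg if m else bg for m, fg, bg in zip(mask_row, fg_row, bg_row)]
        for mask_row, fg_row, bg_row in zip(mask, foreground, background)
    ]


# Toy 2x3 page: black "ink" over an off-white "paper" background.
INK = (0, 0, 0)
PAPER = (255, 250, 240)
mask = [[0, 1, 0],
        [1, 1, 0]]
foreground = [[INK] * 3 for _ in range(2)]
background = [[PAPER] * 3 for _ in range(2)]

page = composite_mrc(mask, foreground, background)
for row in page:
    print(row)
```

Real MRC formats typically store the layers at different resolutions and with different compression, which this toy example ignores.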

The Peeler
Could someone download “The Peeler” from The Partisan Review, volume 16, number 12, please? The story is one of few of hers in the public domain. TE(æ)A,ea. (talk) 21:32, 11 December 2022 (UTC)


 * @TE(æ)A,ea.: This item isn't downloadable from IA (it's a "Books to Borrow" scan). It's also not obvious that any part of this would be in the public domain. Xover (talk) 08:23, 16 December 2022 (UTC)
 * Xover: This issue of The Partisan Review was not renewed, unlike many other issues of the same periodical. In addition, this story was also not renewed. With an IA account, one can borrow the issue and download the appropriate pages. TE(æ)A,ea. (talk) 18:14, 16 December 2022 (UTC)
 * Even if you can borrow it there's no option to download anything. It's presumably technically possible to do so, but it'd require a lot of fiddling and custom coding. Since it doesn't appear anyone is up to that just now I'm tagging this for archiving to get it off the books. --Xover (talk) 06:07, 6 April 2023 (UTC)
 * This section was archived on a request by: --Xover (talk) 06:07, 6 April 2023 (UTC)

File:Oration Dedication.pdf
The first and second pages are from the binder, and can be removed. PDF pages 4–11 need to be split. The other two pages are fine. PDF or DJVU, your choice. File name, perhaps Oration Delivered on the Occasion of the Dedication of the New Hall of Cooper Lodge. TE(æ)A,ea. (talk) 21:32, 26 December 2022 (UTC)


 * @TE(æ)A,ea. Here it is c:File:Oration Delivered on the Occasion of the Dedication of the New Hall of Cooper Lodge.djvu Mpaa (talk) 23:27, 26 December 2022 (UTC)
 * Mpaa: Would you be able to keep the colour? TE(æ)A,ea. (talk) 17:48, 3 February 2023 (UTC)
 * @TE(æ)A,ea.: I've uploaded a new version in colour. Please check that the results are ok, and feel free to revert to the original version if needed. Xover (talk) 15:25, 1 April 2023 (UTC)
 * Absent feedback to the contrary I am going to assume this is ok and tag the request for archiving. --Xover (talk) 06:54, 6 April 2023 (UTC)
 * Xover: I just got to it now. The pages don’t show up correctly in Page: (e.g., p. 3 shows up as p. 5 of the old scan). TE(æ)A,ea. (talk) 16:30, 7 April 2023 (UTC)
 * @TE(æ)A,ea.: This was probably due to the issue mentioned at WS:S#Tech News: 2023-16. I uploaded a minimally modified version over it and the thumbnails seem to be ok now. Purging the file page sadly did not seem to work. Xover (talk) 13:08, 23 April 2023 (UTC)
 * Xover: See Oration Delivered on the Occasion of the Dedication of the New Hall of Cooper Lodge—thank you! TE(æ)A,ea. (talk) 03:33, 24 April 2023 (UTC)
 * This section was archived on a request by: TE(æ)A,ea. (talk) 03:33, 24 April 2023 (UTC)