User talk:Tpt

Welcome

Hello, and welcome to Wikisource! Thank you for. I hope you like the place and decide to stay. Here are a few good links for newcomers:
 * Help pages
 * Style guide
 * Inclusion policy
 * For Wikipedians

You may be interested in participating in one of the projects below. Add the code for active projects, PotM or CotW to your page to follow current Wikisource projects.
 * Proofread of the Month
 * Community collaboration
 * Requested texts

You can put a brief description of your interests on your user page and contributions to another Wikimedia project, such as Wikipedia and Commons.

I hope you enjoy contributing to Wikisource, the library that is free for everyone to use! In discussions, please "sign" your comments using four tildes (~~~~); this will automatically produce your IP address (or username if you're logged in) and the date. If you need help, ask me on my talk page, or ask your question here (click [edit]) and place  before your question.

Again, welcome! Beeswaxcandle (talk) 23:48, 3 September 2011 (UTC)

User talk:BirgitteSB
More questions for you over there. And please tell me if I should ever move the discussion to Bugzilla. Birgitte SB  23:44, 18 July 2012 (UTC)

Scriptorium
As nobody else is stepping in, could I trouble you to start this page? --Piotrus (talk) 20:49, 20 September 2012 (UTC)
 * I've started a list: oldwikisource:Wikisource:Metadata. Tpt (talk) 19:09, 21 September 2012 (UTC)

Section transcluded when it shouldn't be
Hi Tpt. Wondering whether you can explain where the "Dudley, Lettice" section is included, though it is not within the transclusion component and is wrapped with its own tag. Thanks. — billinghurst  sDrewth  13:48, 1 October 2012 (UTC)

slurpInterwiki.js on Wikidata
Hello,

The Catalan community has translated your script into Catalan and would like you to include the translation in your source code. The code is on your talk page on Wikidata. Thanks in advance.--Bertrand GRONDIN (talk) 20:08, 3 November 2012 (UTC)

'Magical Headers' and footnotes
Hello. I was wondering if there was a way to tweak your header template so that footnotes appear above the blue navigation bar at the bottom of Mainspace pages (e.g.) as opposed to underneath the bar. Thanks, Londonjackbooks (talk) 22:11, 25 November 2012 (UTC)
 * This bar is not added by the header template but by a script. I'll try to improve it. Tpt (talk) 17:07, 26 November 2012 (UTC)
 * Thanks. Londonjackbooks (talk) 17:21, 26 November 2012 (UTC)

Header template and Plain sister parameters
Is it possible to incorporate Plain sister entries into the header template so that links to sister projects appear in the notes section of the header? Ex., , etc. If I am not explaining myself well, please reference the Scriptorium for elaboration. Thanks again, Londonjackbooks (talk) 13:40, 8 March 2013 (UTC)

Typo in About page of epub export
Gday Tpt. I haven't seen you in IRC to ask, so I am resorting to a note on your talk page. I spotted a typo in the 'About' text that is appended to an exported Epub. Where are corrections made? On wiki, or is that something that happens in production? Anyway, the typo is in the word "library" in the first sentence: "This e-book comes from the online libray Wikisource". Thanks for all that you do. — billinghurst  sDrewth  13:23, 7 December 2012 (UTC)
 * The about text is stored in MediaWiki:Wsexport about in each wiki. The default value is the page in oldwikisource oldwikisource:MediaWiki:Wsexport about. After any change in the file you have to purge the wsexport cache by navigating to http://wsexport.wmflabs.org/tool/book.php?refresh=true&lang=YOUR_LANGUAGE . I've fixed the oldwikisource version. Tpt (talk) 16:39, 7 December 2012 (UTC)
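A minimal sketch of building the cache-refresh URL described in the reply above, one per language. The function name is made up for illustration, the base URL is simply the one quoted in the reply, and the tool may have moved since.

```python
from urllib.parse import urlencode

# Base URL as quoted in the reply above; it may no longer be live.
WSEXPORT_BOOK_URL = "http://wsexport.wmflabs.org/tool/book.php"


def wsexport_refresh_url(lang: str) -> str:
    """Return the URL that asks wsexport to refresh its cache for one language.

    Hypothetical helper: only the refresh=true and lang parameters come
    from the discussion; nothing here is part of wsexport itself.
    """
    return WSEXPORT_BOOK_URL + "?" + urlencode({"refresh": "true", "lang": lang})


if __name__ == "__main__":
    print(wsexport_refresh_url("en"))
```

Opening the printed URL in a browser (or fetching it with any HTTP client) after editing MediaWiki:Wsexport about is what triggers the purge.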

JSON in MediaWiki pages
Hi. I heard you were poking at the Proofread Page extension. It occurs to me that the JSON in MediaWiki pages output could be (dramatically) enhanced by some new code from Meta-Wiki. Example here: m:Schema:Proofread Page. --MZMcBride (talk) 01:48, 16 December 2012 (UTC)
 * This new code is currently part of the EventLogging MediaWiki extension. I'll ask its developers whether we could include this output system in core so that it can be used by all extensions. Tpt (talk) 07:41, 16 December 2012 (UTC)

Sending EPUB down the subpages hierarchy
Just had a play with Mrs. Caudle's curtain lectures and found that it only generated chapter/pages that were listed on the front page, not those that were listed subsidiary to work itself. Noting that the "The Curtain Lectures" component is 206pp and there are 30+ chapters that are listed on a separate page due to the limitations on the published table of contents. Is there a means to ensure that it drills down further? Or are we going to have to more cleverly construct our tables of content to get the wsexport tool to work? Thanks. Apologies for these possibly stupid questions, as I don't know where to look for the construction logic, and we are yet to document these sorts of things. — billinghurst  sDrewth  12:36, 16 December 2012 (UTC)
 * You just have to add class="ws-summary" to the node that contains the list of the sub-chapters, as I have done here. These things are documented on oldwikisource. Tpt (talk)

Microformat data
Your attention at Template talk:Header would be appreciated, please. Pigsonthewing (talk) 23:52, 16 January 2013 (UTC)

Some difficulties with ws-export
Hi Tpt! Thanks for your ws-export extension, we have deployed it recently at ws-ca and it works really well. Good job! I found only some minor problems, which are: (1) with the side notes like here. Is there anything that I'm doing wrong? (2) I also noticed that the collaborator list reflects only the people who have edited the transcluded pages, but not the ones that have edited the pages. (3) I've seen that in ws-fr you manage to include the book cover in the epub, how can we make it happen at ws-ca? Thanks again for your efforts, this extension was extremely needed! --Micru (talk) 15:36, 20 January 2013 (UTC)
 * 1 It may be because the epub style sheet doesn't include the sidenote-right class. To add it to epubs, add it to ca:Mediawiki:Epub.css and reload the Wsexport cache.
 * 2 These people should be included in the list. I'll report this issue to the contributor who maintains this feature.
 * 3 Add something like Test.djvu/3 or Test.jpg  to the html of the main page of the book (this can be done by a template).
 * Tpt (talk) 20:15, 20 January 2013 (UTC)
 * Thanks for your answers! Still another question, what is the best way to define a nested "ws-summary"? And is it possible to specify the order or a different title other than the link name? --Micru (talk) 14:49, 21 January 2013 (UTC)
 * If a page linked from inside the "ws-summary" of the main page itself contains a "ws-summary" with links, the pages linked by this second "ws-summary" will also be added to the epub. I hope this answers your question, which I'm not sure I understood well.
 * No, it isn't currently possible to define a title other than the link name. Tpt (talk) 21:00, 21 January 2013 (UTC)
 * Yes, that answers my question, I thought it would be possible to define sub-sections on the main "ws-summary", but it seems it is only possible using linked pages with more links.--Micru (talk) 21:21, 21 January 2013 (UTC)
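The nesting behaviour described in this thread can be sketched as a small crawler: collect the links inside any element carrying the "ws-summary" class, then repeat on each linked page. This is only an illustration under stated assumptions, not the wsexport implementation; the class and function names are invented, hrefs are treated as page names, and `fetch_html` is a caller-supplied stand-in for however the HTML is retrieved.

```python
from html.parser import HTMLParser


class SummaryLinkCollector(HTMLParser):
    """Collect href values of links nested inside a class="ws-summary" element."""

    # Elements that never have a closing tag, so they are not tracked.
    VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}

    def __init__(self):
        super().__init__()
        self.links = []
        self._open = []  # stack of (tag, is_summary_node) for open elements

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        inside_summary = any(is_summary for _, is_summary in self._open)
        if tag == "a" and inside_summary and attr_map.get("href"):
            self.links.append(attr_map["href"])
        if tag not in self.VOID_TAGS:
            classes = (attr_map.get("class") or "").split()
            self._open.append((tag, "ws-summary" in classes))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed <li> and similar.
        for index in range(len(self._open) - 1, -1, -1):
            if self._open[index][0] == tag:
                del self._open[index:]
                break


def collect_book_pages(main_page, fetch_html):
    """Return pages reachable through nested ws-summary lists, depth-first.

    fetch_html is a hypothetical callable mapping a page name to its HTML.
    """
    ordered, seen = [], set()

    def visit(page):
        if page in seen:
            return
        seen.add(page)
        ordered.append(page)
        collector = SummaryLinkCollector()
        collector.feed(fetch_html(page))
        for href in collector.links:
            visit(href)

    visit(main_page)
    return ordered
```

Links outside a "ws-summary" node are ignored, which matches the point made above that sub-sections can only be reached through linked pages that carry their own "ws-summary".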

Subpages in Index/Page
WMF made changes to how they define their global settings, and across the WSes it broke the Template: ns by taking away subpages (now fixed), but it also introduced subpages to Index: and Page: (well, all ns 100 to 111 by default). For the coding in ProofreadPage, do you see any issues with having subpages in these two namespaces? When I was talking to Reedy about fixes, he asked whether we should be standardising the namespace numbers for the WSes, and batch moving all the pages to a higher number. I ran away, but Reedy sees value in adding it to the TO DO list, at least for consideration. Thanks for your advice in this area. — billinghurst  sDrewth  01:50, 5 February 2013 (UTC)

A Critical Examination of Dr G. Birkbeck Hills "Johnsonian" Editions doesn't epub
Gday Tpt. I have found that quotes in work names in main ns and file/index/page ns are problematic. A Critical Examination of Dr G. Birkbeck Hills "Johnsonian" Editions doesn't feed into Epub via the sidebar, and it also causes problems in the page numbering that ThomasV set up (I left a note at WS:S about that bit). In the end, I may end up moving all these components to a naming without quotation marks, but I thought it best to flag it and see if there is a programmatic fix.
 * http://wsexport.wmflabs.org/tool/book.php?lang=en&page=A_Critical_Examination_of_Dr_G._Birkbeck_Hills_%22Johnsonian%22_Editions&format=epub-2&fonts=

If any of this is something that is better in Bugzilla, please let me know. Thanks. — billinghurst  sDrewth  02:24, 22 April 2013 (UTC)
 * It's now fixed. If you want, you can report bugs on GitHub, which also hosts the source code. Tpt (talk) 19:19, 30 April 2013 (UTC)

iaUploadBot
Hi, I cannot upload many files from Archive.org, such as https://archive.org/details/NaderNimaiBengali . There is no .djvu file. https://toolserver.org/~tpt/iaUploadBot/step1.php

Jayantanth (talk) 10:05, 23 November 2013 (UTC)

fr:Discussion_utilisateur:Tpt
Courtesy notification, locally. ShakespeareFan00 (talk) 16:44, 29 May 2014 (UTC)

hidden zero padding
Hi Thomas,

Regarding PagelistTagParser.php, lines 66–71, can we please suppress this hidden zero padding for non-numeric page labels? The extra padding looks horrible when the page label is not a number; e.g. Index:The Bostonians (London & New York, Macmillan & Co., 1886).djvu. Hesperian 10:39, 18 June 2014 (UTC)
 * +1 — billinghurst  sDrewth  10:50, 18 June 2014 (UTC)
 * ✅. Fixed by this change that should be deployed next Tuesday. Tpt (talk) 14:31, 18 June 2014 (UTC)
 * Great, thanks! Hesperian 05:40, 19 June 2014 (UTC)
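The requested behaviour, padding only when the page label is a number, can be sketched as follows. This is an illustration in Python, not the PHP from PagelistTagParser.php; the function name and the exact markup used to hide the zeros are assumptions.

```python
def pad_page_label(label: str, width: int) -> str:
    """Left-pad purely numeric labels with hidden zeros; return others unchanged.

    The hidden <span> markup is an assumption made for this sketch.
    """
    if not label.isdecimal():
        # Roman numerals, "Title", "Cover", etc. get no padding at all.
        return label
    missing = max(width - len(label), 0)
    if missing == 0:
        return label
    return '<span style="visibility:hidden;">' + "0" * missing + "</span>" + label
```

With this rule, numeric labels still line up in columns while labels such as "xii" or "Title" are left untouched.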

Thanx
Hi Thomas,

Thank you very much for fixing the bug in the program that creates the Epub-files, yesterday. You did it so fast! Wonderful. Just to let you know: it works fine now! I did a new download of the book I mentioned and now I can read all of it on my E-reader. Thanks again for that.

What I was wondering: did this little bug affect all export to Epubs? And what does it mean that no one discovered this bug until now?

Greetings, Dick Bos (talk) 13:05, 10 August 2014 (UTC)
 * "did this little bug affect all export to Epubs?": I believe it affects all ePub containing images after the deployment of the MediaViewer (some weeks ago). But I believe that most ePub readers do not fail like yours so maybe no other Wikisource contributor has been affected by this bug. Tpt (talk) 17:14, 12 August 2014 (UTC)

ia-upload failure ...
Hi, I tried to upload a 57 MB file, which was for a user, and received the error: The upload to WikimediaCommons failed: Client error response [status code] 413 [reason phrase] Request Entity Too Large [url] https://commons.wikimedia.org/w/api.php?action=upload&filename=Old_and_New_London%2C_vol._1.djvu&format=json

There will also be a second volume to follow. If you can see that I have done something wrong, or if there is a tweak that you need to make, then please let me know when I am right to go. Thanks. — billinghurst  sDrewth  05:36, 4 January 2015 (UTC)
 * Sorry for the late answer, I was very busy last week. The root cause of this error is, I think, that the DjVu is too big for the Wikimedia Commons upload API. It's very strange: the DjVu is around 57 MB and I believe that the upload limit is 100 MB. So I think that the only way to upload this file to Wikimedia Commons is to use the UploadWizard. Tpt (talk) 09:52, 10 January 2015 (UTC)


 * Not an issue, hardly urgent. That was my thought too, though couldn't be sure. I have created a bug for this.  — billinghurst  sDrewth  12:10, 10 January 2015 (UTC)

Proofread extension logic
Hi Tpt. I have written a class in pywikibot to handle a ProofreadPage (it is in proofread.py). I noticed I cannot change the status and the user in the header independently (e.g. I tried to just change the status from validated to not_proofread, but the API edit action gave me an error). Where can I find what consistency checks are done to declare that a page is valid and can be saved? Could you point me to some code or a description, so that I can put the same restrictions in pywikibot or give some explanatory warnings? TIA Mpaa (talk) 20:40, 4 June 2015 (UTC)
 * Hi! Thank you very much! It is a very nice idea. Some points:
 * You can request and save Page: pages content in a JSON format where the header, footer and body are split, so you would not need any fragile splitter code. It is a feature that I haven't advertised much because I was not sure I would keep it in the long term, but I now think it will stay for a long time. To get a page you just have to use the rvcontentformat parameter set to application/json as done here : https://en.wikisource.org/w/api.php?action=query&prop=revisions&titles=Page:Mrs_Beeton%27s_Book_of_Household_Management_%28Part_2%29.djvu/61&rvprop=timestamp|user|comment|content&rvcontentformat=application/json and to save it just use action=edit with contentformat=application/json.
 * About level change the rules are:
 * If the level changes you should set the user to be your bot.
 * If the level does not change you should keep the existing user.
 * If the level changes from 3 to 4 the user should change (i.e. the person who sets the level to 4 should be different from the one who set it to 3).
 * The code: https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FProofreadPage.git/ceed34484b48049e2b7d3f9370b05caac486646e/ProofreadPage.body.php#L604 In fact I think that ProofreadPage should do the work of changing the user name instead of requiring the API user to do it, as is currently done with the web editing interface. I have just opened T101461 about it. Tpt (talk) 23:20, 4 June 2015 (UTC)
 * Thanks a lot.— Mpaa (talk) 16:01, 5 June 2015 (UTC)
 * Hi. I am trying to use your suggestion about the JSON format, but I have an issue which does not allow me to simplify my code.
 * Would it be possible to have "preload" to use contentformat application/json choice for pages with contentmodel: proofread-page? E.g. get the json format in https://en.wikisource.org/w/api.php?action=query&prop=info&titles=Page:Philosophical_Review_Volume_22.djvu/64&inprop=preload.
 * I use preload a lot as it is very convenient to get djvu text for not-existing pages. Problem is that if I get json for existing pages but not for preloaded pages, I cannot get rid of the inconvenient/error-prone regex logic, as I need to 'parse' wikitext, in order to split in header/body/footer, when preloading.— Mpaa (talk) 08:36, 6 June 2015 (UTC)
 * I don't think it is currently possible :-(. You could open a bug against MediaWiki core (if it is not already done) to request the possibility of requiring a specific content format for prop=info&inprop=preload. It should not be too hard to implement on the MediaWiki side, as what this API currently does is serialize the internal representation in the default content format. Tpt (talk) 15:21, 6 June 2015 (UTC)
 * Thanks, just FYI, T101622 — Mpaa (talk) 21:16, 6 June 2015 (UTC)
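The three level-change rules listed earlier in this thread can be written as one small check that a bot could run before saving. This is a sketch of the rules as stated here, not a port of the ProofreadPage code; the function name and the choice to raise ValueError are assumptions.

```python
def quality_user_after_edit(old_level: int, old_user: str,
                            new_level: int, editor: str) -> str:
    """Return the user name to store in the page header after an edit.

    Rules as described in the thread:
      * level unchanged: keep the existing user;
      * level changed: the user becomes the editor (e.g. the bot);
      * level 3 -> 4: the editor must differ from the user who set level 3.
    """
    if new_level == old_level:
        return old_user
    if old_level == 3 and new_level == 4 and editor == old_user:
        raise ValueError(
            "level 4 must be set by a different user than the one who set level 3"
        )
    return editor
```

A client library could call this before action=edit and turn the ValueError into an explanatory warning instead of letting the API reject the save.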

hocr error for lang='ru'
Hi there! I'm trying to use the hocr service in the Russian Wikisource (calling it with lang='ru' from a custom js in the meantime) and it returns "unable to locate file /data/project/phetools/cache/hocr/<...>.hocr for page <...>.djvu lang ru" to the callback for a page from its djvu. What can be the reason? hocr should not depend on the language since it only extracts the text layer from djvu, should it? Thanks in advance, Hinote (talk) 21:00, 25 August 2015 (UTC) P.S. Job state is 'success' in the hocr queue after being 'pending' for a long time... Anyway, the hocr of a page of that djvu still fails... I tried hocr for some pages of 3 Russian djvu files (indexes and pages in ru WS while the djvu files are in commons): 1 is okay and 2 failed... Strange... Hinote (talk) 21:17, 25 August 2015 (UTC)

P.S. More specifically, it fails with "unable to locate ... .hocr ..." for pages in this index (jobid=9966 in the hocr queue), where it runs fast_hocr (I checked locally -- it runs fast_hocr with no problem, has_word_bbox returns True) and does not fall back to slow_hocr, while it runs okay for pages in this index, where it changes the text after falling back to slow_hocr... Hinote (talk) 05:04, 26 August 2015 (UTC)

Finally, I debugged the tool logic locally and fast_hocr seems to be completely broken: djvu_text_to_hocr.do_parse produces files named page_NNNN.html (and then page_NNNN.html.bz2 after compression) in the cache_path directory, while get_hocr tries to fetch files named page_NNNN.hocr and page_NNNN.hocr.bz2 from the cache_path. So hocr seems to work when it falls back to the slow_hocr method and replaces the text from the existing OCR layer with the text produced by tesseract in slow_hocr. However, if fast_hocr has already run and placed its output files into the cache_path, it does not fall back to slow_hocr, so hocr returns nothing and get_hocr simply fails. Please verify and fix this misbehaviour. Hinote (talk) 14:01, 26 August 2015 (UTC)


 * Nevermind, Phe has already taken the issue... Hinote (talk) 00:01, 28 August 2015 (UTC)
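The mismatch reported in this thread, where the writer produces page_NNNN.html.bz2 while the reader looks for page_NNNN.hocr.bz2, is the kind of bug that a single shared naming helper avoids. The sketch below is not the phetools code: all function names are hypothetical, and the four-digit page number is an assumption read off the page_NNNN pattern.

```python
import bz2
import os


def hocr_cache_name(page_number: int, compressed: bool = True) -> str:
    """Single source of truth for cache file names (four digits assumed)."""
    name = f"page_{page_number:04d}.hocr"
    return name + ".bz2" if compressed else name


def write_hocr(cache_path: str, page_number: int, hocr_text: str) -> str:
    """Store one page of hOCR output, compressed, and return the file path."""
    target = os.path.join(cache_path, hocr_cache_name(page_number))
    with bz2.open(target, "wt", encoding="utf-8") as handle:
        handle.write(hocr_text)
    return target


def read_hocr(cache_path: str, page_number: int):
    """Return the cached hOCR text, or None when the page is not cached."""
    target = os.path.join(cache_path, hocr_cache_name(page_number))
    if not os.path.exists(target):
        return None
    with bz2.open(target, "rt", encoding="utf-8") as handle:
        return handle.read()
```

Because both the writer and the reader go through hocr_cache_name, a change of extension cannot leave them out of step.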

Example for ws-cover usage
I could not locate any example for using ws-cover in Epub creation in French or English or by googling. This does not seem to be implemented in the header templates as well. Will using this replace the standard front page generated by tool? Can you help?--Arjunaraoc (talk) 00:38, 29 March 2016 (UTC)
 * It's done by the fr:module:Header template in fr.wikisource with the line . The ws-cover should be the name of a file and may be followed by /XX, with XX the number of the displayed page in the case of multi-page files, as in "foo.pdf/3". It adds an extra page at the front of the book with the image (and it's this image that is displayed as the cover by ereaders). Tpt (talk) 17:20, 31 March 2016 (UTC)
 * Thanks Tpt for your response. As we are using old-style headers, I may need some time to try it. We have a nice set of cover pages for Telugu Ebooks. Can you implement the same on English Wikisource, so that I can adapt it more easily? --Arjunaraoc (talk) 23:32, 9 April 2016 (UTC)
 * @Tpt I tried an experimental header/sandboxarjuna of the Header template and a sample ebook page using it, User:Arjunaraoc/Sample_EPub_with_cover, but the cover did not appear. Can you help debug this? --Arjunaraoc (talk) 06:20, 10 April 2016 (UTC)
 * @Tpt, I could see the cover in calibre, but the moon+ ebook reader is not showing it with either svg or png. It looks like some improvement may be required. --Arjunaraoc (talk) 06:59, 10 April 2016 (UTC)

Bad gateway error when purging wsexport about cache
@User:Tpt Trying this http://wsexport.wmflabs.org/tool/book.php?refresh=true&lang=te gives a bad gateway error.--Arjunaraoc (talk) 23:50, 9 April 2016 (UTC)

Proofreadpage status vs undoing
Hi. In a case like this, shouldn't the ProofreadPage status be reset? Bye.--Mpaa (talk) 17:55, 30 October 2016 (UTC)
 * Yes, definitely. I've just opened a task on Phabricator about it: T149531. Tpt (talk) 18:28, 30 October 2016 (UTC)

Problem with the IA file transfer
Hi. I posted this message on Mediawiki because I didn't know the creator of the script at that time. Could you please read it, and advise? Unable to transfer files from Internet Archive to the Commons using IA Upload Bot. . . Thanks. — Ineuw talk 04:01, 23 November 2016 (UTC)

Proofreadpage info in API?
Hi.

Is there a way to get the prp-page-image thumb_url for a given Page:xxx/y via API?

E.g: "//upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Schiller_Musenalmanach_1799_195.jpg/1024px-Schiller_Musenalmanach_1799_195.jpg" for de:Seite:Schiller_Musenalmanach_1799_195.jpg?



Thanks.— Mpaa (talk) 16:33, 21 May 2017 (UTC)
 * — Mpaa (talk) 19:12, 24 May 2017 (UTC)
 * There is no API for that sadly. What you could do is use the title of the Page: page to retrieve the file name (you just split on "/" to get the page number if it exists and you use the base title as the file title) and you call the API to retrieve a thumbnail URL. Tpt (talk) 19:37, 24 May 2017 (UTC)
 * OK, thanks.— Mpaa (talk) 20:11, 24 May 2017 (UTC)
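The workaround suggested in this thread can be sketched in two steps: split the Page: title into a file name and an optional page number, then build an imageinfo query for a thumbnail. The function names are made up for illustration, and the use of iiurlwidth together with iiurlparam in the form pageN-WIDTHpx for multi-page files is stated here as an assumption to verify against the API help.

```python
from urllib.parse import urlencode


def split_page_title(title: str):
    """Split 'Page:Foo.djvu/61' into ('Foo.djvu', 61).

    Titles without a trailing number, e.g. 'Seite:Bar.jpg', give ('Bar.jpg', None).
    The namespace prefix may be localized, so everything up to the first ':' is dropped.
    """
    _, colon, rest = title.partition(":")
    if not colon:
        rest = title
    base, slash, last = rest.rpartition("/")
    if slash and last.isdecimal():
        return base, int(last)
    return rest, None


def thumbnail_query_url(api_url: str, page_title: str, width: int = 1024) -> str:
    """Build (but do not send) an imageinfo query asking for a thumbnail URL."""
    file_name, page_number = split_page_title(page_title)
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "titles": "File:" + file_name,
        "iiprop": "url",
        "iiurlwidth": width,
    }
    if page_number is not None:
        # Assumed handler-specific form for multi-page files such as DjVu or PDF.
        params["iiurlparam"] = f"page{page_number}-{width}px"
    return api_url + "?" + urlencode(params)
```

Under these assumptions the thumbnail address would then be read from the imageinfo entry of the JSON response.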

rvcontentformat for Proofreadpage in API
Hi.

With reference to changes in API for rvcontentformat (see https://phabricator.wikimedia.org/T20095), will it be possible to read a proofreadpage with 'application/json' format?

Is it foreseen to have beside the main slot, also other slots readable as json? Thanks.— Mpaa (talk) 21:59, 5 February 2019 (UTC)
 * Hi! T20095 does not seem related to your question. Yes, it's already possible to read Page: pages content in a JSON+Wikitext format. It is not a stable feature and I'm not sure it'll be supported forever. Here is an example. Tpt (talk) 22:05, 5 February 2019 (UTC)
 * I lost one 5 ... here's the right one, sorry: T200955, they say: "The rvcontentformat parameter to action=query&prop=revisions has been deprecated". I knew it was possible, my question is if it still will be possible after the deprecation. Reading about slots, it seems one can define several slots for a page. Would it be possible to define a slot for ProofreadPage, beside the main slot, with format 'json'?— Mpaa (talk) 20:11, 6 February 2019 (UTC)
 * Hi! Sorry for the misunderstanding. Using a slot for the JSON representation is imho a bad idea because it would mean duplicating the content in the storage system and overriding a bunch of MediaWiki logic to make sure that the wikitext and JSON representations stay in sync. MediaWiki assumes that slots can be edited separately, so we would, for example, have to override the editing API and other parts to make sure the other slot is edited at the same time. Writing a wikitext splitting library for the most used programming languages is probably much easier and more maintainable. Tpt (talk) 21:30, 6 February 2019 (UTC)
 * OK, thanks.— Mpaa (talk) 21:37, 6 February 2019 (UTC)
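A wikitext splitting helper of the kind suggested above might look like the sketch below. It assumes the Page: wikitext is serialized as a leading noinclude block (optionally starting with a pagequality tag) holding the header, then the body, then a trailing noinclude block holding the footer. That layout, the function name and the fallback behaviour are assumptions for illustration, not a specification.

```python
import re

# Assumed layout:
# <noinclude><pagequality level="3" user="X" />HEADER</noinclude>BODY<noinclude>FOOTER</noinclude>
PAGE_PATTERN = re.compile(
    r'\A<noinclude>'
    r'(?:<pagequality level="(?P<level>\d)" user="(?P<user>[^"]*)" ?/>)?'
    r'(?P<header>.*?)</noinclude>'
    r'(?P<body>.*)'
    r'<noinclude>(?P<footer>.*?)</noinclude>\s*\Z',
    re.DOTALL,
)


def split_page_wikitext(wikitext: str) -> dict:
    """Split Page: wikitext into level, user, header, body and footer.

    Text that does not match the assumed layout is returned as body only.
    """
    match = PAGE_PATTERN.match(wikitext)
    if match is None:
        return {"level": None, "user": None,
                "header": "", "body": wikitext, "footer": ""}
    parts = match.groupdict()
    if parts["level"] is not None:
        parts["level"] = int(parts["level"])
    return parts
```

The greedy body group means that noinclude blocks inside the body stay in the body; only the first and last blocks are treated as header and footer.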

T224355
Hello. Please take a look, not a duplicate? Ratte (talk) 00:49, 26 May 2019 (UTC)

ProofreadPage and impact of tweaking the template
Will we destroy the server if we add a  to the span in MediaWiki:Proofreadpage pagenum template, or is that a reasonable thing to try and then revert if things break elsewhere? cf. this discussion of various browser bugs tickled by empty inline elements (and the linked thread links to previous threads delving into the issues caused by empty inline elements which you might conceivably find interesting).

(Dropping an extra note on your user talk since it occurs to me you'd probably have pings disabled. Apologies if this is a duplicate: not intending to nag, and the question is in no way an urgent one.) --Xover (talk) 06:53, 1 June 2019 (UTC)
 * We do have a sandbox and components set up for running such tests in the Index: namespace. Well, we did, and I doubt that anyone has deleted them. — billinghurst  sDrewth  06:57, 23 August 2019 (UTC)
 * So far as I know, MediaWiki:Proofreadpage pagenum template has global effect and there is no way to apply such a change to only a subset of pages. It also would not tell us anything about the impact on server load when all transcluded pages have to be re-rendered to account for the changed separator. Incidentally, the experiment itself was a success, but it turns out MediaWiki:PageNumbers.js forcibly overwrites whatever is in the Proofread page template. Hence fixing the layout bugs will have to wait until we find some sane way to modify PageNumbers.js without breaking everything (it's unconditionally loaded from global js, and has overriding code scattered across multiple places, so there's no good way to work on a copy without colliding with the global version). I'm working, off and on, on various approaches to get to an end state where page numbers and dynamic layouts are, preferably independent, Gadgets that can be turned on or off as needed; but haven't found a good migration strategy yet. --Xover (talk) 07:44, 23 August 2019 (UTC)

Index categorisation ...
Hi, with the new change to categorisation, where does the categorisation sit? If I view the raw index page, it isn't showing. With respect to manual categorisation, at the moment we have amounts of index categorisation based on templates, predominantly index transcluded and index validated date and I cannot see an easy means to migrate. Thanks. — billinghurst  sDrewth  06:56, 23 August 2019 (UTC)
 * Hi sDrewth! Sorry for the late answer. The categorisation is not stored in a "proofreadpage_index_template" template parameter but at the end of the Wikitext serialization of the Index: page. See, for example this frwikisource page. Tpt (talk) 18:44, 25 August 2019 (UTC)

BHL Upload
Would it be possible for you to fork/adapt your IA Upload tool so that it can also upload files from the Biodiversity Heritage Library (example: ), please? If it helps, BHL uses COinS markup to emit metadata. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 10:02, 13 September 2019 (UTC)
 * Sorry for the late answer. On the file I tried, BHL volumes seem to also be in InternetArchive with a link displayed in the BHL website under the "Download Contents" menu. I believe it covers most of your use case. We might make it slightly easier by making IA Upload discover this link to trigger an import from the IA version of the volume but I am not sure it will bring much value. Tpt (talk) 14:42, 19 September 2019 (UTC)
 * Ah, I wasn't aware of that - it's a great find and will be very useful. Thank you. Andy Mabbett ( Pigsonthewing ); Talk to Andy; Andy's edits 15:47, 19 September 2019 (UTC)