Sat, Jul 31
@DannyS712 Thank you! I now understand why e.g. https://phabricator.wikimedia.org/diffusion/EPRP/browse/master/includes/Pagination/Pagination.php$25 is not flagged by these sniffs: abstract methods are not covered by the lint: https://github.com/wikimedia/mediawiki-tools-codesniffer/blob/5f813f727ee47d5c90981b44e82f0c5a0263e933/MediaWiki/Sniffs/Commenting/FunctionCommentSniff.php#L198
Mon, Jul 19
Thank you @Ankry, @Xover, @Inductiveload! So I guess rel="preload" is the way to go for the next page image. What about using rel="prefetch" for the previous one? That way, it still gets a good chance of being loaded, but with lower priority. What do you think?
Sat, Jul 17
@Ankry Thank you! So, I guess that the main goal is to make sure that the image thumbnail is already generated when a Page: page is displayed.
If I remember correctly, preload asks the browser to aggressively fetch content needed for rendering the current page, while prefetch asks it to fetch content that may be needed for future navigation.
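To make the idea concrete, here is a hedged sketch of what the two hints could look like in the page head (the thumbnail URLs are placeholders, not the real ProofreadPage output):

```html
<!-- High priority: the image of the next Page: page, needed soon. -->
<link rel="preload" as="image" href="/w/thumb/next-page.jpg">
<!-- Lower priority: the previous page, fetched only when the browser is idle. -->
<link rel="prefetch" href="/w/thumb/previous-page.jpg">
```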
Tue, Jul 13
Jun 30 2021
But as I understand it, the granularity would be per-project and not per-index. How would a project start experimenting with this, or avoid the need for a big-bang migration? Use two toc fields in the index, one for the traditional way and one for a "Wikidata-based toc"? That would require ProofreadPage to differentiate between "present" and "non-empty" for the index data field tagged with "data": "toc", I think. That sounds a little more complicated, but perhaps not too much so?
Jun 29 2021
Jun 8 2021
@Tpt Is there a technical reason for this?
May 22 2021
Done and deployed
Sorry for stepping in. A possibly relevant task when designing a new zooming system: T43614
Mar 27 2021
After thinking about it a little (and asking my wife, who does not contribute to Wikisource), I have a feeling that an indicator may not highlight the proofreading level enough: it's often the first thing people look for when they go to a Page: page. But I am not a UX design expert, so maybe an indicator is good enough.
Thank you for pushing this. I am a bit afraid it might create a backlash, with some contributors feeling that the proofreading level is not highlighted enough. It might be nice to start a discussion in the biggest Wikisource communities. A first step might be to move the quality level from the <pages> tag and the Index: namespace to an indicator.
This is presumably very nearly T239033, since if the extension can add a "Source" link for a mainspace page that uses <pages/>, it can do it in any other namespace?
@Inductiveload Yes, I guess it should also be done in onSkinTemplateNavigation. The tricky part is that the "source" link is not known from the current page title but from the parser output, and onSkinTemplateNavigation does not have access to the parser output. That's why I have not done the migration to server side yet. There may be a good solution; I have not yet taken much time to look for an alternative.
Mar 26 2021
Mar 25 2021
@Tpt, is that fix complete or is there more that needs to be considered?
@Inductiveload https://gerrit.wikimedia.org/r/672760 generated "null pointer exceptions" while rendering Page: pages when the page was not already connected to an index (trace in T278379). Antoine and I reverted the change to fix this error; it was the simplest way we found to fix it quickly on Wikisource, sorry. I have started writing a change that should fix this problem: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/674691/1/includes/Page/PageContent.php
Feel free to integrate it into your change and push it again to gerrit.
Sorry again for the quick revert.
Mar 19 2021
@Tpt: hmm, but how will you then get the prp-pagequality-N qualityN classes applied? Or do that manually?
It should happen automatically as part of the LinkRenderer output: the classes are added by a linker hook.
@Inductiveload I believe we could switch <pagelist> rendering to outputting HTML. $parser->getLinkRenderer()->makeLink makes it easy to reuse MediaWiki's linking system.
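A minimal sketch of what that could look like (hypothetical code, not the actual ProofreadPage implementation; Title::makeTitle and LinkRenderer::makeLink are real MediaWiki APIs, but the namespace variable and page name are made up for illustration):

```php
// Inside the <pagelist> tag handler, assuming $parser is the current Parser.
$linkRenderer = $parser->getLinkRenderer();
// Hypothetical target: page 1 of an index file, in the Page: namespace.
$target = Title::makeTitle( $pageNamespaceId, 'Foo.djvu/1' );
// Returns ready-to-use HTML; the prp-pagequality-N classes would then be
// added by the linker hook mentioned above.
$html = $linkRenderer->makeLink( $target, '1' );
```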
Mar 18 2021
@Inductiveload Yes, ProofreadPageInit.php is a good place for that.
Mar 16 2021
I am not sure that use case is actually needed?
@Inductiveload I believe it would be better to use TemplateStyles because it does a lot of nice and useful CSS transformation and sanitization.
Mar 9 2021
Mar 8 2021
Mar 4 2021
Feb 19 2021
Jan 22 2021
Why is MOBI described as being for Calibre? I would've thought EPUB would be closer to Calibre's "default" format.
Jan 21 2021
Jan 20 2021
@Yash9265 wrote a change that removes navigation links when transcluding Special:IndexPages. I believe this problem is now solved.
I believe we want to display the export links only in namespaces where ProofreadPage allows the transclusion to happen and displays its navigation links.
For now it's only the main namespace, but some Wikisources have requested the ability to use it in other namespaces (T53980).
A possible way to fix both problems at the same time might be to create a per-wiki configuration inside ProofreadPage and reuse it for the Wsexport link.
Jan 15 2021
We don't show the links on https://wikisource.org. I don't know if this matters. I don't know if users use that site.
Jan 13 2021
The change adding "thai" numerals support is now deployed on Wikisource.
Sorry, I confused this task with another one.
My 2 cents: I like the idea of using Calibre's ePub-to-ePub conversion to split files that are too big. The current implementation in Wsexport is very bad. Another option would be to fix it inside Wsexport by looking at what Calibre actually does internally.
Jan 10 2021
So, the question is: Can the ProofreadPage extension be smarter about where the pagenum template is inserted to avoid putting it in "dead" table space?
Dec 16 2020
Dec 2 2020
Thank you! It looks like a great plan!
Nov 28 2020
I just had a look at the logs. The OPDS update cron job started to fail in September because of out-of-memory errors.
I just wrote a patch that should make the addition of all numbering systems supported by CLDR/ICU easy: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/644010
Hi! It's definitely doable.
Oct 27 2020
Oct 21 2020
Yes, I believe it would be better.
Oct 15 2020
… then I probably still don't get how the scary output shown in T263371 was created. If this cannot happen in production, why is there a ticket?
@thiemowmde T263371 is indeed an XSS attack vector if you output the file content directly as HTML. But ProofreadPage only uses the file content to prefill a wikitext content area when a Page: page is created. So, I guess there is no extra threat here compared with simply allowing anyone to edit the wiki.
Probably caused by https://gerrit.wikimedia.org/r/c/mediawiki/extensions/ProofreadPage/+/628753
This does indeed seem to be the cause of the problem. I have submitted a revert for review: https://gerrit.wikimedia.org/r/634105
Oct 9 2020
@Tderrick Sadly, the Wikisource transcription system does not keep word coordinates. It might be possible to match the Wikisource transcription against an OCR output that includes coordinates, in order to fix the OCR with the transcription, but that is not an easy task at all.
ALTO XML seems to be an XML format designed for OCR output. It encodes the text-positioning data that we do not keep in wikitext. It's closer to the DjVu OCR format.
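For illustration, here is a small hedged ALTO fragment (the element and attribute names come from the ALTO schema; the coordinates are invented):

```xml
<TextLine HPOS="120" VPOS="400" WIDTH="520" HEIGHT="40">
  <!-- Each word carries its bounding box on the scanned page. -->
  <String CONTENT="Wikisource" HPOS="120" VPOS="400" WIDTH="220" HEIGHT="40"/>
  <SP WIDTH="15"/>
  <String CONTENT="transcription" HPOS="355" VPOS="400" WIDTH="285" HEIGHT="40"/>
</TextLine>
```

This per-word geometry is exactly what is lost when the text is flattened into wikitext.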
Oct 7 2020
Oct 5 2020
Oct 4 2020
Oct 3 2020
Oct 2 2020
Aug 1 2020
It's strange. There should not be any cache on this page.
The special page we are talking about is Special:IndexPages. To display something you need to have already created some pages in the "Index:" namespace.
The code of this special page is in includes/Special/SpecialProofreadPages.php.
Jul 24 2020
As far as I’m aware, the real URL in RDF is more like http://commons.wikimedia.org/wiki/Special:FilePath/Leon%20Cogniet%20-%20Jean-Francois%20Champollion.jpg – the query service UI rewrites it to the file description URL (/wiki/File:) on display.
I also don’t understand your first example – is sdoc:P18 meant to be something like sdoc:M123 instead?
Sorry for this problem.
I guess that Wikidata concept URIs are using http:// because it is what is usually done by RDF datasets (DBpedia...), mostly for backward compatibility reasons.
I would be slightly in favor of using http:// URIs for Commons entities in order to have all Wikibase entities and relations use http://, instead of having some with http:// and some with https://.
Jul 21 2020
Jul 16 2020
@Samwilson Yes, that's exactly what I mean. This way, we don't need to install all the locales on the Wsexport servers.
Jul 14 2020
It's a great idea. Thank you!
Some relevant links:
The PHP Intl extension is now much more common than it was in 2012. It might be worth using the IntlDateFormatter class here, which makes it easy to fix this problem.
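A minimal sketch of the idea (plain PHP with the intl extension; the locale and date are arbitrary examples):

```php
<?php
// IntlDateFormatter pulls locale data from ICU, so no system locale
// needs to be installed on the server.
$formatter = new IntlDateFormatter(
    'fr_FR',                    // any ICU locale identifier
    IntlDateFormatter::LONG,    // date style
    IntlDateFormatter::NONE     // no time component
);
echo $formatter->format( strtotime( '2020-07-14' ) ); // e.g. "14 juillet 2020"
```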
I believe the credit list is not a "hard" requirement. For example, the common pattern for citing Wikipedia articles is to only link to the Wikipedia page and state that the author list can be found there. And Wikisource contributors have a much weaker authorship relation to the content than Wikipedia contributors do. So, I guess a rewording of the credits page might do the job (but I'm not a lawyer...).
Jul 13 2020
Jul 9 2020
It should work now. The lighttpd configuration was not updated for the Toolforge URL change.
Jun 28 2020
Jun 11 2020
Since yesterday, if a Wikisource page is connected to a Wikidata item that states (via P629) which work it is an edition of, the sitelinks of the work item are displayed on the page in the "In other languages" sidebar.
The next step is to look for the other editions on Wikidata using P747.
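On the query side, the lookup could be sketched like this (hedged SPARQL for the Wikidata Query Service; wd:Q123 is a placeholder for the item of the current edition, not a real item):

```sparql
SELECT ?otherEdition WHERE {
  wd:Q123 wdt:P629 ?work .           # this edition -> the work it is an edition of
  ?work wdt:P747 ?otherEdition .     # the work -> all of its editions
  FILTER(?otherEdition != wd:Q123)   # skip the edition we started from
}
```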
The changes were merged last week.
Jun 6 2020
May 31 2020
@Mahastama I have been bold and just created the beginning of a script here: https://id.wikisource.org/wiki/Pengguna:Mahastama/OCR.js
I have put it in one of your user subpages so that you can edit it. I hope that is fine with you.
It adds an "OCR" button to Page: pages and calls the Trawaca API just as you presented.
However, currently the Trawaca API fails with an "authorization" error.
Could you or one of the OCR developers have a look at it?
To reproduce it, load the script (just as explained by @Xover) and try to run the OCR on any Page: page.
Apr 30 2020
I just tried to add recursiveTagParseFully and made a quick test.