Implement "checker" functionality into ProofreadPage
Open, Needs Triage, Public

Description

The checker tool (https://checker.toolforge.org/) is an integral part of the proofreading workflow on Wikisources, or so I'm told. The source code is https://github.com/legoktm/checker/blob/master/app.py - it makes some pretty simple database queries that seem like they would be easy to integrate into a dedicated special page in the ProofreadPage extension.

This is forked out of a discussion at https://meta.wikimedia.org/wiki/User_talk:MZMcBride#Can't_see_transcluded_pages.

Event Timeline

Usage across different Wikisources, going back to Feb 2016:

tools.checker@tools-sgebastion-10:~$ grep -oh '[a-z]*wikisource' uwsgi.log | sort | uniq -c | sort -rn
  44729 enwikisource
   9111 frwikisource
   6764 itwikisource
   6493 ruwikisource
   2046 bnwikisource
   1825 ptwikisource
    586 tawikisource
    450 eswikisource
    374 plwikisource
    324 cawikisource
    254 hywikisource
    253 svwikisource
    230 nowikisource
    203 elwikisource
    168 ukwikisource
    153 tewikisource
    150 dewikisource
    145 bewikisource
    109 brwikisource
    108 srwikisource
     88 lawikisource
     80 aswikisource
     78 vecwikisource
     55 slwikisource
     53 dawikisource

For vague privacy reasons, I dropped 21 more Wikisources that had fewer than 50 requests each.
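
For anyone who wants to reproduce the tally without the shell pipeline, here is a rough Python sketch of the same counting, including the under-50 cutoff. The function name `count_wikis` and the `min_requests` parameter are made up for illustration; only the regular expression comes from the grep command above.

```python
import re
from collections import Counter

# Same pattern as the grep command above.
WIKI_RE = re.compile(r"[a-z]*wikisource")


def count_wikis(lines, min_requests=50):
    """Count wiki database names in log lines, dropping low-traffic wikis.

    `count_wikis` and `min_requests` are illustrative names, not part of
    the checker tool.
    """
    counts = Counter()
    for line in lines:
        # Like `grep -o`, count every match on the line, not just the first.
        counts.update(WIKI_RE.findall(line))
    # Keep only wikis at or above the threshold, most-requested first.
    return [(wiki, n) for wiki, n in counts.most_common() if n >= min_requests]
```

Passing an open file handle for uwsgi.log as `lines` should give the same ordering as `sort -rn`, apart from how ties are broken.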

We already kinda expose this information (albeit not in such a condensed form). It might be interesting to look into exposing this info directly via JavaScript on the Index: page also.

If it helps, on it.ws I developed a gadget that shows a "statistics" table on every Index, with the number of pages that have been proofread, validated, etc., and also how many pages are not transcluded ("Non trascluse"). It calls the "embeddedin" API to find any pages that have been at least proofread but are not transcluded in ns0. If there are such pages, it shows a warning at the top of the page. Also, clicking the "Non trascluse" link highlights these pages in the pagelist, so they are easy to find.

To see it in action, just open a random Index page from https://it.wikisource.org/wiki/Speciale:CasualeInCategoria/Sommari; you'll see a table just after the pagelist.
The code is here: https://it.wikisource.org/wiki/MediaWiki:Gadget-indexPagesStatistics.js
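
To make the "embeddedin" check concrete, here is a minimal Python sketch of the idea. It is not the gadget's actual code, which is JavaScript. The API parameters (`list=embeddedin`, `eititle`, `einamespace`, `eilimit`) are standard MediaWiki API parameters, but the function names, the User-Agent string, and the assumption that the caller already has the list of proofread page titles are mine.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; any Wikisource's api.php should work the same way.
API_URL = "https://it.wikisource.org/w/api.php"


def embeddedin_params(page_title):
    """Build a list=embeddedin query limited to the main namespace (ns0)."""
    return {
        "action": "query",
        "list": "embeddedin",
        "eititle": page_title,
        "einamespace": "0",
        "eilimit": "1",  # one hit is enough to know the page is transcluded
        "format": "json",
    }


def fetch_json(params, api_url=API_URL):
    """Send the query to the API and decode the JSON response."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "embeddedin-sketch/0.1"}  # hypothetical UA
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def is_transcluded(page_title, fetch=fetch_json):
    """Return True if at least one ns0 page transcludes `page_title`."""
    data = fetch(embeddedin_params(page_title))
    return bool(data.get("query", {}).get("embeddedin"))


def untranscluded(page_titles, fetch=fetch_json):
    """Filter the given (already proofread) titles to the untranscluded ones."""
    return [title for title in page_titles if not is_transcluded(title, fetch)]
```

The `fetch` argument is only there so the logic can be tried with a stub instead of live requests. A real implementation would probably batch the queries rather than issue one request per page.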

Assigning to myself since I want to experiment with and implement a Special page concept that shows metrics/warnings with respect to Index pages.