Description
For CopyPatrol to work on French Wikipedia, we need Eranbot running there. See T141379 for more info.
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | Niharika | T145431 Epic: Port CopyPatrol to French Wikipedia
Resolved | | Niharika | T145432 Get Eranbot to run on French Wikipedia
Event Timeline
I ran the script on frwiki but ran into an error. So I made a pull request for that. It's waiting on Eran's review.
I made the change in production to test the script anyway, and it runs fine with one caveat: it puts records from frwiki in the same database table as enwiki, with the 'lang' field set to 'fr' instead of 'en'.
This will warrant some more changes to CopyPatrol, plus a change to Community Tech Bot so that it stops marking all such records as false positives, which is what happened in the test run.
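To illustrate what sharing one table implies for CopyPatrol and Community Tech Bot, here is a minimal sketch in which every read filters on 'lang'. The table name `reports`, its columns, and both helper functions are hypothetical and only stand in for the real schema.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table that mixes enwiki and frwiki records (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reports ("
        "id INTEGER PRIMARY KEY, lang TEXT, page_title TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO reports (lang, page_title, status) VALUES (?, ?, ?)",
        [
            ("en", "Example", None),         # open enwiki record
            ("fr", "Exemple", None),         # open frwiki record
            ("fr", "Autre page", "false"),   # frwiki record already reviewed
        ],
    )
    return conn


def fetch_open_records(conn, lang):
    """Return unreviewed records for a single wiki, selected by the 'lang' column."""
    cur = conn.execute(
        "SELECT id, page_title FROM reports "
        "WHERE lang = ? AND status IS NULL ORDER BY id",
        (lang,),
    )
    return cur.fetchall()
```

Any code path that omits the `lang` filter would treat frwiki rows as enwiki rows, which is consistent with the mass false-positive marking seen in the test run.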
It's in progress. I added the script to crontab and did a test run, which failed with an error caused either by iThenticate being down or by hitting the query limit. Trace here: P4438
I'm going to try and add error handling to the script. Pinged Eran about it, yet to hear back.
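One possible shape for that error handling is a retry wrapper with exponential backoff around the iThenticate call. This is only a sketch: `call_with_retries` and its parameters are made up for illustration and are not part of the bot's actual code.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0,
                      sleep=time.sleep, retry_on=(Exception,)):
    """Call func(), retrying on failure with exponentially growing pauses.

    Re-raises the last error once all attempts are used up, so a cron run
    still fails loudly if the remote service stays unavailable.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except retry_on as exc:
            last_error = exc
            # Pause only if another attempt is still to come.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real waiting. A query-limit error would probably need a longer pause than a brief outage, so `base_delay` may have to be tuned per error type.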
For now, the script is running fine and we have data: https://tools.wmflabs.org/plagiabot/fr?filter=all
Community Tech Bot is auto-reviewing all of the records because right now it only checks against enwiki. I'll close this task after fixing this.
I've gone over the code and it seems alright. It's not explicitly querying the enwiki API, but it's still somehow checking for dead pages against enwiki only. The API querying code seems fine, because the same code is used for getting the editor, editor contributions, etc. Somehow just this part is broken.
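For reference, a dead-page check that follows the record's language could look roughly like the sketch below. The helper names are hypothetical. The only things assumed about MediaWiki are the standard `action=query` endpoint at `/w/api.php` and the `missing` key it sets on nonexistent pages in its JSON output.

```python
from urllib.parse import urlencode


def api_url(lang, project="wikipedia"):
    """Build the API endpoint from the record's language instead of hard-coding 'en'."""
    return f"https://{lang}.{project}.org/w/api.php"


def dead_page_query(lang, titles):
    """Return the full query URL that asks one wiki about a batch of titles."""
    params = {"action": "query", "titles": "|".join(titles), "format": "json"}
    return api_url(lang) + "?" + urlencode(params)


def missing_titles(response):
    """Pick out titles the API flagged as missing from a decoded JSON response."""
    pages = response.get("query", {}).get("pages", {})
    return sorted(p["title"] for p in pages.values() if "missing" in p)
```

If the language passed in here defaults to 'en' anywhere along the way, every frwiki title that does not also exist on enwiki would come back as missing, which would match the behaviour described above.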
Maybe @Samwilson or @MusikAnimal can take a look. :)
Looking into it. I did notice red links won't show for user/user talk. When you query for User/User talk pages on another wiki, the API handles it by automatically normalizing to that wiki's namespace title, but in the code we're comparing against the hard-coded "User" and "User talk" strings. I think we can somehow do this without creating a hash of each wiki's user / user talk namespace titles.
Still looking into this... a few other things: I had to update the production CopyPatrol because French articles were showing up =P Also, it seems the comparison tool is loading enwiki articles and not frwiki. I'm also working on some updates to the wikiproject bot task.
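One way to avoid a per-wiki hash of namespace titles is to compare namespace IDs, which the API returns in the `ns` field of each page and which are the same on every wiki (2 for User, 3 for User talk). A small sketch, with hypothetical helper names:

```python
NS_USER = 2        # "User" on enwiki, "Utilisateur" on frwiki
NS_USER_TALK = 3   # "User talk" on enwiki, "Discussion utilisateur" on frwiki


def is_user_namespace(page):
    """True if an API page entry is a user or user talk page, on any wiki."""
    return page.get("ns") in (NS_USER, NS_USER_TALK)


def classify(page):
    """Label a page entry by namespace ID rather than by localized title prefix."""
    ns = page.get("ns")
    if ns == NS_USER:
        return "user"
    if ns == NS_USER_TALK:
        return "user_talk"
    return "other"
```

This relies on the response including `ns` for each page, which `action=query` does by default, so no localized strings need to be stored or compared.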