CopyPatrol shows "No editor found" on all new cases. The cases do get updated with the editor info, but that seems to take about six hours. EranBot's page has the editor's info, apparently immediately. Is there a way we can get that info for new cases?
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T116957 Plagiarism detection tools for text (tracking)
Resolved | | TBolliger | T120435 Improve the plagiarism detection bot
Resolved | | TBolliger | T131583 Epic: Make a tool labs interface for Plagiabot aka Eranbot
Duplicate | | None | T138984 "No editor found" lasts for hours
Resolved | | MusikAnimal | T139082 Add API framework to CopyPatrol
Event Timeline
I'm guessing EranBot uses the API, and we can too. I've got a quick demo up at https://tools.wmflabs.org/plagiabot (just a proof of concept, not ready for code review :). It seems to go pretty fast, and the big advantage here, of course, is that we're hitting production data, so we'll always get the user info we want.
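For illustration, here's a rough sketch of what looking up editors through the MediaWiki API could look like. This is not the demo's actual code, and nothing in this thread confirms how EranBot does it. It assumes we have the revision IDs of the flagged edits and that we ask for `formatversion=2` JSON; the helper names `build_editor_query` and `extract_editors` are made up for this example.

```python
from typing import Any, Dict, Iterable


def build_editor_query(rev_ids: Iterable[int]) -> Dict[str, str]:
    """Build parameters for an action=query request that returns the
    user who made each of the given revisions (hypothetical helper)."""
    return {
        "action": "query",
        "prop": "revisions",
        "revids": "|".join(str(r) for r in rev_ids),
        "rvprop": "ids|user",
        "format": "json",
        "formatversion": "2",  # assumption: pages come back as a list
    }


def extract_editors(response: Dict[str, Any]) -> Dict[int, str]:
    """Map revision ID -> username from a formatversion=2 response.
    Revisions whose user is hidden carry no "user" key and are skipped."""
    editors: Dict[int, str] = {}
    for page in response.get("query", {}).get("pages", []):
        for rev in page.get("revisions", []):
            if "user" in rev:
                editors[rev["revid"]] = rev["user"]
    return editors
```

The parameters would be sent to the wiki's `api.php` endpoint with whatever HTTP client we end up using; the parsing is kept separate so it can be checked without hitting the network.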
However, the API sometimes refuses requests when it is overloaded, making us wait N seconds before attempting the query again. Sometimes this lag (maxlag) can get very high. I talked to Bryan, and since we're only doing reads, I think we can adjust our configuration to be more aggressive so we don't run into this issue.
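To make the wait-and-retry behaviour concrete, here's a minimal sketch of handling a maxlag refusal. It relies on the documented API behaviour that a request sent with a `maxlag` parameter can come back with error code `maxlag` and a `Retry-After` header. Everything else is an assumption for the example: the `fetch` callable (returns the parsed JSON plus the response headers), the function name, and the default values.

```python
import time
from typing import Any, Callable, Dict, Tuple

# Hypothetical transport: takes request params, returns (json_body, headers).
Fetch = Callable[[Dict[str, str]], Tuple[Dict[str, Any], Dict[str, str]]]


def query_with_maxlag_retry(
    fetch: Fetch,
    params: Dict[str, str],
    maxlag: int = 5,             # illustrative; higher is more aggressive
    max_attempts: int = 5,       # illustrative cap on retries
    default_wait: float = 5.0,   # used when no Retry-After header is sent
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Send a read query, waiting and retrying while the API reports maxlag."""
    request = dict(params, maxlag=str(maxlag))
    for _ in range(max_attempts):
        body, headers = fetch(request)
        if body.get("error", {}).get("code") != "maxlag":
            return body  # success, or an unrelated error for the caller
        # The server asked us to back off; honour Retry-After if present.
        sleep(float(headers.get("Retry-After", default_wait)))
    raise RuntimeError(f"API still lagged after {max_attempts} attempts")
```

Raising the `maxlag` value is one way to be "more aggressive" for read-only queries, since the API then only refuses requests when the lag is above that higher threshold.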
That being said, if we want to move forward with this approach I think we should adopt an API framework to clean up the code. This will be especially helpful once we add rollback and templating features, and if we throw in the bot credentials we can speed up the API queries even more. Ultimately with the editing features we might run into the maxlag issues, but we can deal with that when the time comes.
It looks like this isn't a problem anymore, because of the work on T139082: Add API framework to CopyPatrol.
Should we close this ticket?