If a page has been deleted, don't show it in any interfaces at http://tools.wmflabs.org/copypatrol/
Description
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | None | T116957 Plagiarism detection tools for text (tracking)
Resolved | | TBolliger | T120435 Improve the plagiarism detection bot
Resolved | | TBolliger | T131583 Epic: Make a tool labs interface for Plagiabot aka Eranbot
Resolved | | MusikAnimal | T134289 Filter out pages that have been deleted from CopyPatrol interface
Resolved | | MusikAnimal | T139082 Add API framework to CopyPatrol
Event Timeline
Now that people are actively using Copy Patrol, I think the redlink cases are just getting in the way. Let's talk at the next meeting about whether we still need them for developer testing.
Note that the implementation of this task might depend closely on how we implement T138984.
I propose we do this only for "Open cases" but still keep the redlinked pages in the "All cases" filter. If we don't, there would be no way to look back at records of plagiarized new pages that were fixed by page deletion.
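The proposal above can be pictured as a filter that depends on which view is requested. This is only a minimal sketch: the `Case` fields, the `view` names, and the `visible_cases` function are hypothetical and are not taken from the CopyPatrol code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Case:
    # Hypothetical record shape, for illustration only.
    page_title: str
    page_deleted: bool       # True if the wiki page is now a redlink
    status: Optional[str]    # None means the case is still open


def visible_cases(cases: List[Case], view: str) -> List[Case]:
    """Hide redlinked pages in the "open" view, keep them in the "all" view."""
    if view == "open":
        return [c for c in cases if c.status is None and not c.page_deleted]
    if view == "all":
        return list(cases)
    raise ValueError(f"unknown view: {view}")
```

With this shape, a deleted page disappears from the open queue but remains available as a historical record under "All cases".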
Definitely. I was also going to have CopyPatrol automatically mark deleted pages as reviewed, perhaps using the username of our bot?
Sounds like a good idea. Although we'd want to wait a while after the article gets deleted in case the user who deleted it wants to mark it fixed on the tool by themselves. We don't want to steal their glory. :)
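One way to express the "wait a while" idea is a grace period between the deletion time and the automatic review. The sketch below assumes the deletion timestamp is known. The one-hour value and the names `GRACE_PERIOD` and `should_auto_review` are hypothetical and were not agreed on in this thread.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical value; the thread does not settle on a specific delay.
GRACE_PERIOD = timedelta(hours=1)


def should_auto_review(
    deleted_at: datetime,
    now: Optional[datetime] = None,
    grace: timedelta = GRACE_PERIOD,
) -> bool:
    """Return True once the deleting user has had `grace` time to review it themselves."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - deleted_at >= grace
```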
Requesting review of pull request https://github.com/Niharika29/PlagiabotWeb/pull/17/files
cc @Niharika @kaldari
With this implementation you may end up with a page of, say, 48 records instead of 50, but I think that's perfectly fine considering how much cleaner the code is. And again, if I hit the page and get 48, the next person who loads it will see 50, because the two deleted pages were already marked as reviewed.
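The behaviour described here (a short page on the first load, a full page on the next) can be sketched roughly as follows. This is not the code from the pull request: the dictionary keys, the `page_exists` callback, and the `load_page` function are assumptions made for illustration.

```python
from typing import Callable, Dict, List

BOT_USERNAME = "Community Tech bot"
PAGE_SIZE = 50


def load_page(
    open_cases: List[Dict],
    page_exists: Callable[[str], bool],
    page_size: int = PAGE_SIZE,
) -> List[Dict]:
    """Take one batch of open cases, auto-review the deleted ones, return the rest."""
    batch = open_cases[:page_size]
    shown = []
    for case in batch:
        if page_exists(case["title"]):
            shown.append(case)
        else:
            # Deleted page: mark it reviewed by the bot and leave it off the page.
            case["status"] = "no action needed"
            case["reviewer"] = BOT_USERNAME
    return shown
```

Because the deleted pages are marked as reviewed during the first load, they are no longer open when the next reviewer fetches a batch, so no refill logic is needed to reach 50.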
Also, per above, we have no logic that prevents people from re-reviewing something that someone else already reviewed, so this shouldn't be a concern. E.g. me as admin sees this page needs deleting, so I delete it, go back to the CopyPatrol tab which I never closed, and mark it as no action needed.
Now, it is possible that there'd be an edge case timing conflict where Community Tech Bot stole my review, but I think this isn't a huge concern.
> Also, per above, we have no logic that prevents people from re-reviewing something that someone else already reviewed, so this shouldn't be a concern.
But see https://github.com/Niharika29/PlagiabotWeb/commit/3b669fdcd460ea5fe9f6236d51ad4377d5683d47.
I'm having some reservations about this one. Would like to get Diannaa's feedback on it...
That's just for undoing reviews, I think. Right now there could be race conditions where two people end up reviewing the same case, with the second reviewer getting the credit. That being said, the likelihood of Community Tech Bot getting the credit here should be very low, since whichever of our users deleted the page will then come back and click on fixed, and is unlikely to hit refresh first. We should explore ways to ensure no one else gets credit for already-done reviews, but I think that's more for record keeping and the "gaming" aspect of the tool. The more important thing is that it was actually fixed, not who fixed it.
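One common way to keep a second reviewer (or the bot) from overwriting an existing review is a conditional update that only succeeds while the case is still open. The sketch below uses an in-memory SQLite database to show the idea. The `cases` table, its columns, and the `submit_review` function are hypothetical and do not reflect CopyPatrol's actual schema.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cases (id INTEGER PRIMARY KEY, status TEXT, reviewer TEXT)"
    )
    return conn


def submit_review(conn: sqlite3.Connection, case_id: int, status: str, reviewer: str) -> bool:
    """Record a review only if the case has not been reviewed yet.

    Returns True if this call claimed the case, False if someone else got there first.
    """
    cur = conn.execute(
        "UPDATE cases SET status = ?, reviewer = ? WHERE id = ? AND status IS NULL",
        (status, reviewer, case_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

The `WHERE ... AND status IS NULL` clause makes the check and the write a single statement, so whichever review arrives first keeps the credit and later ones are rejected rather than silently replacing it.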
We're showing 50 cases on the page. If that includes redlinks, then that's cutting down on the number that you can see before hitting Load more.
Yeah... as I said above, I think this is insignificant, as there are usually maybe 2 or 3 deleted pages in one 50-page block. I don't think it really matters anyway: you can simply click "Load more" when you're done with a page, and most won't even notice that it's 47 instead of 50. Lastly, it only takes one page load. So if I load it and the 3 deleted pages are marked as reviewed, the next reviewer who loads the page 10 seconds later will see a full 50 pages. Keeping 50 entries was my original plan, but it's complicated to implement.
Very interested to hear more! I believe this was Diannaa's idea.
@MusikAnimal: If we do have it automatically reviewed by Community Tech bot, would it be marked as "Page fixed" or "No action needed" (or neither)?
"No action needed" is what I have it doing now. I think "Page fixed" would imply the bot fixed it. I thought about some super smart magic where it could see who deleted the page, and mark them as the reviewer – potentially even if they are not a CopyPatrol user. Give the credit where credit is due, right? But again that's if we really think giving credit is that important, or in this case just overkill :)
I think the idea of removing the redlink pages was Diannaa's in the first place, wasn't it?
Looks like Diannaa endorses the solution that Niharika and Leon came up with, so I'm fine with it. We'll need to remember to filter Community Tech bot from the leaderboard though.
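Filtering the bot out of the leaderboard could look something like the sketch below. It assumes the leaderboard is built from a flat list of reviewer usernames. The `leaderboard` function and its parameters are hypothetical, and the comparison ignores case because the bot's name is capitalized in two different ways in this thread.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple


def leaderboard(
    reviewers: Iterable[str],
    excluded: FrozenSet[str] = frozenset({"community tech bot"}),
    top: int = 10,
) -> List[Tuple[str, int]]:
    """Count reviews per user, skipping excluded accounts (case-insensitive match)."""
    counts = Counter(r for r in reviewers if r.casefold() not in excluded)
    return counts.most_common(top)
```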
@DannyH: Is the implementation that Leon did good for you? Rather than removing all redlinks entirely, it automatically marks redlinks as "No action needed" and reviewed by Community Tech bot.