Filter out pages that have been deleted from CopyPatrol interface
Closed, Resolved | Public | 2 Estimated Story Points

Description

If a page has been deleted, don't show it in any interfaces at http://tools.wmflabs.org/copypatrol/

Event Timeline

DannyH triaged this task as Medium priority. May 3 2016, 6:08 PM
DannyH set the point value for this task to 2.
DannyH renamed this task from "Filter out pages that have been deleted from CopyPatrol interface" to "Filter out pages that have been deleted from CopyPatrol interface (hold for dev period)". Jun 20 2016, 11:38 PM
DannyH updated the task description.
DannyH subscribed.
DannyH renamed this task from "Filter out pages that have been deleted from CopyPatrol interface (hold for dev period)" to "Filter out pages that have been deleted from CopyPatrol interface". Jun 24 2016, 8:10 PM
DannyH updated the task description.

Now that people are actively using CopyPatrol, I think the redlink cases are just getting in the way. Let's talk at the next meeting about whether we still need them for developer testing.

Note that the implementation of this task might depend closely on how we implement T138984.

I propose we do this only for "Open cases" but still keep the redlinked pages in the "All cases" filter. If we don't, there is no way to look back on cases where a plagiarized new page was fixed by deleting it.

CC @DannyH @MusikAnimal

Definitely. I was also going to have CopyPatrol automatically mark deleted pages as reviewed, perhaps using the username of our bot?

Sounds like a good idea. Although we'd want to wait a while after the article gets deleted in case the user who deleted it wants to mark it fixed on the tool by themselves. We don't want to steal their glory. :)
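A waiting period like this could be sketched as follows. This is only an illustration in Python, not the tool's code: the one-hour `GRACE_PERIOD`, the `bot_may_review` name, and the idea that the deletion timestamp is available are all assumptions, since no duration or mechanism has been settled here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical waiting period; no duration has actually been agreed on.
GRACE_PERIOD = timedelta(hours=1)


def bot_may_review(deleted_at: datetime, now: datetime,
                   grace: timedelta = GRACE_PERIOD) -> bool:
    """Return True once the deleting user has had `grace` time to review the case."""
    return now - deleted_at >= grace


if __name__ == "__main__":
    deleted = datetime(2016, 7, 1, 12, 0, tzinfo=timezone.utc)
    print(bot_may_review(deleted, deleted + timedelta(minutes=10)))
    print(bot_may_review(deleted, deleted + timedelta(hours=2)))
```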

With this implementation you may end up with a page of, say, 48 records instead of 50, but I think that's perfectly fine considering how much cleaner the code is. And again, if I load the page and get 48, the next person who loads it will see 50, because the two deleted pages were already marked as reviewed.

Sounds like a good idea. Although we'd want to wait a while after the article gets deleted in case the user who deleted it wants to mark it fixed on the tool by themselves. We don't want to steal their glory. :)

Also, per above, we have no logic that prevents people from re-reviewing something that someone else already reviewed, so this shouldn't be a concern. E.g. I, as an admin, see that a page needs deleting, so I delete it, go back to the CopyPatrol tab that I never closed, and mark it as no action needed.

Now, it is possible that there'd be an edge case timing conflict where Community Tech Bot stole my review, but I think this isn't a huge concern.

Also, per above, we have no logic that prevents people from re-reviewing something that someone else already reviewed, so this shouldn't be a concern.

But see https://github.com/Niharika29/PlagiabotWeb/commit/3b669fdcd460ea5fe9f6236d51ad4377d5683d47.

I'm having some reservations about this one. Would like to get Diannaa's feedback on it...

Also, per above, we have no logic that prevents people from re-reviewing something that someone else already reviewed, so this shouldn't be a concern.

But see https://github.com/Niharika29/PlagiabotWeb/commit/3b669fdcd460ea5fe9f6236d51ad4377d5683d47.

That's just for undoing reviews, I think. Right now there could be race conditions where two people end up reviewing the same thing, with the second reviewer getting the credit. That being said, the likelihood of Community Tech Bot getting the credit here should be very low, since whichever of our users deleted the page is going to do so and then come back and click on fixed, and is unlikely to hit refresh first. We should explore ways to ensure no one else gets credit for already-done reviews, but I think that's more for record keeping and the "gaming" aspect of the tool. The more important thing is that it was actually fixed, not who fixed it.
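One possible way to keep a second reviewer (or the bot) from overwriting an existing review is a conditional update that only succeeds while the case is still open. The sketch below uses Python and SQLite purely for illustration; the `cases` table, its columns, and the function names are hypothetical and are not CopyPatrol's schema. Note that it would also block deliberate re-reviews, which the tool currently allows.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table holding one open case (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cases (id INTEGER PRIMARY KEY, status TEXT, reviewed_by TEXT)"
    )
    conn.execute("INSERT INTO cases (id) VALUES (1)")  # status NULL = still open
    conn.commit()
    return conn


def record_review(conn: sqlite3.Connection, case_id: int,
                  status: str, reviewer: str) -> bool:
    """Record a review only if nobody has reviewed the case yet.

    The WHERE clause makes the check and the write a single statement,
    so the first reviewer to arrive keeps the credit.
    """
    cur = conn.execute(
        "UPDATE cases SET status = ?, reviewed_by = ? "
        "WHERE id = ? AND status IS NULL",
        (status, reviewer, case_id),
    )
    conn.commit()
    return cur.rowcount == 1
```

With this guard, an admin who clicks "Page fixed" first keeps the review, and a later automatic review by the bot returns `False` and changes nothing.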

We're showing 50 cases on the page. If that includes redlinks, then that's cutting down on the number that you can see before hitting Load more.

Yeah... as I said above, I think this is insignificant, as there are usually only 2 or 3 deleted pages in one 50-page block. I don't think it really matters anyway: you can simply click "Load more" when you're done with a page, and most people won't even notice that it's 47 instead of 50. Lastly, it only takes one page load. If I load the page and the 3 deleted pages are marked as reviewed, the next reviewer who loads it 10 seconds later will see a full 50 pages. Keeping 50 entries was my original plan, but it's complicated to implement.
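The page-load behaviour described here can be sketched roughly as below. It is written in Python for illustration and is not the tool's actual code: the `Case` fields, the `page_exists` callback, and the `load_open_cases` name are assumptions, and the status string follows the "No action needed" choice discussed in this thread.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

BOT_USER = "Community Tech bot"  # assumed reviewer name for automatic reviews
PAGE_SIZE = 50


@dataclass
class Case:
    page_title: str
    status: Optional[str] = None       # None means the case is still open
    reviewed_by: Optional[str] = None


def load_open_cases(cases: List[Case],
                    page_exists: Callable[[str], bool],
                    limit: int = PAGE_SIZE) -> List[Case]:
    """Fetch up to `limit` open cases, auto-review deleted pages, return the rest."""
    batch = [c for c in cases if c.status is None][:limit]
    visible = []
    for case in batch:
        if page_exists(case.page_title):
            visible.append(case)
        else:
            # Deleted page: close the case so later loads skip it entirely.
            case.status = "No action needed"
            case.reviewed_by = BOT_USER
    return visible
```

The batch is not topped up after deleted pages are dropped, which is why a load can return fewer than 50 records. Because the deleted pages are closed during that load, the next load draws a full batch from the remaining open cases.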

I'm having some reservations about this one. Would like to get Diannaa's feedback on it...

Very interested to hear more! I believe this was Diannaa's idea.

@MusikAnimal: If we do have it automatically reviewed by Community Tech bot, would it be marked as "Page fixed" or "No action needed" (or neither)?

"No action needed" is what I have it doing now. I think "Page fixed" would imply the bot fixed it. I thought about some super smart magic where it could see who deleted the page, and mark them as the reviewer – potentially even if they are not a CopyPatrol user. Give the credit where credit is due, right? But again that's if we really think giving credit is that important, or in this case just overkill :)

Diannaa's feedback was pretty ambiguous, so I've asked her to clarify.

I think the idea of removing the redlink pages was Diannaa's in the first place, wasn't it?

We also have at least one other user (Sphilbrick) in favour of removing them.

Looks like Diannaa endorses the solution that Niharika and Leon came up with, so I'm fine with it. We'll need to remember to filter Community Tech bot from the leaderboard, though.
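Filtering the bot out of the leaderboard could look something like this. It is a minimal Python sketch under assumed names: `leaderboard`, its parameters, and the flat list of reviewer names are illustrative and do not reflect how the tool actually stores reviews.

```python
from collections import Counter
from typing import FrozenSet, Iterable, List, Tuple

BOT_USER = "Community Tech bot"  # assumed name used for automatic reviews


def leaderboard(reviewers: Iterable[str],
                excluded: FrozenSet[str] = frozenset({BOT_USER}),
                top: int = 10) -> List[Tuple[str, int]]:
    """Count reviews per user, skipping excluded accounts such as the bot."""
    counts = Counter(name for name in reviewers if name not in excluded)
    return counts.most_common(top)
```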

@DannyH: Is the implementation that Leon did good for you? Rather than removing all redlinks entirely, it automatically marks redlinks as "No action needed" and reviewed by Community Tech bot.

Yes, I think that's a good solution.