Suggested Tags should not consider reverts as "personal uploads"
Closed, Resolved, Public

Description

As an admin on Commons, I have to revert a lot of images. For example, someone uploads a drawing of a penis as the latest revision of a portrait of Donald Trump. I would then revert that image back to the original version. The Suggested Tags tool considers all of these reverts as "personal uploads", so when I click on the "personal uploads" tab and go through the images, most of them are images I don't even recognize or have anything to do with other than reverting some years ago. It would be nice if the tool filtered out reverts and didn't consider them as belonging to me personally.

Event Timeline

Ramsey-WMF subscribed.

Hi Anne. This is less of a priority than the other tickets we talked about today, but if it's an easy fix, I'd appreciate it if you could knock it out 🙏

The only way I know of that you can detect reverts is to look for the string "Reverted to version" in the image comment. See, for example, https://commons.wikimedia.org/wiki/File:Glandulicereus_thurberi_sonora.jpg.
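A minimal sketch of that heuristic in Python, assuming the upload comments are already available as plain strings. The function name `is_revert` and the sample comments are hypothetical and are not taken from the MachineVision code.

```python
# Marker string mentioned above; a revert's upload comment is assumed to contain it.
REVERT_MARKER = "Reverted to version"


def is_revert(upload_comment: str) -> bool:
    """Return True if the upload comment looks like a file revert."""
    return REVERT_MARKER in upload_comment


# Hypothetical upload comments for illustration only.
comments = [
    "Reverted to version as of 12:00, 1 January 2020 (UTC)",
    "Cropped and color corrected",
]

# Keep only the uploads that do not look like reverts.
non_reverts = [c for c in comments if not is_revert(c)]
```

A plain substring match like this would miss any revert whose comment is worded differently, so it is best treated as a rough filter.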

Huh, I didn't know about Special:SuggestedTags, looks cool…

Many of my own "personal uploads" are files where I'm not the original author, but only edited the file to fix some technical issue, or even re-uploaded it with no changes for technical reasons (e.g. the very first suggestion was https://commons.wikimedia.org/wiki/File:Lueneburg_IMGP9466_wp.jpg). I might not even know what is actually depicted.

It might make sense to fix both of these problems at the same time, by suggesting files where the user uploaded the oldest version, rather than the latest one.
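As a rough illustration of the "oldest version" idea, here is a Python sketch that picks the uploader of the earliest version of a file. It assumes each version is available as a record with a user name and a timestamp. `FileVersion`, `original_uploader`, and the sample history are hypothetical names and data, not the extension's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FileVersion:
    """One uploaded version of a file (hypothetical record shape)."""
    user: str
    timestamp: datetime


def original_uploader(versions: list[FileVersion]) -> str:
    """Return the user who uploaded the oldest version of a file."""
    if not versions:
        raise ValueError("file has no versions")
    return min(versions, key=lambda v: v.timestamp).user


# Hypothetical history: an original upload, a bad overwrite, then a revert.
history = [
    FileVersion("Reverter", datetime(2020, 3, 1)),
    FileVersion("Vandal", datetime(2020, 2, 28)),
    FileVersion("Photographer", datetime(2015, 6, 10)),
]

assigned_to = original_uploader(history)
```

With this rule, neither the user who overwrote the file nor the one who reverted it would have the file in their personal queue; only the uploader of the oldest version would.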

Change 579381 had a related patch set uploaded (by Anne Tomasevich; owner: Anne Tomasevich):
[mediawiki/extensions/MachineVision@master] Assign images to original uploader

https://gerrit.wikimedia.org/r/579381

Change 579381 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Assign images to original uploader

https://gerrit.wikimedia.org/r/579381

The fix for this should be on production now. Is this working as expected, @kaldari and @matmarex ?

@Ramsey-WMF - It's definitely still showing me images where I'm not the original uploader, for example, https://commons.wikimedia.org/wiki/File:El_Gouna_Turtle_House_R01.jpg. Is the list of images pre-populated or does it pick new images on the fly? Perhaps those images were assigned to me before the change went into place.

"Perhaps those images were assigned to me before the change went into place."

Yeah, I think that is the case. I suspect that the first patch will prevent future instances of this problem but not fix current ones. Looking to @AnneT for further details/follow-up.

Yeah, uploads are assigned to users when suggestions are fetched from the provider, so existing uploads assigned to you will remain but new ones that you revert shouldn't be added to your stack.

Is it okay if old (fairly rare) instances of this problem remain in the queue but new ones are prevented, @kaldari and @matmarex ?

@Ramsey-WMF - Maybe, but there seems to be a related bug... If I click "Skip" for an image it doesn't seem to get removed from my personal queue. So my personal queue is now made up mostly of images that I either don't have anything to do with or images for which Google doesn't have any helpful suggestions (which is common). In other words, my personal queue is slowly turning into garbage. If the "Skip" button is fixed, I think this bug will cease to be a problem though.

You'll soon (this week) have the ability to add a custom tag to any image in your queue (and thus remove it). Is that adequate or would you still want an explicit "discard" functionality?

Hmm. Why not just make the "Skip" button remove it from your queue? It's not like you can't add more claims later manually. What's the use case for people repeatedly skipping the same images? For example, I've skipped this image's tag suggestions at least 20 times now:

[Screenshot: Screen Shot 2020-04-06 at 4.08.00 PM.png]

Is there any logical reason to show it to me again? It just seems like a waste of time. And the more my queue grows (it's at 193 images currently), the longer it takes me to find recent ones that might actually be actionable. If the queue were presented in reverse chronological order it wouldn't be such a big deal. But when you have to click "Skip" 100 times to get to your newly uploaded image that you were just notified about, it starts to get a bit aggravating :(

Since we first started with the tool, we have received user feedback that people often want to "come back to it later" for certain images, and we're trying to find a happy medium with users who want to remove images from their queue completely. At the moment, we're primarily investigating whether an explicit option to add your own tag when none of the suggested ones work (which also gets the image out of the queue) will serve as a clear middle ground between these needs.

@Ramsey-WMF - I still imagine there will be a use case for removing images without adding a tag. What happens currently if someone else tags one of the images in my personal queue before I do? Does it remain in my queue or get removed? If it remains, that would be a good example of a case where there may be no need to add more tags.

Here's a slightly long but hopefully thorough answer 🙂

Currently if someone adds a depicts statement to one of your files before you do, it stays in the queue. There are several possible scenarios when tagging personal uploads via SuggestedTags:

  1. The image currently has no depicts statements (this is the case the vast majority of the time)
  2. Someone else already added depicts statements for that image, but not the ones you would add (this is less common, since only 3% of Commons files have depicts statements so far and most of those were added by bots that aren't running anymore, but it still happens occasionally)
  3. Someone else already added depicts statements for that image and added all the exact value(s) you would add (very rare, in part because we don't queue files that have >1 depicts statement already)

No. 2 is one reason why we keep the image in the queue (besides performance issues, added code complexity, etc.). Also, initial usage tests showed that not all users keep an eye on their watchlists, so they don't know when someone else edits their image and then wonder why one of their uploads is missing from their Personal Uploads queue.

Although No. 3 is pretty rare, it does indeed happen, almost entirely with the images in the "popular" queue, but ideally it ends up not mattering much because the tool shouldn't add a statement that already exists. Unfortunately, there's currently a regression that broke that check (plus, it should ideally be part of Wikibase, which we are looking into with WMDE). We'll get that fix up ASAP.
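For illustration, the kind of duplicate check described here could look like the Python sketch below. It assumes depicts values are represented as item ID strings. The function name `new_depicts_values` and the placeholder IDs are hypothetical, and this is not the extension's or Wikibase's actual code.

```python
def new_depicts_values(existing: set[str], selected: list[str]) -> list[str]:
    """Return the selected values that are not already on the file.

    Order is preserved, and values repeated within the selection are
    returned only once.
    """
    seen = set(existing)
    result = []
    for value in selected:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Placeholder IDs: "Q1" is already on the file, so only the others are new.
to_add = new_depicts_values({"Q1"}, ["Q1", "Q2", "Q2", "Q3"])
```

Only the values in `to_add` would then be written as new statements, so confirming a tag that someone else already added would leave the file unchanged.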

With all that said, I want to emphasize that we are still considering a "discard" option. We did a couple of rounds of design-dev sessions and came up with an approach that satisfied both the "I'll get to it later" and the "remove from queue" use cases without being confusing. However, it does add complexity to a tool that aims to be as simple as possible, so we're keeping that option in our back pocket for now, especially if the cases where it is needed are pretty rare and "Add Tag" covers most of them.

The original cause of the problem is fixed, and the problem should not recur. Since we started this, we've added the Add Tag functionality, and that appears to be doing the job for the vast majority of use cases (just add a tag to get rid of an image). Additionally, we've fixed the issue where duplicate statements were being added.