Uploaded new versions of existing images still appear in SuggestedTags user queues
Open, Needs Triage | Public | BUG REPORT

Description

Steps to Reproduce:

  • upload a new version of an existing file
  • wait the required period for the notification/upload to the Vision API

Actual Results:

  • a notification will be sent and the image appears in the User upload tab of SuggestedTags

Expected Results:

  • new versions of existing files should be ignored by SuggestedTags (no appearance in the queue and no notifications sent)

Event Timeline

@Ramsey-WMF The easiest way to handle this would be to ignore all files with a previous version, which would include the case defined in the task description, and reversions.

  1. Do we want to ignore reversions? I think we do...
  2. This would mean if we decided to fetch annotations for a big batch of existing images in the future for some reason, any image with more than 1 version would be ignored. Is this a possibility that we need to account for?

Change 607861 had a related patch set uploaded (by Anne Tomasevich; owner: Anne Tomasevich):
[mediawiki/extensions/MachineVision@master] Do not include noninitial versions of files in the queue

https://gerrit.wikimedia.org/r/607861

Change 607861 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Ignore new versions of existing files on upload

https://gerrit.wikimedia.org/r/607861

Doesn't seem to be working ☹

Tested this by uploading a new version of https://commons.wikimedia.org/wiki/File:Setubal-fog.jpg

After the 48-hour period, I got a notification that I had new images to review in my queue.

@Cparle I'm having trouble pinpointing what's going on here. Perhaps we can discuss here or synchronously sometime?

The goal of this task is to ensure that new versions of an image don't trigger image annotations (or the associated notifications). My original patch added some code to the onUploadComplete handler to bail early if the file has more than 1 version, which stops annotations from being fetched at that point. I can confirm that this is working locally. However, in production, annotations are still being fetched for image revisions, even if the file already has more than 1 depicts statement.
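
For reference, the bail-early check described above could be sketched roughly as follows. This is an illustration in Python only (the extension itself is PHP), and `UploadedFile`, `AnnotationQueue`, `version_count`, and `on_upload_complete` are hypothetical names, not the actual MachineVision code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UploadedFile:
    """Hypothetical stand-in for an uploaded file and its version history."""
    title: str
    version_count: int  # 1 = initial upload; >1 = new version or reversion


@dataclass
class AnnotationQueue:
    """Hypothetical queue of file titles awaiting an annotation request."""
    pending: List[str] = field(default_factory=list)

    def enqueue(self, title: str) -> None:
        self.pending.append(title)


def on_upload_complete(file: UploadedFile, queue: AnnotationQueue) -> bool:
    """Queue the file for annotation only if this is its first version.

    Returns True if the file was queued, False if it was skipped.
    """
    if file.version_count > 1:
        # New version of an existing file (or a reversion): bail early.
        return False
    queue.enqueue(file.title)
    return True
```

A check like this only covers the upload-complete path. It would not stop annotations requested by any other code path, which is consistent with the difference seen between local testing and production.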

I was under the impression that we were only fetching annotations for images on upload complete, and not fetching them for batches on an ad hoc basis or at regular intervals. Is this assumption incorrect? I see there are a couple of maintenance classes that build batches of files so annotations can be fetched for them, but I didn't think either of them would be running right now. Maybe you can help me figure out how this is currently configured, so I know where I need to edit the code to prevent new revision uploads, but not necessarily all images with multiple revisions, from getting annotations? This way we accomplish the goal of this ticket without restricting our ability to get annotations for images with multiple revisions if we ever fetch annotations for a large batch of specific images, as we've done in the past.
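
To make the distinction in the previous paragraph concrete, here is a minimal sketch of a rule that skips new-version uploads but still allows explicit batch fetches for files with multiple revisions. It is written in Python for illustration only; the `Trigger` enum and `should_fetch_annotations` are assumed names, and how the production code actually routes these two paths is exactly what is still unknown.

```python
from enum import Enum


class Trigger(Enum):
    """Hypothetical label for what caused an annotation request."""
    UPLOAD = "upload"  # request made when an upload completes
    BATCH = "batch"    # request made by a maintenance-style batch job


def should_fetch_annotations(version_count: int, trigger: Trigger) -> bool:
    """Decide whether to request annotations for a file.

    On upload, only the initial version qualifies. For an explicit batch,
    files qualify regardless of how many versions they have.
    """
    if trigger is Trigger.UPLOAD:
        return version_count == 1
    return True
```

With a rule shaped like this, uploading a second version would be ignored, while a deliberate batch run over existing images would still pick up files that have several revisions.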

AnneT added a subscriber: AnneT.

Assigning to Cormac for now to respond to my comment above, but this one is lower priority than MediaSearch work, so I'll move it back to Ready for Development.