
[S/M]Explore ways to apply CAT blacklist updates to previously tagged images
Closed, Resolved | Public | BUG REPORT

Description

We have this:

Updates to the CAT blacklist prevent undesired tags from appearing on files that go through the Machine Vision algorithm after the blacklist was updated, but those changes are not applied to previously analyzed files

We want this:

Apply the most current blacklist rules to all images

Screenshots (if possible):

Acceptance Criteria:

  • The most recent blacklist update applies to older files with suggested tags
  • This functionality is available for non-Commons clients of SuggestedTags as well (like the Android SuggestedEdits feature)

COVID-19 Deployment Criteria

  • Can you roll back this change without lasting impact?
    1. A recovery plan is required as this will help identify our capacity for recovering from the failure
    2. THIS IS A KEY QUESTION: if you can’t answer it, you shouldn’t deploy
  • Is specialized knowledge required to support this change in production? If so, are there multiple people with this knowledge?
  • Is there a way to increase confidence about the correctness of this change?
    1. Reviews (Design, Code, etc)
    2. Testing coverage (unit tests, integration tests)
    3. Manual testing (e.g. Beta, vagrant, docker)

Event Timeline

@Cparle I'd like to do this one ASAP, but there are a lot of open questions about how to do it in a performant way and whether it might be better off as a front-end check (perhaps while we're also exploring potential front-end solutions to T234457).

The simplest thing would be a PHP script that deletes all blacklisted IDs from the machine_vision_* tables, which we could run whenever the blacklist changes.
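For illustration only, here is a rough sketch of the delete-by-blacklist idea, written in Python against an in-memory SQLite table so it can be run anywhere. It is not the MachineVision schema or the PHP script itself: the table name `mv_suggestion_demo`, its columns, the function name, and the batch size are all made up for the example. Deleting in batches is one possible way to address the performance concern raised above, not something this task specifies.

```python
import sqlite3


def remove_blacklisted_suggestions(conn, blacklist, batch_size=500):
    """Delete rows whose Wikidata ID is blacklisted; return the number deleted.

    Hypothetical schema: mv_suggestion_demo(file_id INTEGER, wikidata_id TEXT).
    """
    ids = sorted(set(blacklist))
    deleted = 0
    # Work through the blacklist in chunks so that each DELETE statement
    # (and each transaction) stays small.
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        placeholders = ",".join("?" for _ in batch)
        cur = conn.execute(
            f"DELETE FROM mv_suggestion_demo WHERE wikidata_id IN ({placeholders})",
            batch,
        )
        deleted += cur.rowcount
        conn.commit()
    return deleted


def make_demo_db():
    """Build a tiny in-memory table with made-up suggestions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mv_suggestion_demo (file_id INTEGER, wikidata_id TEXT)"
    )
    conn.executemany(
        "INSERT INTO mv_suggestion_demo VALUES (?, ?)",
        [(1, "Q1"), (1, "Q2"), (2, "Q2"), (3, "Q3")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    removed = remove_blacklisted_suggestions(conn, {"Q2"})
    remaining = conn.execute(
        "SELECT file_id, wikidata_id FROM mv_suggestion_demo ORDER BY file_id"
    ).fetchall()
    print(removed, remaining)
```

Running the function again with the same blacklist deletes nothing further, so a script along these lines could be re-run safely each time the blacklist changes.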

Ramsey-WMF renamed this task from Explore ways to apply CAT blacklist updates to previously tagged images to [S/M]Explore ways to apply CAT blacklist updates to previously tagged images. Apr 3 2020, 5:34 PM

Cormac tentatively estimates 2 days of work to do a PHP maintenance script, starting Monday.

Change 586377 had a related patch set uploaded (by Cparle; owner: Cparle):
[mediawiki/extensions/MachineVision@master] Maintenance script to remove blacklisted wikidata items from suggestions

https://gerrit.wikimedia.org/r/586377

Change 586377 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@master] Maintenance script to remove blacklisted wikidata items from suggestions

https://gerrit.wikimedia.org/r/586377

Change 588484 had a related patch set uploaded (by Mholloway; owner: Cparle):
[mediawiki/extensions/MachineVision@wmf/1.35.0-wmf.27] Maintenance script to remove blacklisted wikidata items from suggestions

https://gerrit.wikimedia.org/r/588484

Change 588484 merged by jenkins-bot:
[mediawiki/extensions/MachineVision@wmf/1.35.0-wmf.27] Maintenance script to remove blacklisted wikidata items from suggestions

https://gerrit.wikimedia.org/r/588484

Mentioned in SAL (#wikimedia-operations) [2020-04-13T21:32:22Z] <mholloway-shell@deploy1001> Synchronized php-1.35.0-wmf.27/extensions/MachineVision: Add script to apply blacklist to current labels (T249273) (duration: 00m 58s)

Mentioned in SAL (#wikimedia-operations) [2020-04-13T21:43:21Z] <mdholloway> ran extensions/MachineVision/maintenance/removeBlacklistedSuggestions.php on commonswiki (T249273)

This has been running in production for a week now, and it looks like there are no problems.