
Investigate MediaModeration failures
Open, Needs Triage, Public

Description

Essex and I tried to run the MediaModeration script today, but when we looked in Logstash, there were a ton of errors. Petr Pchelko confirmed that it looked like a high error rate, so we paused to investigate further.

To do this, we need to roll back the change made here https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/606239/9/wmf-config/InitialiseSettings.php#b6736 and set it to 'warning' so that we have more robust logs.

Event Timeline

@mepps What do you think about putting this script behind a CRON job once we get it fully fleshed out and stable?

@eigyan A good idea! I think the only issue is the starting and ending timestamps...
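For illustration only, here is a minimal sketch of how a scheduled run could derive its own start and end timestamps. It assumes the script accepts a time window in MediaWiki's 14-digit `YYYYMMDDHHMMSS` format and that one run per UTC day is acceptable; the function name `previous_day_window` is hypothetical and not part of MediaModeration.

```python
from datetime import datetime, timedelta, timezone

# MediaWiki-style 14-digit timestamp, e.g. 20210802000000.
MW_TS_FORMAT = "%Y%m%d%H%M%S"


def previous_day_window(now=None):
    """Return (start, end) timestamps covering the previous full UTC day.

    Hypothetical helper: a daily job could pass these two values to the
    script so that consecutive runs neither overlap nor leave gaps.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # The window ends at the most recent UTC midnight and starts 24 hours earlier.
    end = now.replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=1)
    return start.strftime(MW_TS_FORMAT), end.strftime(MW_TS_FORMAT)


if __name__ == "__main__":
    start, end = previous_day_window()
    print(start, end)
```

Anchoring the window to midnight rather than to the time the job happens to start means a delayed or retried run still covers the same range.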

Change 708815 had a related patch set uploaded (by Eigyan; author: Eigyan):

[operations/mediawiki-config@master] wmf-config: Restore logging for mediamoderation script to better understand high error rate occurring when running script

https://gerrit.wikimedia.org/r/708815

Change 708832 had a related patch set uploaded (by Eigyan; author: Eigyan):

[operations/mediawiki-config@master] wmf-config: Restore logging for mediamoderation script to better understand high error rate occurring when running script

https://gerrit.wikimedia.org/r/708832

Change 708815 merged by jenkins-bot:

[operations/mediawiki-config@master] wmf-config: Restore logging for mediamoderation script to better understand high error rate occurring when running script

https://gerrit.wikimedia.org/r/708815

Mentioned in SAL (#wikimedia-operations) [2021-08-02T11:05:05Z] <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: 26bcaafdcd57b1b7a78f9e0ad000325baaf36a72: Restore logging for mediamoderation script to better understand high error rate occurring when running script (T287511) (duration: 00m 57s)

@eigyan I found an error message! It looks like the API is returning "The given file could not be verified as an image". I'm still curious whether this is an appropriate rate for this error. I found this in mw-log.

I checked one of the images that got this error and it does look like a real image: https://commons.wikimedia.org/wiki/File:For%C3%AAt_@_Mont_Veyrier_(51122922841).jpg. So I'm not sure what to make of this.
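One rough way to put a number on the rate, sketched under assumptions: it treats an exported log as plain text with one line per processed file, which may not match how mw-log actually records these entries. The function name `error_rate` is hypothetical.

```python
# Error text quoted from the API response mentioned in this thread.
NOT_AN_IMAGE = "The given file could not be verified as an image"


def error_rate(log_lines, needle=NOT_AN_IMAGE):
    """Return the fraction of non-empty log lines that contain `needle`.

    Assumption: each non-empty line corresponds to one processed file.
    If the log holds other kinds of entries, the result is only a rough
    lower bound on the real per-file failure rate.
    """
    total = 0
    matches = 0
    for line in log_lines:
        if not line.strip():
            continue  # skip blank lines so they do not dilute the rate
        total += 1
        if needle in line:
            matches += 1
    return matches / total if total else 0.0
```

Comparing this figure across a few sample batches would show whether the failures are spread evenly or concentrated in particular files.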

@mepps Agreed. That type of failure at a high rate suggests the script may not be working as expected, perhaps because of a bug. Hence my question: how do we run this script in a sandbox so that we can see these types of errors in a non-production environment?

@mepps Do we have an API documentation resource for the script? I'm guessing/hoping Petr would be the person to ask?

@eigyan There's some documentation on the PhotoDNA site, but we may need a login for more access.

@drochford got new credentials. I'm curious whether we need to update the access token used on prod. @ARamirez_WMF @Madalina It might be worth bringing this into the sprint, because we never did get to run the script all the way through. The next step would be to ask David to look up the token on the dashboard, and to ask SRE which token is stored in $wgMediaModerationPhotoDNASubscriptionKey in the private config repo. We could also use the credentials to log in and see the PhotoDNA documentation.