MediaModeration: Investigate error increase for image transformation
Closed, Resolved, Public

Description

Summary

For two days in a row, the MediaModeration PhotoDNA dashboard has shown an increase in errors and warnings related to image transformation (x4) and lookups (x3), although today (Friday 31st) the numbers look lower.

Background

image.png (835×1 px, 145 KB)

Technical details

It appears that the success rate for getting thumbnails from the Thumbor service has decreased: about 4% of images now fail to get a thumbnail, up from 1%. This also explains the increase in source files being too large, because when we don't have a thumbnail we try to use the source file instead.
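
The fallback described above can be sketched roughly as follows. This is only an illustration of the reasoning, not the extension's actual code: the names `Candidate`, `choose_image`, `SourceTooLargeError` and the `MAX_BYTES` value are all hypothetical, and the real size limit is not stated in this task.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical size limit; the real limit is not given in this task.
MAX_BYTES = 4_000_000


class SourceTooLargeError(ValueError):
    """Raised when only the source file is available and it exceeds the limit."""


@dataclass
class Candidate:
    kind: str        # "thumbnail" or "source"
    size_bytes: int


def choose_image(thumbnail: Optional[Candidate], source: Candidate,
                 max_bytes: int = MAX_BYTES) -> Candidate:
    """Prefer the thumbnail; fall back to the source file if none was returned."""
    if thumbnail is not None:
        return thumbnail
    # No thumbnail: the source file is the only option, and it may be too large.
    if source.size_bytes > max_bytes:
        raise SourceTooLargeError(f"source file is {source.size_bytes} bytes")
    return source
```

Under this model, every extra thumbnail failure pushes one more image onto the source-file path, so a rise in thumbnail failures would be expected to show up as a rise in "source file too large" errors as well.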

Event Timeline

hector.arroyo renamed this task from {component}: {use imperative mood to describe desired outcome} to MediaModeration: Investigate increase in errors for iage transformation. Jan 31 2025, 1:01 PM
hector.arroyo renamed this task from MediaModeration: Investigate increase in errors for iage transformation to MediaModeration: Investigate error increase for image transformation.

Errors from Thumbor are stable:

image.png (899×1 px, 364 KB)

image.png (899×1 px, 238 KB)

Change #1115899 had a related patch set uploaded (by Hnowlan; author: Hnowlan):

[operations/deployment-charts@master] jobqueue: bump ThumbnailRender concurrency

https://gerrit.wikimedia.org/r/1115899

Work for the TSP team that we can do towards this task:

  1. Log the HTTP response codes from PhotoDNA - T385448
  2. Increase the request timeout for requests to Thumbor - T385450
  3. Re-attempt images that we failed to scan 1 day after their upload, as thumbnails are likely to have been generated by then - T385478
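
As a rough sketch of item 3, the retry eligibility check could look something like this. The function name `should_retry`, its parameters, and the `RETRY_DELAY` constant are hypothetical and only illustrate the "one day after upload" rule; how failed scans are actually tracked is not described in this task.

```python
from datetime import datetime, timedelta, timezone

# Assumed delay, taken from the "1 day after their upload" wording above.
RETRY_DELAY = timedelta(days=1)


def should_retry(uploaded_at: datetime, scan_failed: bool, now: datetime,
                 delay: timedelta = RETRY_DELAY) -> bool:
    """Return True if a failed scan is old enough to be re-attempted."""
    if not scan_failed:
        return False
    # Only retry once the upload is at least `delay` old, by which time
    # a thumbnail is likely to exist.
    return now - uploaded_at >= delay


uploaded = datetime(2025, 1, 30, 12, 0, tzinfo=timezone.utc)
print(should_retry(uploaded, True, uploaded + timedelta(hours=23)))
print(should_retry(uploaded, True, uploaded + timedelta(days=1)))
```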

Change #1115899 merged by jenkins-bot:

[operations/deployment-charts@master] jobqueue: bump ThumbnailRender concurrency

https://gerrit.wikimedia.org/r/1115899

The error rate has dropped back to its normal levels, so I think we can close this. T385478: MediaModeration: Re-attempt scans of images scanned close to upload one day later should help increase the overall success rate further, but that can be done later.