Summary
Grafana shows an increased rate of 404 errors from Thumbor and a spike in the number of files that MediaModeration could not scan. We should investigate the cause and fix it.
Background
- MediaModeration is attempting to scan files using thumbnails first and then trying the source file if that fails
- We recently made two changes to attempt to get a file that meets the dimension requirements of PhotoDNA: T388115: MediaModeration: Set requested thumbnail width by default to 330px to match new thumbnail pre-defined sizes, and T353442: Update MediaModerationPhotoDNAServiceProvider::getThumbnailForFile to choose a thumbnail that meets minimum height requirements for PhotoDNA API
- However, the spike in the failure rate and 404s came before the changes had been deployed
- The spike in unscannable images came after the deployment of changes.
- At the same time, the update metrics script and the hourly scan of all wikis except Wikimedia Commons were moved to k8s in T385799: Migrate MediaModeration jobs to mw-cron
- These issues don't seem to align with the changes that were made either. However, the increase in the unscannable metric appears to align with the deployment of the scan of all wikis except Wikimedia Commons (the change was deployed and the metric started spiking the next day)
- On further investigation, these issues appear limited to just the scan on Wikimedia Commons.
- Because we scan files shortly after upload, it may help to delay the first scan attempt and/or retry failed images at a later time.
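
The delay-and-retry idea in the last point could look roughly like the sketch below. It is written in Python for illustration only and is not MediaModeration code: `scan_with_retries`, `fetch_thumbnail`, and the delay values are all hypothetical, and a real implementation would more likely re-queue a delayed job than sleep in-process.

```python
import time
from typing import Callable, Optional, Sequence


def scan_with_retries(
    fetch_thumbnail: Callable[[], Optional[bytes]],
    delays: Sequence[float] = (30.0, 120.0, 600.0),  # assumed values, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Wait before each attempt and return the first thumbnail that is found.

    `fetch_thumbnail` is a hypothetical callable that returns the thumbnail
    bytes, or None when the thumbnail is not available yet (for example a 404).
    Returns None if every attempt fails, so the caller can mark the file
    as unscannable or fall back to the source file.
    """
    for delay in delays:
        # Delay even the first attempt, since the scan runs close to upload.
        sleep(delay)
        thumbnail = fetch_thumbnail()
        if thumbnail is not None:
            return thumbnail
    return None
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. If retries turn out to succeed on a later attempt, that would support the theory that the thumbnails are simply not ready at the time of the first scan.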
Screenshots
| Rate of 404s and thumbnail failures | Spike in unscannable images |
|---|---|
Acceptance criteria
- We no longer see a spike in the number of unscannable images
- The rate of 404 responses from Thumbor decreases





