
img_metadata missing
Open, Needs Triage · Public

Description

I parsed img_metadata for my GIF check report. I figured these errors would be fixed by newer software and an update script (T32961). Instead, over the years more files have appeared on the list. Currently, 3,211,696 files on Commons (8.89%) are missing metadata.

SELECT CONCAT(img_major_mime,"/",img_minor_mime) AS MIME, img_metadata,
       CONCAT("File:", REPLACE(img_name,"_"," ")) AS Example, COUNT(*) AS COUNT
FROM image
WHERE LENGTH(img_metadata)<9 /*smallest is 9 bytes: {"x";i:0} */
GROUP BY 1, 2 ORDER BY COUNT(*) DESC;
MIME | img_metadata | Example | COUNT(*)
image/jpeg | 0 | File:"A Perspective View of Fort William" by Jan Van Ryne, 1754.jpg | 3,182,426
image/jpeg | -1 | File:!-2013-wschowa-przyczyna-gorna-palac-abri.jpg | 16,140
image/png | 0 | File:"Après le bain" (dessin par Georges A. Gardenty, 1893).png | 8,179
audio/midi | Blank | File:"Bebop-rebop" early bop phrase.mid | 4,817
image/gif | 0 | File:1. FCA Darmstadt.gif | 124
application/pdf | b:0; | File:A imprensa em Goa nos séculos XVI, XVII e XVIII.pdf | 10
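
As a cross-check, the length filter and the counts in the result above can be reproduced in a few lines of Python. This is only a sketch: the rows are copied by hand from the table, the "Blank" value is assumed to be an empty byte string, and `is_missing`, `total_missing` and `share_by_placeholder` are hypothetical helper names, not part of MediaWiki.

```python
# Rows copied from the result table above: (MIME type, img_metadata value, count).
# Assumption: "Blank" in the table means an empty byte string.
ROWS = [
    ("image/jpeg", b"0", 3_182_426),
    ("image/jpeg", b"-1", 16_140),
    ("image/png", b"0", 8_179),
    ("audio/midi", b"", 4_817),
    ("image/gif", b"0", 124),
    ("application/pdf", b"b:0;", 10),
]

# Threshold taken from the WHERE clause: LENGTH(img_metadata) < 9.
MIN_VALID_LENGTH = 9


def is_missing(metadata: bytes) -> bool:
    """Mirror the query's byte-length filter (hypothetical helper)."""
    return len(metadata) < MIN_VALID_LENGTH


def total_missing(rows) -> int:
    """Sum the counts of all rows whose metadata value passes the filter."""
    return sum(count for _mime, metadata, count in rows if is_missing(metadata))


def share_by_placeholder(rows) -> dict:
    """Return each placeholder value's share of all missing-metadata files."""
    total = total_missing(rows)
    counts = {}
    for _mime, metadata, count in rows:
        if is_missing(metadata):
            counts[metadata] = counts.get(metadata, 0) + count
    return {value: count / total for value, count in counts.items()}


if __name__ == "__main__":
    print(total_missing(ROWS))
    for value, share in share_by_placeholder(ROWS).items():
        print(repr(value), f"{share:.2%}")
```

The six counts add up to the 3,211,696 files quoted in the description, and almost all of them carry the placeholder value 0.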

Event Timeline

Restricted Application added a subscriber: Aklapper. · Jan 19 2017, 3:41 PM

Which is used as a value to mean that our metadata extractor couldn't understand the file format.

Dispenser updated the task description. · Jan 19 2017, 10:12 PM
Restricted Application added a project: Commons. · Aug 12 2017, 12:01 AM