File displaying invalid metadata in Commons, when the original Exif seems fine.
Closed, Resolved; Public

Description

The file https://commons.wikimedia.org/wiki/File:Jena_-_Hummelsberg_05.jpg has invalid values in metadata such as "ennickental" for Camera Manufacturer. In the original file it's "KONICA MINOLTA". Purging the file doesn't help.

Similar to T97253, but that's already resolved.

Event Timeline

Ghouston created this task. Sep 18 2016, 4:48 AM
Restricted Application added subscribers: Poyekhali, Aklapper. Sep 18 2016, 4:48 AM

Hmm, maybe that fix isn't on Commons yet: T140419

Ghouston reopened this task as Open. Oct 7 2016, 9:32 AM

I've reopened this, since T140419 has supposedly been fixed since August. I've tried purging the image description page, which according to https://www.mediawiki.org/wiki/Manual:File_metadata_handling is supposed to make MediaWiki read the metadata again, but the metadata is still corrupt.

I'm pretty sure that documentation page is wrong. Cached metadata is only purged when uploading a new version of a file, or when running the refreshImageMetadata.php maintenance script (T32961). I'll try to get someone to purge it manually with an SQL query, to confirm this is fixed before we proceed with T32961.

For completeness: cached metadata is also purged when it is detected to be invalid, which practically never happens (the exact rules are somewhat murky and depend on file type). We're actually going to try to set the metadata to an invalid value, which MediaWiki will then purge.

To manually purge the cached bad metadata for this file:

update image set img_metadata='' where img_name = "Jena_-_Hummelsberg_05.jpg";
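
The effect of that query can be illustrated with a toy model. This is only a sketch, not MediaWiki's code: it uses an in-memory SQLite table in place of the Commons MariaDB database, treats an empty string as the only "invalid" value, and the helpers `extract_metadata`, `get_metadata`, `purge_metadata` and `make_demo_db` are hypothetical names made up for the example. Binding the file name as a query parameter also avoids having to double apostrophes by hand in names that contain them.

```python
import sqlite3

NAME = "Jena_-_Hummelsberg_05.jpg"


def extract_metadata(name):
    # Hypothetical stand-in for re-reading Exif from the original file.
    return "Make=KONICA MINOLTA"


def get_metadata(conn, name):
    """Return cached metadata, regenerating it if the cached value is empty."""
    row = conn.execute(
        "SELECT img_metadata FROM image WHERE img_name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    cached = row[0]
    if cached == "":  # assumption: empty string counts as invalid
        cached = extract_metadata(name)
        conn.execute(
            "UPDATE image SET img_metadata = ? WHERE img_name = ?",
            (cached, name),
        )
    return cached


def purge_metadata(conn, name):
    """Blank the cached metadata; returns the number of rows changed."""
    cur = conn.execute(
        "UPDATE image SET img_metadata = '' WHERE img_name = ?", (name,)
    )
    return cur.rowcount


def make_demo_db():
    """Build a one-row table holding a corrupt cached value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image (img_name TEXT PRIMARY KEY, img_metadata TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO image VALUES (?, ?)", (NAME, "Make=ennickental"))
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    print(get_metadata(db, NAME))    # corrupt cached value is served as-is
    print(purge_metadata(db, NAME))  # rows changed by the purge
    print(get_metadata(db, NAME))    # regenerated on the next read
```

In this model, as in the thread, a plain read never refreshes a non-empty cached value; only blanking it forces regeneration on the next view.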
Restricted Application added a project: Multimedia. Oct 7 2016, 4:42 PM

Mentioned in SAL (#wikimedia-operations) [2016-10-07T16:53:31Z] <jynus> testing img_metadata nuking for T145953 and T147015 (backups on neodymium)

Restricted Application added a subscriber: Matanya. Oct 7 2016, 4:53 PM
MariaDB MARIADB s4-master commonswiki > UPDATE image SET img_metadata='' WHERE img_name = 'Jena_-_Hummelsberg_05.jpg';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB MARIADB s4-master commonswiki > UPDATE image SET img_metadata='' WHERE img_name = '20160927_St_George''s_Church_(The_Winery)_Mohegan_Lake_2.jpg';
Query OK, 1 row affected (0.01 sec)
Rows matched: 1  Changed: 1  Warnings: 0
matmarex closed this task as Resolved. Oct 7 2016, 5:08 PM
matmarex claimed this task.

The metadata got re-generated as expected (and with correct data) after @jcrespo purged it and I viewed the file pages.

Bawolff added a subscriber: Bawolff. Oct 7 2016, 5:34 PM

For reference, purging stopped updating image metadata as of 9120ee007ae32.

Thanks for fixing that file, but there are many others in https://commons.wikimedia.org/wiki/Category:Invalid_equipment_in_Exif (some are genuinely invalid, others are probably caused by this bug), and more are found all the time.

@Ghouston This is known. I am not directly involved with this, but I believe the plan is now to run a task to do the same on all invalid ones; this was a proof of concept and test of the fix. I cannot remember which task that will be coordinated on.

Yes, the task for that is T32961: Run refreshImageMetadata.php --force. I'm working to make it happen, see that task for details.