
Some uploaded image files show "_error" in image metadata section
Open, Low, Public, BUG REPORT

Description

List of steps to reproduce (step by step, including full links if applicable):

What happens?:
It shows the field "_error" with value 0 (zero). I've noticed this on some other images as well. I'll link any more if I come across them.

Screenshot 2021-07-10 at 22.49.40.png (326×1 px, 61 KB)

What should have happened instead?:
As the metadata doesn't seem to actually contain any field named "_error", I'd expect not to see this.

Bald_mountain.jpg (819×1 px, 385 KB)

Bald mountain.jpg, photo by Hank Rogers, fair use for the purpose of fixing a possible bug.

Event Timeline

Aklapper renamed this task from _error in image metadata to Some image files on Commons show "_error" in image metadata section. Jul 10 2021, 7:56 PM
$:acko\> wget https://upload.wikimedia.org/wikipedia/en/c/c0/Bald_mountain.jpg
$:acko\> exif Bald_mountain.jpg 
Corrupt data
The data provided does not follow the specification.
ExifLoader: The data supplied does not seem to contain EXIF data.
$:acko\> exiftool -v Bald_mountain.jpg 
  ExifToolVersion = 12.26
  FileName = Bald_mountain.jpg
  Directory = .
  FileSize = 394531
  FileModifyDate = 1412682693
  FileAccessDate = 1625946728
  FileInodeChangeDate = 1625946613
  FilePermissions = 33204
  FileType = JPEG
  FileTypeExtension = JPG
  MIMEType = image/jpeg
JPEG APP0 (14 bytes):
  + [BinaryData directory, 9 bytes]
  | JFIFVersion = 1 1
  | ResolutionUnit = 1
  | XResolution = 96
  | YResolution = 96
  | ThumbnailWidth = 0
  | ThumbnailHeight = 0
JPEG DQT (65 bytes):
JPEG DQT (65 bytes):
JPEG SOF0 (15 bytes):
  ImageWidth = 1024
  ImageHeight = 819
  EncodingProcess = 0
  BitsPerSample = 8
  ColorComponents = 3
  YCbCrSubSampling = 1 1
JPEG DHT (29 bytes):
JPEG DHT (73 bytes):
JPEG DHT (26 bytes):
JPEG DHT (52 bytes):
JPEG SOS
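
For anyone without exiftool at hand, the segment walk above can be approximated with a short script. This is only a minimal sketch: the function names `list_jpeg_segments` and `has_exif` are made up for illustration, and it assumes a well-formed baseline JPEG where every marker before SOS is followed by a two-byte big-endian length.

```python
import struct

# Markers with no length field: TEM (0x01), RST0-RST7, SOI, EOI (0xD0-0xD9).
STANDALONE = {0x01} | set(range(0xD0, 0xDA))


def list_jpeg_segments(data):
    """Return (marker, payload) pairs for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        if marker in STANDALONE:
            pos += 2
            continue
        # The length counts its own two bytes but not the marker bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((marker, data[pos + 4:pos + 2 + length]))
        if marker == 0xDA:  # SOS: entropy-coded image data follows
            break
        pos += 2 + length
    return segments


def has_exif(data):
    """True if any APP1 (0xE1) segment starts with the Exif identifier."""
    return any(
        marker == 0xE1 and payload.startswith(b"Exif\x00\x00")
        for marker, payload in list_jpeg_segments(data)
    )
```

Going by the exiftool listing above (APP0, DQT, SOF0, DHT, SOS and no APP1), `has_exif` would be expected to return False for this file, which fits the "does not seem to contain EXIF data" message from exif.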

The _error field was newly introduced as part of T275268: Address "image" table capacity problems by storing pdf/djvu text outside file metadata.

I don't remember how metadata errors were rendered previously, though. If previously it simply said there was no metadata, then this might be considered an improvement as it at least surfaces the issue. Although it'd be better if it did so through a localised message rendered instead of the table.
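
To make the "localised message instead of the table" idea concrete, here is a rough sketch. It is not MediaWiki's code: `render_metadata_section`, the `messages` lookup and the `metadata-error` key are all hypothetical, and treating every underscore-prefixed key as internal is an assumption.

```python
def render_metadata_section(metadata, messages):
    """Return the lines to show in the metadata section of a file page.

    Assumption: keys starting with "_" (such as "_error") are internal
    bookkeeping and should never appear as table rows.
    """
    # Test for the key itself, not its truthiness: the reported value is 0.
    if "_error" in metadata:
        return [messages["metadata-error"]]
    return [
        "%s: %s" % (key, value)
        for key, value in metadata.items()
        if not key.startswith("_")
    ]


# Hypothetical message catalogue for a single language.
MESSAGES_EN = {"metadata-error": "The metadata of this file could not be read."}
```

With this shape, a file whose stored metadata is just `{"_error": 0}` would show one sentence in the reader's language, and ordinary files would still get their usual rows.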

AlexisJazz renamed this task from Some image files on Commons show "_error" in image metadata section to Some uploaded image files show "_error" in image metadata section. Jul 11 2021, 1:50 AM

> The _error field was newly introduced as part of T275268: Address "image" table capacity problems by storing pdf/djvu text outside file metadata.
>
> I don't remember how metadata errors were rendered previously, though. If previously it simply said there was no metadata, then this might be considered an improvement as it at least surfaces the issue. Although it'd be better if it did so through a localised message rendered instead of the table.

How about a hidden maintenance category? I've changed the title btw as Bald mountain.jpg is actually not on Commons (it's on enwiki), but I assume this could happen on any wiki.

It doesn't seem to be a major issue (in the sense that it would block further rollout of the img_metadata refactor), but if you need me to work on it, I can spend some time on it once the big parts are out of the way.

> It doesn't seem to be a major issue (in the sense that it would block further rollout of the img_metadata refactor), but if you need me to work on it, I can spend some time on it once the big parts are out of the way.

That would be nice. I personally think a hidden maintenance category would be the most helpful, so files with bad metadata can be searched/filtered for by those who might be interested in fixing them. The table could be omitted in that case, as the average end user doesn't need to be bothered or confused by possibly corrupt metadata.
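
Roughly what I have in mind for the hidden category, as a sketch only. The category name and both function names are invented for illustration, and nothing here reflects how MediaWiki actually assigns tracking categories.

```python
# Hypothetical category name, not an existing category on any wiki.
BAD_METADATA_CATEGORY = "Files with unreadable metadata"


def tracking_categories(metadata):
    """Return the hidden tracking categories for one file's stored metadata."""
    # Key presence is what matters: the reported "_error" value is 0.
    return [BAD_METADATA_CATEGORY] if "_error" in metadata else []


def flagged_files(files):
    """Map file title to categories, keeping only files that were flagged."""
    flagged = {}
    for title, metadata in files.items():
        categories = tracking_categories(metadata)
        if categories:
            flagged[title] = categories
    return flagged
```

Files without an `_error` key get no category, so the list would only contain files someone might actually want to look at.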

The problem with trying to "fix" bad metadata is that the only fix is usually removing it. A maintenance category would quickly become unusable.

> The problem with trying to "fix" bad metadata is that the only fix is usually removing it. A maintenance category would quickly become unusable.

It might be usable if one wants to search for files with bad metadata by a particular uploader (provided the files are also categorized in an uploader category) or from a particular source that has some fixable issue. It can also indicate an upload error: in the case of Bald mountain.jpg, the uploader probably uploaded the 1024px thumbnail from Flickr. Uploading the original may fix the issue, though in this particular case that's not an option because no source link was provided and the license seems dubious. Otherwise it would be an option.

If those are bad excuses, no error should be shown either. If it can't be fixed there's no point in making end users worry about it.

At any rate it would absolutely be no worse than https://en.wikipedia.org/wiki/Category:Files_with_no_machine-readable_description with 148,361 files.