
EXIF location data of image file not imported to MediaWiki File information (Upload with UploadWizard)
Open, Needs Triage, Public

Description

The EXIF location of some files uploaded with the UploadWizard was not added to the MediaWiki file metadata (or to the information template).

This affects only some of the files in the same UploadWizard upload queue.

Purging the page does not help.

Example files:
Not added: https://commons.wikimedia.org/wiki/File:Eckerlochstieg_19.jpg
Added correctly: https://commons.wikimedia.org/wiki/File:Eckerlochstieg_20.jpg

Not added: https://commons.wikimedia.org/wiki/File:Bodebruch_in_April_2019_17.jpg
Added correctly: https://commons.wikimedia.org/wiki/File:Bodebruch_in_April_2019_16.jpg

Event Timeline

Confirming. Both https://commons.wikimedia.org/wiki/File:Eckerlochstieg_19.jpg and https://commons.wikimedia.org/wiki/File:Eckerlochstieg_20.jpg have the same values. Using ImageMagick-6.9.10.28-1:

$:acko\> identify -verbose ~/Eckerlochstieg_19.jpg | grep "exif:GPS"
    exif:GPSAltitude: 45981/50
    exif:GPSAltitudeRef: 0
    exif:GPSDateStamp: 2019:04:02
    exif:GPSInfo: 29522
    exif:GPSLatitude: 51/1, 47/1, 10803/500
    exif:GPSLatitudeRef: N
    exif:GPSLongitude: 10/1, 37/1, 5091/500
    exif:GPSLongitudeRef: E
    exif:GPSVersionID: 2, 3, 0, 0
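
For reference, the rational values above can be converted to decimal degrees with a few lines of Python. This is only a sketch for reading the output; the function names are made up for illustration and this is not how UploadWizard or MediaWiki does the conversion.

```python
from fractions import Fraction


def parse_rationals(text):
    """Parse a string such as '51/1, 47/1, 10803/500' into exact fractions."""
    return [Fraction(part.strip()) for part in text.split(",")]


def dms_to_decimal(text, ref):
    """Convert EXIF degrees/minutes/seconds rationals to signed decimal degrees.

    `ref` is the matching GPSLatitudeRef / GPSLongitudeRef value (N, S, E or W).
    """
    degrees, minutes, seconds = parse_rationals(text)
    value = degrees + minutes / 60 + seconds / 3600
    if ref in ("S", "W"):
        value = -value  # southern and western hemispheres are negative
    return float(value)


# Values copied from the identify output for Eckerlochstieg_19.jpg
latitude = dms_to_decimal("51/1, 47/1, 10803/500", "N")
longitude = dms_to_decimal("10/1, 37/1, 5091/500", "E")
print(round(latitude, 6), round(longitude, 6))
```

So the file does carry a usable position, and the values look well-formed.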

Small update: in the first report only some files were affected. Now this is happening to all of my files (only those that have GPS data in the EXIF, of course). Example

I was surprised to see that coordinates appear to be missing from all my uploads nowadays. Was this done intentionally? I see there was some talk about allowing coordinate data to be purged at T218057: Determine workflow to selectively purge potentially privacy-sensitive EXIF fields, such as geocoordinates, from a Wikimedia Commons file etc.

That's nice, but removing all such data without warning is quite a disaster: it means I cannot use UploadWizard any longer for most of my uploads and, worse, I would have to redo hundreds of my recent uploads to recover the coordinates. Had I known before, I would have used Vicuña...

Correction: the GPS data is not actually stripped from EXIF (although the metadata panel doesn't show it either); it's just not imported into the wikitext, as this bug report says.

For instance https://commons.wikimedia.org/wiki/File:2018-05-11_Joensuu_station_4.jpg has, once the original is downloaded:

$ exiftool 2018-05-11_Joensuu_station_4.jpg | grep GPS
GPS Version ID                  : 2.3.0.0
GPS Latitude Ref                : North
GPS Longitude Ref               : East
GPS Time Stamp                  : 19:21:06
GPS Status                      : Measurement Active
GPS Measure Mode                : 2-Dimensional Measurement
GPS Dilution Of Precision       : 0
GPS Map Datum                   : WGS-84
GPS Processing Method           : GPS
GPS Area Information            :
GPS Date Stamp                  : 2018:05:16
GPS Date/Time                   : 2018:05:16 19:21:06Z
GPS Latitude                    : 62 deg 35' 58.93" N
GPS Longitude                   : 29 deg 46' 30.66" E
GPS Position                    : 62 deg 35' 58.93" N, 29 deg 46' 30.66" E

A detailed description in German can be found here: https://commons.wikimedia.org/wiki/Commons_talk:Hochladen#GPS-Daten_werden_nicht_mehr_automatisch_%C3%BCbernommen? . Until some time ago, this also worked without problems for EXIF data from the Panasonic DMC-TZ61 (Version ID: 2.3.0.0).

For me this is a very important function.

  • does not work: EXIF data - Panasonic - DMC-TZ61 Version ID: 2.3.0.0
  • works: EXIF data - Pixel 4 XL - Version ID: 2.2.0.0

@dschwen usually your bot picks up missing coordinates ( https://commons.wikimedia.org/w/index.php?title=Special:Contributions/DschwenBot&dir=prev&offset=20200806083129&target=DschwenBot ). Any idea why it didn't do it for these example files? I assume it is because you look at the exif info from the database and not from the raw file?

Ugh, right, but it only downloads promising candidates, i.e. files which have some GPS stuff in the metadata in the database https://github.com/Commonists/gpsexifbot/blob/master/gps_exif_bot2.py#L59-L63
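
To illustrate why such files get skipped, here is a rough sketch of that kind of pre-filter. It is not the bot's actual code. It assumes the `imageinfo` entry comes from the MediaWiki API (`action=query&prop=imageinfo&iiprop=metadata`) and that each metadata item is a dict with `name` and `value` keys; the function name and the sample data are invented for the example.

```python
def has_gps_metadata(imageinfo):
    """Return True if the stored metadata contains any GPS* field.

    `imageinfo` is assumed to be one parsed entry of the API's "imageinfo" list.
    """
    metadata = imageinfo.get("metadata") or []  # the field may be missing or null
    return any(str(item.get("name", "")).startswith("GPS") for item in metadata)


# Hypothetical sample data: one file whose GPS fields reached the database,
# and one where they did not (as with the files in this report).
with_gps = {"metadata": [{"name": "Model", "value": "Pixel 4 XL"},
                         {"name": "GPSLatitude", "value": 51.789335}]}
without_gps = {"metadata": [{"name": "Model", "value": "DMC-TZ61"}]}

candidates = [info for info in (with_gps, without_gps) if has_gps_metadata(info)]
print(len(candidates))
```

With a filter like this, a file whose GPS fields never made it into the database is never downloaded, so the data in the raw file is never seen.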

Why do I even bother downloading the image and analyzing it with exiv2, you might ask? If I had a time machine to go back five years, I could ask myself. I think the data in the DB was not usable/complete when the bot was developed.

I checked the specs at http://www.cipa.jp/std/documents/e/DC-008-2012_E.pdf and it looks like the EXIF data is at the start of the file. I wonder if you can speed up your bot by downloading just the first part of the file instead of the whole file.
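
A minimal sketch of that idea, assuming the file server honours HTTP `Range` requests and that the EXIF APP1 segment sits within the first 64 KiB or so. The helper names and the default prefix size are my own choices for illustration, and the segment walk is deliberately simplified (it does not handle fill bytes or EXIF data beyond the prefix).

```python
import struct
import urllib.request


def build_range_request(url, num_bytes=65536):
    """Build a request for only the first `num_bytes` bytes of the file."""
    return urllib.request.Request(url, headers={"Range": f"bytes=0-{num_bytes - 1}"})


def find_exif_segment(prefix):
    """Return the TIFF data of the EXIF APP1 segment in a JPEG prefix, or None."""
    if prefix[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(prefix):
        if prefix[pos] != 0xFF:
            return None  # not positioned on a segment marker
        marker = prefix[pos + 1]
        if marker == 0xDA:  # start of scan: image data follows, no more metadata
            return None
        (length,) = struct.unpack(">H", prefix[pos + 2:pos + 4])
        end = pos + 2 + length  # the length field counts its own two bytes
        payload = prefix[pos + 4:end]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            # Give up if the segment was cut off by the size of the prefix
            return payload[6:] if end <= len(prefix) else None
        pos = end
    return None


# Synthetic JPEG prefix: SOI, a tiny APP0 segment, then an APP1/EXIF segment
sample = (b"\xff\xd8"
          + b"\xff\xe0" + struct.pack(">H", 4) + b"JF"
          + b"\xff\xe1" + struct.pack(">H", 12) + b"Exif\x00\x00" + b"II*\x00"
          + b"\xff\xda")
print(find_exif_segment(sample))
```

The request would be sent with `urllib.request.urlopen(build_range_request(url)).read()` and the result passed to `find_exif_segment`; the returned bytes could then go to whatever EXIF parser the bot already uses.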

Hello dschwen and Multichill, thank you for dealing with this topic. I recently bought a NIKON CORPORATION COOLPIX P1000. It writes version ID 2.3.0.0, just like my Panasonic DMC-TZ61. To my surprise, the UploadWizard processes the images from the NIKON CORPORATION COOLPIX P1000 without any problems, see:

https://commons.wikimedia.org/wiki/File:20200806_xl_0033-solarpark-an-der-b1-bei-Ruedersdorf-ortsteil-tasdorf.jpg

but it still does not work for pictures from the Panasonic DMC-TZ61:

https://commons.wikimedia.org/wiki/File:20170910_xl_P1150222-windkraftanlagen--wka--im-windpark--werder-zinndorf--rehfelde-amt-maerkische-schweiz.jpg

At first I thought the problem was solved, but unfortunately it is not. I would be very happy if there were a solution soon.

Warmest regards, Molgreen