
[Bug] Campaigns do not include .tif images
Closed, ResolvedPublic

Description

We need to add some sort of server side image processing to convert the .tif images into a format that renders in all browsers.

Once that is done, including .tif files is just a matter of adding it to the list of allowed extensions in isa/src/options.js

Example campaign: https://tools.wmflabs.org/isa/campaigns/23

Event Timeline

Astinson renamed this task from [Bug] Campaign not counting the right numer of items to [Bug] Campaign not counting the right number of items, possibly .tif?.
Astinson moved this task from Backlog to Incoming Bugs on the ISA board.

Hi @Astinson, at the moment, we're actually filtering to only use:

  • .png
  • .jpg / .jpeg
  • .svg

Really we need a notice that explains the filtering and shows the count of items excluded or something similar.
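To illustrate the notice idea, here is a minimal Python sketch of the filtering step that also counts what was excluded. It is not the tool's actual code: the function names and the notice wording are made up for this example, and the allowed set simply mirrors the three formats listed above.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumption: mirrors the formats listed above; the tool's real list
# lives in its own options file.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".svg"}


def split_by_extension(filenames, allowed=ALLOWED_EXTENSIONS):
    """Return (included, excluded_counts) for a list of file names."""
    included = []
    excluded = Counter()
    for name in filenames:
        # Compare case-insensitively so ".TIF" and ".tif" are treated alike.
        ext = PurePosixPath(name).suffix.lower()
        if ext in allowed:
            included.append(name)
        else:
            excluded[ext or "(none)"] += 1
    return included, excluded


def exclusion_notice(excluded):
    """Build a short notice, or an empty string if nothing was excluded."""
    if not excluded:
        return ""
    total = sum(excluded.values())
    parts = ", ".join(f"{count} {ext}" for ext, count in sorted(excluded.items()))
    return f"{total} file(s) excluded by file type: {parts}"
```

A campaign page could show the notice next to the image count, so that a category full of .tif files no longer looks like a counting bug.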

We left out tif for version 1 as most browsers do not support displaying them yet. All fixable but needs a conversion step to happen on the server to change it to a .jpg or something.
I'm actually just wondering whether the Commons API has a way of accessing a .jpg preview. If so then it will be a very easy fix. Will investigate further.

Yeah, you should be able to use one of the JPG preview files which can be served from each ".tif" file -- that is what the thumbnails on Wikipedia and other Wikimedia Projects are. @Keegan do you know who would know/where to look?

I know the thumb can be called using https://upload.wikimedia.org/wikipedia/commons/thumb but the rest of the URL will vary by MD5 hash.

@Multichill is there a way to call it directly from the API?
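For reference, the hash-dependent part of the thumb path can be computed on the client side. The sketch below assumes MediaWiki's default hashed upload layout as I understand it (MD5 of the file name with underscores instead of spaces, then the first one and first two hex characters as directory levels). The function name is hypothetical, and the final thumbnail file name (size prefix, and any extra suffix for TIFF sources) is deliberately left out because it is not confirmed in this thread.

```python
import hashlib
from urllib.parse import quote

THUMB_BASE = "https://upload.wikimedia.org/wikipedia/commons/thumb"


def hashed_thumb_dir(filename):
    """Return the thumb directory URL for a file name, without the thumbnail name."""
    # Assumption: the stored name uses underscores instead of spaces.
    name = filename.replace(" ", "_")
    digest = hashlib.md5(name.encode("utf-8")).hexdigest()
    # Assumption: directory levels are the first 1 and first 2 hex digits.
    return f"{THUMB_BASE}/{digest[0]}/{digest[:2]}/{quote(name)}"
```

This only shows why the URL varies per file; it should be checked against a few known thumbnail URLs before relying on it.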

No, but you can use thumb.php, see https://commons.wikimedia.org/w/thumb.php?f=Nordkirchen,_Naturschutzgebiet_Ichterloh_--_2018_--_2131-7.jpg&width=1000
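A small sketch of building such a thumb.php URL from a file name and width, using only the two parameters visible in the example URL above (`f` and `width`). The helper name is hypothetical, and the code only builds the URL; it makes no claim about which file types the endpoint will serve.

```python
from urllib.parse import quote, urlencode

THUMB_PHP = "https://commons.wikimedia.org/w/thumb.php"


def thumb_php_url(filename, width):
    """Build a thumb.php URL for the given file name and pixel width."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode({"f": filename, "width": width}, quote_via=quote)
    return f"{THUMB_PHP}?{query}"
```

For example, `thumb_php_url("Test.tif", 500)` yields `https://commons.wikimedia.org/w/thumb.php?f=Test.tif&width=500`.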

Does this example of using the API work as well?

https://commons.wikimedia.org/wiki/Commons:API/MediaWiki#Retrieve_files_given_a_pair_of_coordinates_(latitude,_longitude)

They appear to be generating the thumbnail directly from the Commons API. @NavinoEvans
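As a rough sketch of that approach: the MediaWiki Action API's `prop=imageinfo` query accepts `iiprop=url` together with `iiurlwidth`, and then reports a `thumburl` for each file. The helper names below are hypothetical, and the response handling assumes the default JSON layout (`query.pages.<id>.imageinfo[0].thumburl`). Whether the returned thumbnail for a .tif file is a browser-friendly format would still need to be checked.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def imageinfo_url(title, width):
    """Build an imageinfo query asking for a thumbnail of the given width."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "url",
        "iiurlwidth": width,
        "titles": title,
    }
    return f"{API}?{urlencode(params)}"


def thumb_urls(response):
    """Map page title to thumburl from a decoded imageinfo response."""
    pages = response.get("query", {}).get("pages", {})
    result = {}
    for page in pages.values():
        # Pages without imageinfo (e.g. missing files) are skipped.
        for info in page.get("imageinfo", []):
            if "thumburl" in info:
                result[page.get("title")] = info["thumburl"]
    return result
```

The appeal of this route is that it uses the same kind of API call as the other image lookups, so the .tif case would not need a separate code path.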

@Multichill Many thanks, that's really useful! It would be the perfect solution, but unfortunately it always seems to return a 404 error when I try it with a TIFF file.
e.g. https://commons.wikimedia.org/w/thumb.php?f=Auditorium%20(Paiporta).TIF&width=500
I've tried about 15 files from this category, all with the same result, so it looks like a limitation of that thumbnail API?

@Sadads well spotted, I'll check out whether we get the same field with the API calls we're using. If I can get the direct thumbnail url from title method working it would be ideal though, as it would then match how we get other images.

Thanks for looking into this!

Here's a handy reference table on TIFF image support among various browsers: https://en.wikipedia.org/wiki/Comparison_of_web_browsers#Image_format_support

Many thanks @todrobbins! That's very handy indeed.

NavinoEvans raised the priority of this task from High to Needs Triage.
NavinoEvans renamed this task from [Bug] Campaign not counting the right number of items, possibly .tif? to [Bug] Campaign does not include .tif images?. Feb 29 2020, 2:03 PM
NavinoEvans renamed this task from [Bug] Campaign does not include .tif images? to [Bug] Campaigns do not include .tif images.
NavinoEvans updated the task description. (Show Details)
NavinoEvans moved this task from Incoming Bugs to Backlog on the ISA board.
NavinoEvans moved this task from Backlog to Incoming Bugs on the ISA board.

Hi, sorry for jumping in.

If I read correctly, the tool unfortunately cannot show the TIF images, right? Do we have any workaround at the moment? I tried to start this campaign, but it says "No images found for this campaign!". :)

Thank you.

BeatEstermann raised the priority of this task from Low to High. Feb 19 2022, 7:38 AM
BeatEstermann subscribed.

I am about to put a team of students on testing the ISA Tool in connection with machine vision. Our main institutional partner for this has so far been the ETH Library, which has been friendly enough to provide us with high resolution TIFF images. They are also using entity recognition internally, so they would probably be a good partner to run some benchmarking tests. Unfortunately, we won't be able to run tests on their collections on Commons until this bug is fixed.

Has anybody thought of filing a feature request for the Commons API? There should be a way to fetch one of the JPEG previews, instead of the original TIFF, directly through the API. Why oblige everyone to transform the files on their side if they have already been transformed on the Commons side? It also makes much more sense to offer lower-resolution images when the highest resolution is not required.

Cheers,
Beat

Hi all,
Just to let you know this issue is solved :) but I'll need to finish off a patch today. So hopefully a few days to deployment after review is complete.
Many thanks to @Astinson for the suggestions! There were actually a couple of viable methods there, but the easiest one to implement for now is to use the following URL syntax:

https://api.wikimedia.org/core/v1/commons/file/Test.tif

That gives us URLs for JPG or PNG files of different sizes, just what we need.
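A minimal sketch of how a client might use that endpoint, not the code from the patch. The helper names are hypothetical, and the field names `preferred` and `thumbnail` (each with a `url`) are my recollection of the file object this API returns, so they should be checked against a real response.

```python
from urllib.parse import quote

REST_BASE = "https://api.wikimedia.org/core/v1/commons/file"


def file_endpoint(filename):
    """Build the file endpoint URL for a Commons file name."""
    # Assumption: spaces are written as underscores, the rest is percent-encoded.
    return f"{REST_BASE}/{quote(filename.replace(' ', '_'), safe='')}"


def pick_preview_url(file_object, keys=("preferred", "thumbnail")):
    """Return the first available preview URL from a decoded response, or None."""
    for key in keys:
        entry = file_object.get(key) or {}
        if entry.get("url"):
            return entry["url"]
    return None
```

Trying the larger preview first and falling back to the thumbnail keeps the caller simple: it gets one displayable URL or `None`, regardless of the original file type.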

Change 812849 had a related patch set uploaded (by NavinoEvans; author: NavinoEvans):

[labs/tools/Isa@master] Add TIFF files to allowed file extensions

https://gerrit.wikimedia.org/r/812849

Change 812849 merged by jenkins-bot:

[labs/tools/Isa@master] Add TIFF files to allowed file extensions

https://gerrit.wikimedia.org/r/812849

Change 937085 had a related patch set uploaded (by Sebastian Berlin (WMSE); author: NavinoEvans):

[labs/tools/Isa@m2c-rollback] Add TIFF files to allowed file extensions

https://gerrit.wikimedia.org/r/937085

Change 937085 merged by jenkins-bot:

[labs/tools/Isa@m2c-rollback] Add TIFF files to allowed file extensions

https://gerrit.wikimedia.org/r/937085