
TIFF images acquiring an extra page
Closed, ResolvedPublic

Description

Broken out from T87236, where the main issue was merged.
Upon uploading some single-page TIFF files, which display correctly in both GIMP and ImageMagick (display) on my Ubuntu machine, the resulting file on Commons is interpreted as a two-page TIFF.

Per @Bawolff in the original task: "Maybe a bug in tiffinfo. It's probably a "thumbnail" image being picked up as an extra page."

Example: File:Målning. Walther von Hallwyl av Nils Asplund - Hallwylska museet - 81673.tif

Event Timeline

Lokal_Profil raised the priority of this task from to Needs Triage.
Lokal_Profil updated the task description. (Show Details)
Lokal_Profil added a project: Multimedia.
Lokal_Profil added subscribers: Lokal_Profil, Bawolff.

tiffinfo reports two: TIFF Directory at offset . . . . .
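For anyone who wants to check a file the same way, here is a minimal sketch that counts the directory headers in tiffinfo output. The sample text is made up for illustration (it is not the output for the example file), and the exact header format shown in the comment is an assumption about how tiffinfo prints each directory.

```python
import re

# Header line assumed to be printed once per TIFF directory (IFD),
# e.g. "TIFF Directory at offset 0x8 (8)".
DIRECTORY_RE = re.compile(r"^TIFF Directory at offset\b", re.MULTILINE)


def count_tiff_directories(tiffinfo_output: str) -> int:
    """Return how many directory headers appear in the given tiffinfo text."""
    return len(DIRECTORY_RE.findall(tiffinfo_output))


# Hypothetical sample: a full-size image plus a reduced-resolution thumbnail.
SAMPLE = """TIFF Directory at offset 0x8 (8)
  Image Width: 100 Image Length: 100
TIFF Directory at offset 0x2a0 (672)
  Subfile Type: reduced-resolution image (1 = 0x1)
  Image Width: 10 Image Length: 10
"""

print(count_tiff_directories(SAMPLE))
```

A count above one does not by itself mean the file has several real pages; as suggested above, one of the directories may just be a thumbnail.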

Of note is that the file also seems to have a RichTIFFIPTC, which caused problems with: T87042

Lokal_Profil added a subscriber: Jdforrester-WMF.

@Jdforrester-WMF:
How can this not go under project Multimedia (specifically Storage and description)?

Also, removing that made this valid bug projectless. If not Multimedia, then what is the proper project to use for multimedia-related bugs in MediaWiki?

@Jdforrester-WMF:
How can this not go under project Multimedia (specifically Storage and description)?

The scope of the Multimedia team is described on its project page. This is outside of the scope of that team.

Also, removing that made this valid bug projectless.

It's still under Commons. It had two team projects, now it has one.

If not Multimedia, then what is the proper project to use for multimedia-related bugs in MediaWiki?

The corresponding code project is the best fit. Adding MediaWiki-extensions-PagedTiffHandler here as I cannot find a better fit...

The scope of the Multimedia team is described on its project page. This is outside of the scope of that team.

[I'm aware this is probably not the right place to discuss this]

Perhaps the multimedia team/project should rename itself to better describe what they work on. It will only confuse users if the multimedia team takes responsibility for only a subset of the multimedia-related issues in Wikimedia/MediaWiki. [To be clear, I don't have a problem with them only doing a subset. Nobody can do everything, and multimedia is a small team. But calling the project multimedia, when it's not covering multimedia, is confusing.]

The corresponding code project is the best fit. Adding MediaWiki-extensions-PagedTiffHandler here as I cannot find a better fit...

It is indeed a bug in the PagedTiffHandler extension. According to https://phabricator.wikimedia.org/project/profile/48/, that makes it #Reading-Infrastructure-Team's problem. That seems rather random and illogical when, according to the same page, Multimedia is responsible for the exact same type of code when it's in core (MediaWiki-DjVu for DjVu files; MediaWiki-File-management is a catchall for the image thumbnailing code of anything that's not an extension).

The truth is probably that nobody at WMF wants to be responsible for it. Which is fine. But it's confusing that Multimedia means "things the WMF multimedia team wants to work on" and not "things wrong with multimedia", especially given the UI that projects are presented with.

It's still under Commons. It had two team projects, now it has one.

Well, that happened because I added Commons to it while leaving a comment.

Perhaps renaming Multimedia to #Multimedia-team would prevent such issues in the future? Multimedia could then possibly serve as a triage area where teams can claim (or not claim) bugs. That would mean non-WMF users don't have to venture into the jungle described in @Bawolff's comment.

Micha raised the priority of this task from Low to Medium. Nov 15 2015, 8:36 AM

This is important for GLAM. If it is just "low" it will never be solved.

Then just fix it! Every GLAM that we work with through Wiki-GLAM cooperations is reporting that issue! It is not a good advertisement for WMF when you keep the people responsible at the GLAMs waiting.

See the other problem we have with TIFF thumbnailing: https://phabricator.wikimedia.org/T118679
Maybe it is caused by the same bug.

See the other problem we have with TIFF thumbnailing: https://phabricator.wikimedia.org/T118679
Maybe it is caused by the same bug.

They have different causes.

Then just fix it! Every GLAM that we work with through Wiki-GLAM cooperations is reporting that issue! It is not a good advertisement for WMF when you keep the people responsible at the GLAMs waiting.

If you think WMF should spend more resources on issues that affect GLAMs, this is not the place to argue that. (Don't get me wrong, I'm sympathetic; it's just that this is the wrong place, and the powers that be aren't going to see the arguments if they're only on some random bug.) Try Wikimedia-l.

While messing with priorities will not get the bug fixed faster, statements like "Every GLAM that we work with through Wiki-GLAM cooperations is reporting that issue" might actually help a little bit, because previously I assumed this was a cosmetic issue that needed fixing but didn't really bother anyone, not the type of issue that people actually complain about.

Change 253280 had a related patch set uploaded (by Brian Wolff):
Convert hex value to decimal. Don't just cast to int.

https://gerrit.wikimedia.org/r/253280

Change 253280 merged by jenkins-bot:
Convert hex value to decimal. Don't just cast to int.

https://gerrit.wikimedia.org/r/253280
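The commit title points at a hex string being cast straight to an integer. The sketch below is not the patch's code and makes no claim about where in PagedTiffHandler the value is used. It only illustrates the general pitfall, using two hypothetical helpers: `leading_int_cast` mimics a loose cast that keeps the leading decimal digits of a string, while `hex_to_decimal` parses the value as base 16. The offsets are made-up examples.

```python
import re


def leading_int_cast(text: str) -> int:
    """Mimic a loose integer cast: keep only the leading decimal digits."""
    match = re.match(r"\s*[+-]?\d+", text)
    return int(match.group()) if match else 0


def hex_to_decimal(text: str) -> int:
    """Parse a hexadecimal string (with or without a 0x prefix) as base 16."""
    return int(text, 16)


# Hypothetical hex offsets of the kind a TIFF tool might report.
offsets = ["0x8", "0x2a0"]

# The loose cast stops at the "x", so both distinct offsets collapse to 0.
print([leading_int_cast(o) for o in offsets])

# Proper base-16 parsing keeps them distinct.
print([hex_to_decimal(o) for o in offsets])
```

With the loose cast, different hex values become indistinguishable, which is the kind of error the commit title describes avoiding.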

Will existing uploads also be affected by this patch? If not, would it be possible to do a re-render of any multi-page TIFFs (I would expect most, but not all, to actually be single-page)?

Bawolff claimed this task.

The fix will go live on Commons on Wednesday.

Will existing uploads also be affected by this patch? If not, would it be possible to do a re-render of any multi-page TIFFs (I would expect most, but not all, to actually be single-page)?

Existing files probably won't be fixed until their metadata is flushed. This generally only happens if someone does ?action=purge to the image description page.

There are currently 35,067 TIFF files on Commons with more than 1 page (~7% of all TIFF files). That's quite a few, but I guess still within the range where it's reasonable to just have a script to purge them all.

Existing files probably won't be fixed until their metadata is flushed. This generally only happens if someone does ?action=purge to the image description page.

There are currently 35,067 TIFF files on Commons with more than 1 page (~7% of all TIFF files). That's quite a few, but I guess still within the range where it's reasonable to just have a script to purge them all.

If possible, I think such a script could be useful. I would expect that the majority of the multi-page files are actually single-page. I know that I'm probably responsible for 15,000-25,000 of them, which are all single-page.
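A rough sketch of what the batching part of such a purge script might look like follows. It only builds the parameters for MediaWiki API `action=purge` requests and sends nothing. The batch size of 50, the helper name `purge_requests`, and the file titles are assumptions for illustration. Whether an API purge refreshes the stored file metadata the same way as visiting ?action=purge is also an assumption that would need checking.

```python
from typing import Dict, Iterable, Iterator, List

API_URL = "https://commons.wikimedia.org/w/api.php"
BATCH_SIZE = 50  # assumed per-request title limit


def purge_requests(
    titles: Iterable[str], batch_size: int = BATCH_SIZE
) -> Iterator[Dict[str, str]]:
    """Yield one set of POST parameters per batch of page titles."""
    batch: List[str] = []
    for title in titles:
        batch.append(title)
        if len(batch) == batch_size:
            yield {"action": "purge", "format": "json", "titles": "|".join(batch)}
            batch = []
    if batch:  # leftover titles that did not fill a whole batch
        yield {"action": "purge", "format": "json", "titles": "|".join(batch)}


# Hypothetical titles; a real run would first need the list of multi-page TIFFs.
example_titles = ["File:A.tif", "File:B.tif", "File:C.tif"]
for params in purge_requests(example_titles, batch_size=2):
    print(API_URL, params)
```

Each parameter set would then be sent as a POST request to `API_URL`, ideally throttled so that re-rendering that many files does not arrive all at once.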

Micha raised the priority of this task from Low to Needs Triage.

I checked the file. It has 3 pages. MediaWiki reports it has 3 pages. I'm not sure I see the issue.

If you're complaining about the thumbnails not showing up, that would be T117349. This bug is only about having the wrong number of pages show up. (Although I admit, I was hoping that recent changes would have fixed that for this file. Seems like that didn't happen)

I checked the file. It has 3 pages. MediaWiki reports it has 3 pages. I'm not sure I see the issue.

If you're complaining about the thumbnails not showing up, that would be T117349. This bug is only about having the wrong number of pages show up. (Although I admit, I was hoping that recent changes would have fixed that for this file. Seems like that didn't happen)

Note that ?action=purge might cause the image to be re-rendered from scratch, so for now I recommend not accessing that image with ?action=purge at the end.