
mw.Title.newFromImg does not handle fullsize images with "-thumbnail" in the file name
Closed, Resolved (Public)

Description

Steps to reproduce:

Expected: File:Hovercards-thumbnail.png
Actual: File:0a

Breaks MediaViewer.

Event Timeline

newFromImg uses the first matching one from four regexes:

				// Thumbnails
				/\/[a-f0-9]\/[a-f0-9]{2}\/([^\s\/]+)\/[^\s\/]+-[^\s\/]*$/,

				// Thumbnails in non-hashed upload directories
				/\/([^\s\/]+)\/[^\s\/]+-(?:\1|thumbnail)[^\s\/]*$/,

				// Full size images
				/\/[a-f0-9]\/[a-f0-9]{2}\/([^\s\/]+)$/,

				// Full-size images in non-hashed upload directories
				/\/([^\s\/]+)$/

The second one matches https://upload.wikimedia.org/wikipedia/commons/0/0a/Hovercards-thumbnail.png, capturing the directory name 0a instead of the file name. Apparently, file names ending in -thumbnail (a suffix the multimedia code also uses to replace overly long file names) were not expected. Probably /thumb/ should be included somewhere in the thumbnail regexes (or, for extra points, the prefix could be fetched from the filerepoinfo API, since it can be customized).
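The mismatch can be reproduced outside MediaWiki with a small sketch. The four regexes are copied from the list above; `firstCapture` is a hypothetical stand-in for the matching loop, not the real `mw.Title.newFromImg`, and the `120px-` thumbnail URL is an assumed example. The `reordered` array shows one possible ordering that tries the hashed full-size pattern before the non-hashed thumbnail pattern; it is an illustration only and may differ from any actual fix.

```javascript
// Regexes copied from the comment above, in their original order.
const originalOrder = [
	// Thumbnails
	/\/[a-f0-9]\/[a-f0-9]{2}\/([^\s\/]+)\/[^\s\/]+-[^\s\/]*$/,
	// Thumbnails in non-hashed upload directories
	/\/([^\s\/]+)\/[^\s\/]+-(?:\1|thumbnail)[^\s\/]*$/,
	// Full size images
	/\/[a-f0-9]\/[a-f0-9]{2}\/([^\s\/]+)$/,
	// Full-size images in non-hashed upload directories
	/\/([^\s\/]+)$/
];

// Assumption: one possible reordering (hashed patterns first).
// This is a sketch, not necessarily what any patch does.
const reordered = [
	originalOrder[ 0 ],
	originalOrder[ 2 ],
	originalOrder[ 1 ],
	originalOrder[ 3 ]
];

// Hypothetical helper: return the first capture group of the first
// regex that matches, or null if none match.
function firstCapture( url, regexes ) {
	for ( const re of regexes ) {
		const m = url.match( re );
		if ( m ) {
			return m[ 1 ];
		}
	}
	return null;
}

const fullSizeUrl = 'https://upload.wikimedia.org/wikipedia/commons/0/0a/Hovercards-thumbnail.png';
// Assumed thumbnail URL for the same file, for comparison.
const thumbUrl = 'https://upload.wikimedia.org/wikipedia/commons/thumb/0/0a/Hovercards-thumbnail.png/120px-Hovercards-thumbnail.png';

console.log( firstCapture( fullSizeUrl, originalOrder ) ); // "0a"
console.log( firstCapture( fullSizeUrl, reordered ) );     // "Hovercards-thumbnail.png"
console.log( firstCapture( thumbUrl, originalOrder ) );    // "Hovercards-thumbnail.png"
console.log( firstCapture( thumbUrl, reordered ) );        // "Hovercards-thumbnail.png"
```

With the original order, the non-hashed thumbnail regex captures the hash directory `0a` because `Hovercards-thumbnail.png` looks like `<anything>-thumbnail<suffix>`. Trying the hashed full-size pattern earlier avoids that for this URL, though it does not address custom thumbnail prefixes.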

Jdlrobson triaged this task as Medium priority. May 10 2016, 5:13 PM
Jdlrobson added a subscriber: Jdlrobson.

@Tgr how many pages do you estimate this is impacting?

Commons images matching %-thumbnail.% are used on about 5000 pages:

mysql:research@analytics-store.eqiad.wmnet [commonswiki]> select count(*) from globalimagelinks where gil_to like '%-thumbnail.%';
+----------+
| count(*) |
+----------+
|     4809 |
+----------+
1 row in set (4 min 53.21 sec)

There is no easy way to query local image usage across projects, but it is probably of the same magnitude or less.

matmarex renamed this task from mw.Title.newFromImg does not handle fullsize images to mw.Title.newFromImg does not handle fullsize images with long titles. May 31 2016, 10:16 PM
matmarex renamed this task from mw.Title.newFromImg does not handle fullsize images with long titles to mw.Title.newFromImg does not handle fullsize images with "-thumbnail" in the file name. May 31 2016, 10:19 PM
matmarex added subscribers: Josve05a, matmarex.

Change 292051 had a related patch set uploaded (by Bartosz Dziewoński):
mw.Title: Correct order of file URL regexes in newFromImg

https://gerrit.wikimedia.org/r/292051

Change 292051 merged by jenkins-bot:
mw.Title: Correct order of file URL regexes in newFromImg

https://gerrit.wikimedia.org/r/292051