
Thumbnail 404s get cached
Open, Medium, Public

Description

Steps to reproduce:

  1. get a thumbnail URL for a Commons-hosted image on beta, e.g.

http://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&prop=imageinfo&format=jsonfm&titles=File:Giant_planes_comparison.svg&iiprop=url&iiurlwidth=129

will return

"thumburl": "http://upload.beta.wmflabs.org/wikipedia/en/images/thumb/5/52/Giant_planes_comparison.svg/129px-Giant_planes_comparison.svg.png",

  2. change the width to something else, open that URL, e.g.

http://upload.beta.wmflabs.org/wikipedia/en/images/thumb/5/52/Giant_planes_comparison.svg/137px-Giant_planes_comparison.svg.png

will return "The source file 'Giant_planes_comparison.svg' does not exist."

  3. repeat the API call with the new width.
  4. open the thumbnail URL again

Expected result: the URL in step 4 should work, as the API request in step 3 is supposed to generate it. (I would also expect the one in step 2 to work, since 404 handling is enabled, but that's arguable and probably a different bug.)

Actual result: the URL in step 4 returns the same error message as in step 2.
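
The URL manipulation in the steps above can be sketched in Python. This is a minimal sketch, not part of the original report: `imageinfo_url` and `with_width` are hypothetical helper names, and `format=json` replaces `jsonfm` on the assumption that a script wants the machine-readable output. It only builds URLs and makes no requests.

```python
import re
from urllib.parse import urlencode

# API endpoint taken from the reproduction steps.
API = "http://en.wikipedia.beta.wmflabs.org/w/api.php"


def imageinfo_url(title, width):
    """Build the imageinfo query that returns a thumbnail URL (hypothetical helper)."""
    params = {
        "action": "query",
        "prop": "imageinfo",
        "format": "json",  # assumption: the report uses jsonfm for browser viewing
        "titles": title,
        "iiprop": "url",
        "iiurlwidth": width,
    }
    # Keep ":" unescaped so the title reads the same as in the report.
    return API + "?" + urlencode(params, safe=":")


def with_width(thumb_url, width):
    """Swap the "<N>px-" prefix of the last path segment for a new width."""
    head, sep, last = thumb_url.rpartition("/")
    new_last, count = re.subn(r"^\d+px-", f"{width}px-", last)
    if count != 1:
        raise ValueError("not a thumbnail URL: " + thumb_url)
    return head + sep + new_last


thumb_129 = (
    "http://upload.beta.wmflabs.org/wikipedia/en/images/thumb/5/52/"
    "Giant_planes_comparison.svg/129px-Giant_planes_comparison.svg.png"
)
thumb_137 = with_width(thumb_129, 137)                                 # URL opened in steps 2 and 4
api_137 = imageinfo_url("File:Giant_planes_comparison.svg", 137)      # request repeated in step 3
```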

(Not sure if this is a beta configuration issue, or something related to Varnish, or a MediaWiki bug. I tried reproducing locally, but I get even more broken behavior that way: thumbnail URLs never work. That might be bug 54202, although the error message is different.)


Version: unspecified
Severity: normal

Details

Reference
bz67056

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:37 AM
bzimport set Reference to bz67056.
bzimport added a subscriber: Unknown Object (MLST).
Tgr set Security to None.

The bug only affects beta upload URLs (upload.beta.wmflabs.org), not production ones (upload.wikimedia.org). I can still reproduce it with such URLs. There was a copy-paste error in the description, now fixed; sorry about that.

I have no idea what decides whether the API returns a beta or a production URL; it seems to be random. I think the beta one is correct since beta enwiki uses $wgUseInstantCommons and that has thumbnail caching enabled.

I think this is not beta-related but a bug (or quirk) of how thumb.php works.

What basically happens is that thumb.php does not do remote file fetches (T27958); it returns a 404 instead, and that 404 gets cached in Varnish. When the API is later queried for the same file, it fetches the file and generates the thumbnail, but by then the 404 is already cached in Varnish.
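
The sequence described above can be modelled with a toy cache. This is only a sketch of the suspected mechanism, not of the real Varnish or thumb.php code: `EdgeCache`, `thumb_backend` and `api_request` are hypothetical names, the clock is faked, and the 300-second lifetime mirrors the `s-maxage` value quoted from thumb-handler.php later in this task (whether Varnish on beta actually honors it is an assumption).

```python
class EdgeCache:
    """Toy stand-in for Varnish: caches every status, including 404, for ttl seconds."""

    def __init__(self, backend, ttl):
        self.backend = backend
        self.ttl = ttl
        self.store = {}  # url -> (expiry time, status)
        self.now = 0     # fake clock, advanced by hand

    def get(self, url):
        hit = self.store.get(url)
        if hit is not None and hit[0] > self.now:
            return hit[1]  # served from cache, backend not consulted
        status = self.backend(url)
        self.store[url] = (self.now + self.ttl, status)
        return status


generated = set()  # thumbnails that exist on the backend


def thumb_backend(url):
    # Assumed behavior: no remote fetch, so 404 unless the thumbnail already exists.
    return 200 if url in generated else 404


def api_request(url):
    # Assumed behavior: the API call fetches the file and generates the thumbnail.
    generated.add(url)


cache = EdgeCache(thumb_backend, ttl=300)
url = "137px-Giant_planes_comparison.svg.png"

first = cache.get(url)   # step 2: backend answers 404, and the 404 is cached
api_request(url)         # step 3: thumbnail now exists on the backend
second = cache.get(url)  # step 4: cached 404 is served anyway
cache.now += 300         # let the cached entry expire
third = cache.get(url)   # backend is consulted again and answers 200
print(first, second, third)
```

In this model the failure in step 4 is purely a cache artifact: the backend would answer 200 as soon as the API call has run.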

This does not affect production because it has no remote repos configured (Commons is a shared-DB repo, and that behaves as a local repo as far as file handling is concerned).

There is still the issue of the API sometimes returning a local and sometimes a remote thumbnail; compare

http://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&prop=imageinfo&format=jsonfm&titles=File:Giant_planes_comparison.svg&iiprop=url&iiurlwidth=129
http://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&prop=imageinfo&format=jsonfm&titles=File:Giant_planes_comparison.svg&iiprop=url&iiurlwidth=130

Maybe something got stuck in memcached or varnish?
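
When comparing the two responses, the returned `thumburl` values can be told apart by host. A minimal sketch, assuming the two upload hosts named earlier in this task are the only ones of interest; `thumb_origin` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def thumb_origin(thumb_url):
    """Classify a thumburl by its upload host (hypothetical helper)."""
    host = urlsplit(thumb_url).hostname
    if host == "upload.beta.wmflabs.org":
        return "beta"
    if host == "upload.wikimedia.org":
        return "production"
    return "unknown"


origin = thumb_origin(
    "http://upload.beta.wmflabs.org/wikipedia/en/images/thumb/5/52/"
    "Giant_planes_comparison.svg/129px-Giant_planes_comparison.svg.png"
)
```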

thcipriani triaged this task as Medium priority. Jul 13 2015, 7:29 PM
thcipriani moved this task from To Triage to Backlog on the Beta-Cluster-Infrastructure board.

I'm wondering about moving this from Beta-Cluster-Infrastructure to Beta-Cluster-reproducible... But I did spot this in /data/project/upload7/scripts/thumb-handler.php (the script nginx on deployment-upload runs to handle upload.beta.wmflabs.org requests):

} elseif ( $httpCode == 404 ) {
        header( 'HTTP/1.1 404 Not found' );
        // header( 'Cache-Control: no-cache' );
        header( 'Cache-Control: s-maxage=300, must-revalidate, max-age=0' );