
upload.wikimedia.org HTTP 304 responses lack a Content-Type header
Closed, Resolved, Public

Description

Here is a demonstration of the problem.

Conditional requests work properly most of the time, for example here:

$ curl -I https://upload.wikimedia.org/wikipedia/commons/thumb/1/1c/De_Rotterdam%2C_September_2019_-_02.jpg/360px-De_Rotterdam%2C_September_2019_-_02.jpg
HTTP/2 200 
date: Wed, 07 Oct 2020 21:14:44 GMT
content-type: image/jpeg
content-length: 70536
last-modified: Wed, 11 Sep 2019 03:32:21 GMT
etag: 6ea8427bb3c0df7b880989b9022f5428
x-timestamp: 1568172740.52202
server: ATS/8.0.8
age: 47424
x-cache: cp3055 hit, cp3055 hit/9
x-cache-status: hit-front
server-timing: cache;desc="hit-front"
strict-transport-security: max-age=106384710; includeSubDomains; preload
report-to: { "group": "wm_nel", "max_age": 86400, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
nel: { "report_to": "wm_nel", "max_age": 86400, "failure_fraction": 0.05, "success_fraction": 0.0}
x-client-ip: 2a02:168:6008:0:e998:bc9d:2fa5:b439
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
timing-allow-origin: *
accept-ranges: bytes

$ curl --header 'If-None-Match: 6ea8427bb3c0df7b880989b9022f5428' -I https://upload.wikimedia.org/wikipedia/commons/thumb/1/1c/De_Rotterdam%2C_September_2019_-_02.jpg/360px-De_Rotterdam%2C_September_2019_-_02.jpg
HTTP/2 304 
date: Wed, 07 Oct 2020 21:14:44 GMT
content-type: image/jpeg
last-modified: Wed, 11 Sep 2019 03:32:21 GMT
etag: 6ea8427bb3c0df7b880989b9022f5428
x-timestamp: 1568172740.52202
server: ATS/8.0.8
age: 47433
x-cache: cp3055 hit, cp3055 hit/10
x-cache-status: hit-front
server-timing: cache;desc="hit-front"
strict-transport-security: max-age=106384710; includeSubDomains; preload
report-to: { "group": "wm_nel", "max_age": 86400, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
nel: { "report_to": "wm_nel", "max_age": 86400, "failure_fraction": 0.05, "success_fraction": 0.0}
x-client-ip: 2a02:168:6008:0:e998:bc9d:2fa5:b439
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
timing-allow-origin: *

But for this specific image, the content-type header is missing from the HTTP 304 response:

$ curl -I https://upload.wikimedia.org/wikipedia/commons/thumb/9/94/CIA_map_of_Central_America.png/490px-CIA_map_of_Central_America.png
HTTP/2 200 
date: Thu, 08 Oct 2020 08:02:57 GMT
content-type: image/png
content-length: 328121
accept-ranges: bytes
last-modified: Sun, 20 May 2018 15:52:08 GMT
etag: 122f752850864671485c9d915f0feca2
x-timestamp: 1526831527.13817
server: ATS/8.0.8
age: 0
x-cache: cp3055 hit, cp3055 pass
x-cache-status: hit-local
server-timing: cache;desc="hit-local"
strict-transport-security: max-age=106384710; includeSubDomains; preload
report-to: { "group": "wm_nel", "max_age": 86400, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
nel: { "report_to": "wm_nel", "max_age": 86400, "failure_fraction": 0.05, "success_fraction": 0.0}
x-client-ip: 2a02:168:6008:0:e998:bc9d:2fa5:b439
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
timing-allow-origin: *

$ curl --header 'If-None-Match: 122f752850864671485c9d915f0feca2' -I https://upload.wikimedia.org/wikipedia/commons/thumb/9/94/CIA_map_of_Central_America.png/490px-CIA_map_of_Central_America.png
HTTP/2 304 
date: Thu, 08 Oct 2020 10:26:20 GMT
etag: 122f752850864671485c9d915f0feca2
server: ATS/8.0.8
age: 0
x-cache: cp3055 hit, cp3055 pass
x-cache-status: hit-local
server-timing: cache;desc="hit-local"
strict-transport-security: max-age=106384710; includeSubDomains; preload
report-to: { "group": "wm_nel", "max_age": 86400, "endpoints": [{ "url": "https://intake-logging.wikimedia.org/v1/events?stream=w3c.reportingapi.network_error&schema_uri=/w3c/reportingapi/network_error/1.0.0" }] }
nel: { "report_to": "wm_nel", "max_age": 86400, "failure_fraction": 0.05, "success_fraction": 0.0}
x-client-ip: 2a02:168:6008:0:e998:bc9d:2fa5:b439
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
timing-allow-origin: *
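
The comparison above can also be scripted. The following is a minimal sketch, assuming the output of each `curl -I` call has been saved to a file; the function name `headers_missing_from` and the file names in the usage note are hypothetical, not part of any existing tool. It lists the header names that appear in the first dump but not in the second.

```shell
#!/bin/sh
# Print header names present in the first saved `curl -I` dump ($1)
# but absent from the second ($2), one per line, lowercased.
headers_missing_from() {
    # The second dump is read first so its header names can be remembered.
    awk -F: '
        { sub(/\r$/, "") }                        # strip CR from CRLF line endings
        FNR == 1 { next }                         # skip the HTTP status line
        NF < 2 { next }                           # skip blank or colon-less lines
        { name = tolower($1) }
        FILENAME == ARGV[1] { seen[name] = 1; next }
        !(name in seen) { print name }
    ' "$2" "$1"
}
```

For example, with the 200 response saved as `ok.txt` and the 304 response as `not-modified.txt`, `headers_missing_from ok.txt not-modified.txt` would include `content-type` in its output for the second image above.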

Event Timeline

In our S3 image downloading, we send an 'If-None-Match' header in the request options. To convert JPEG/PNG images to WebP for optimisation, we need to know the MIME type of the image being fetched, which is not available here, as kelson describes above. For this reason I have to fall back on the file name instead of the MIME type, which may cause problems when the MIME type doesn't match the extension of the file being downloaded.
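
The fallback described here could look roughly like the sketch below. It is an illustration only, not the downloader's actual code: the function names (`content_type_from_headers`, `mime_from_name`, `resolve_mime`) and the short extension table are assumptions made for this example.

```shell
#!/bin/sh
# Read response headers on stdin; print the content-type value without
# parameters (lowercased), or nothing if the header is absent.
content_type_from_headers() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "content-type" {
        v = tolower($2); sub(/;.*/, "", v); print v; exit
    }'
}

# Guess a MIME type from a file name or URL (illustrative table only).
mime_from_name() {
    case "$(printf '%s' "$1" | tr 'A-Z' 'a-z')" in
        *.jpg|*.jpeg) echo image/jpeg ;;
        *.png)        echo image/png ;;
        *.gif)        echo image/gif ;;
        *.webp)       echo image/webp ;;
        *)            return 1 ;;
    esac
}

# $1 = file holding the response headers, $2 = file name or URL.
# Prefer the content-type header; fall back on the extension if it is missing.
resolve_mime() {
    mime=$(content_type_from_headers < "$1")
    if [ -n "$mime" ]; then
        printf '%s\n' "$mime"
    else
        mime_from_name "$2"
    fi
}
```

The extension is consulted only when the header is missing, so a mismatch between extension and actual type matters only for responses like the 304 shown in the description.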

Krinkle subscribed.

This is a WMF-specific behaviour in our Thumbor-based setup and edge routing. It is not influenced by MediaWiki's file handling.

@Krinkle I have rechecked this ticket with the given example and now it works. Could it be that the bug has been fixed?

Yep, it would appear so. I suspect this is likely a bug in ATS, and indeed specific to how it generates HTTP 304 responses. I do note that much of this infrastructure has changed since last year, including a frontend layer that now uses HAProxy in front of Varnish instead of ATS. There has also been an update to Swift and Thumbor since then.

I'm not sure the bug is gone yet, though. I'm finding that images that have remained in the caches for a while do now respond with content-type in HTTP 304. But for any images I purged (including your examples), there is now no ETag response header at all, and thus it isn't possible to get a 304 response in the first place.

Let's continue that issue under T256217 and T295556, and we can re-open this if the bug comes back.
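
To check whether a given image still returns an ETag before attempting the conditional request, something like the following sketch could be used. The helper name `etag_of` is hypothetical; it only parses headers passed on stdin, and the curl calls are shown as commented usage.

```shell
#!/bin/sh
# Read response headers on stdin and print the ETag value, or nothing
# if the response carries no ETag header.
etag_of() {
    tr -d '\r' | awk -F': *' 'tolower($1) == "etag" { print $2; exit }'
}

# Usage (performs network requests, so left commented out here):
#   etag=$(curl -sI "$url" | etag_of)
#   if [ -n "$etag" ]; then
#       curl -sI --header "If-None-Match: $etag" "$url"
#   else
#       echo "no ETag header, cannot revalidate with If-None-Match"
#   fi
```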

Krinkle renamed this task from HTTP Mime-Type now always returned properly if "If-None-Match" request header used to upload.wikimedia.org HTTP 304 responses lack a Content-Type header. May 26 2022, 2:59 PM