Some thumbnail images delivered with wrong application/x-www-form-urlencoded mime-type
Open, HighPublic

Description

This happens on various thumbnail sizes of https://www.mediawiki.org/wiki/File:Page_Schemas_edit_schema_screenshot.png (not all of them).

Clicking on some of them, the browser prompts to download the file instead of displaying it. The cause is that they are delivered with an application/x-www-form-urlencoded Content-Type instead of image/png:

$ curl -s -D - -o /dev/null "https://upload.wikimedia.org/wikipedia/mediawiki/4/43/Page_Schemas_edit_schema_screenshot.png"
HTTP/1.1 200 OK
Date: Sat, 03 Mar 2018 17:15:19 GMT
Content-Type: application/x-www-form-urlencoded
Content-Length: 59510
Connection: keep-alive
X-Object-Meta-Sha1Base36: 3lunouryyip6gpflvztb2p20l0nfs5k
Last-Modified: Tue, 30 May 2017 11:07:47 GMT
Etag: 5db3c2a1fee3897a8325c76b7d46ecd9
X-Timestamp: 1496142466.35998
X-Content-Dimensions: 1206x1349:1
X-Trans-Id: tx1ffbc88de8d04285bddbe-005a9ad71b
X-Varnish: 721112451, 61187040 65427943, 466571330
Via: 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)
Accept-Ranges: bytes
Age: 268
X-Cache: cp1073 pass, cp3034 hit/2, cp3039 miss
X-Cache-Status: hit-local
Strict-Transport-Security: max-age=106384710; includeSubDomains; preload
X-Analytics: https=1;nocookies=1
X-Client-IP: 83.39.35.95
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
Timing-Allow-Origin: *
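
A check like the one above can be scripted. The following is a minimal sketch, not a tool used in this task: it parses a saved `curl -D -` (or `curl -v`) header dump and compares the served Content-Type with the type that Python's standard `mimetypes` module guesses from the URL's file extension. The function names `parse_header_dump` and `content_type_mismatch` are made up for illustration.

```python
import mimetypes


def parse_header_dump(dump: str) -> dict:
    """Parse `curl -D -` style output into a dict keyed by lowercase header name."""
    headers = {}
    for line in dump.splitlines():
        line = line.strip()
        # `curl -v` prefixes response headers with "< "; drop that prefix.
        if line.startswith("< "):
            line = line[2:]
        # Skip blank lines, the status line (e.g. "HTTP/2 200"),
        # HTTP/2 pseudo-headers (":path: ...") and anything without a colon.
        if not line or line.startswith("HTTP/") or line.startswith(":") or ":" not in line:
            continue
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def content_type_mismatch(url: str, dump: str):
    """Return (served, expected) when they differ, otherwise None.

    `expected` is only a guess from the file extension in the URL.
    """
    expected, _ = mimetypes.guess_type(url)
    served = parse_header_dump(dump).get("content-type", "")
    # Ignore parameters such as "; charset=..." when comparing.
    served = served.split(";")[0].strip().lower()
    if expected is None or served == expected:
        return None
    return served, expected
```

Fed the dump above together with its URL, `content_type_mismatch` should report the served type next to the `image/png` that the `.png` extension suggests.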

Event Timeline

Restricted Application added a subscriber: Aklapper. Mar 3 2018, 5:20 PM
Restricted Application added projects: Commons, Multimedia. Mar 3 2018, 6:04 PM
Ramsey-WMF assigned this task to Cparle. Mar 9 2018, 6:43 PM
Ramsey-WMF added subscribers: Cparle, Ramsey-WMF.

Temporarily assigning to @Cparle since he resolved T173276 , which may be the same issue.

Fixed for the image in the ticket description, but not for the Swedish flag ...

Ramsey-WMF triaged this task as Low priority. Mar 21 2018, 3:47 PM
Ramsey-WMF moved this task from Untriaged to Next up on the Multimedia board.

I've moved the thumbnails problem you raised, @Peter, into T190701. It's a bit different: for the file mentioned in this ticket it seems that the original was stored incorrectly, whereas in yours it's the thumbnails. In any case, the repair script that works for the file raised in this ticket doesn't work for the Swedish flag.

@Ciencia_Al_Poder can you confirm that the file you raised this ticket about works ok now?

It works now, thanks! However, clicking on the first version of that file gives the same problem. A minor thing I guess

Cparle added a comment. Apr 3 2018, 3:03 PM

I raised a separate ticket for the revisions, see T191306

Ok to close this one?

Ciencia_Al_Poder closed this task as Resolved. Apr 3 2018, 4:13 PM

Of course!

Ciencia_Al_Poder reopened this task as Open. Apr 19 2019, 9:12 AM

I came across another one: https://www.mediawiki.org/wiki/File:Mscatselect_1.jpg

The current image (original size) has the same problem. content-type: application/x-www-form-urlencoded

https://upload.wikimedia.org/wikipedia/mediawiki/a/a1/Mscatselect_1.jpg

Time to run the script again, I guess

Aklapper renamed this task from "Some images delivered with wrong application/x-www-form-urlencoded mime-type" to "Some thumbnail images delivered with wrong application/x-www-form-urlencoded mime-type". Jul 31 2019, 1:11 AM
Aklapper removed Cparle as the assignee of this task.
Aklapper edited projects, added Thumbor; removed Multimedia-Team-Working-Board.
Aklapper added subscribers: Vort, EdJoPaTo, MBH, IKhitron.
Shizhao added a subscriber: Shizhao. Aug 1 2019, 3:21 AM

This thumbnail has the same problem: https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/Shin-marunouchi.Building-2007-01.jpg/100px-Shin-marunouchi.Building-2007-01.jpg

curl -s -D - -o /dev/null https://upload.wikimedia.org/wikipedia/commons/thumb/8/8a/Shin-marunouchi.Building-2007-01.jpg/100px-Shin-marunouchi.Building-2007-01.jpg
HTTP/2 200
date: Thu, 01 Aug 2019 03:06:57 GMT
content-type: application/x-www-form-urlencoded
content-length: 5017
x-object-meta-sha1base36: ml1bi0n9rwgybi6gge8be7ggo6no1mb
last-modified: Wed, 09 Mar 2016 03:01:45 GMT
x-timestamp: 1457492504.86118
x-trans-id: tx00b1bb24df2d41d794b96-005d4209f1
etag: 8d255785cf5f17a60578d9accec77f2a
server: ATS/8.0.3
x-varnish: 478659828 365867875
age: 19808
x-cache: cp1076 hit, cp1076 hit/3
x-cache-status: hit-front
server-timing: cache;desc="hit-front"
strict-transport-security: max-age=106384710; includeSubDomains; preload
x-analytics: https=1;nocookies=1
x-client-ip: 172.16.7.167
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
timing-allow-origin: *
accept-ranges: bytes

This bug caused some images to be blocked by Cross-Origin Read Blocking (CORB) in Chrome 76 - see https://bugs.chromium.org/p/chromium/issues/detail?id=990853#c2

How easy or difficult would it be to fix wikimedia to send the correct Content-Type response header for images (e.g. image/png rather than application/x-www-form-urlencoded) ?

FWIW, in my previous comment I was hoping for a systematic fix (rather than having wikimedia editors have to fix images one-by-one). Do we know why some images are served with this weird application/x-www-form-urlencoded content type? I would normally associate application/x-www-form-urlencoded with http POST *requests* rather than with http *responses*...

ema raised the priority of this task from Low to High. Aug 10 2019, 9:31 AM
ema added a subscriber: ema.

Priority set to High as images are not displayed correctly due to this. I see the bug happening right now on https://upload.wikimedia.org/wikipedia/commons/thumb/c/cb/Logo_European_Central_Bank.svg/150px-Logo_European_Central_Bank.svg.png

< HTTP/2 200 
< date: Sat, 10 Aug 2019 09:27:13 GMT
< content-type: application/x-www-form-urlencoded
< content-length: 7145
< x-object-meta-sha1base36: 76at8zaf4c2ioc8j1jehlxl8med6u6j
< last-modified: Mon, 09 Oct 2017 15:19:26 GMT
< etag: ab7bd990981ebb7b3c8faa7ba460ace4
< x-timestamp: 1507562365.00999
< x-trans-id: tx9fd74dae875c482396356-005d4db8b5
< server: ATS/8.0.3
< x-varnish: 179129756 179942128
< age: 54587
< x-cache: cp3047 hit, cp3038 hit/8
< x-cache-status: hit-front
< server-timing: cache;desc="hit-front"
< strict-transport-security: max-age=106384710; includeSubDomains; preload
< x-analytics: https=1;nocookies=1
< x-client-ip: 89.14.184.242
< access-control-allow-origin: *
< access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
< timing-allow-origin: *
< accept-ranges: bytes

Wang_Qiliang added a subscriber: Wang_Qiliang. Edited Aug 14 2019, 2:05 PM

Hi, the problem shows up with the following request:

URL: https://upload.wikimedia.org/wikipedia/commons/thumb/f/fa/Flag_of_the_People%27s_Republic_of_China.svg/120px-Flag_of_the_People%27s_Republic_of_China.svg.png

Response:

accept-ranges: bytes
access-control-allow-origin: *
access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
age: 82000
content-length: 1053
content-type: application/x-www-form-urlencoded
date: Wed, 14 Aug 2019 14:03:36 GMT
etag: 80084a88fb8ce92d3bb8b4e95464e165
last-modified: Sun, 22 Jul 2018 06:42:06 GMT
server: ATS/8.0.3
server-timing: cache;desc="hit-front"
status: 200
strict-transport-security: max-age=106384710; includeSubDomains; preload
timing-allow-origin: *
x-analytics: WMF-Last-Access=25-Jul-2019;https=1
x-cache: cp2008 hit, cp2018 hit/76
x-cache-status: hit-front
x-client-ip: 45.35.251.212
x-object-meta-sha1base36: djawj4omfiqzf94c2lse08opy52vnnm
x-timestamp: 1532241725.64215
x-trans-id: txd4435aa1711f4505964a7-005d2f0ffa
x-varnish: 876628493 843349676

Request

:authority: upload.wikimedia.org
:method: GET
:path: /wikipedia/commons/thumb/f/fa/Flag_of_the_People%27s_Republic_of_China.svg/120px-Flag_of_the_People%27s_Republic_of_China.svg.png
:scheme: https
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
accept-encoding: gzip, deflate, br
accept-language: zh-CN,zh;q=0.9,zh-TW;q=0.8,en;q=0.7,ja;q=0.6
cache-control: no-cache
cookie: WMF-Last-Access=25-Jul-2019; GeoIP=<Removed>; WMF-Last-Access-Global=14-Aug-2019
dnt: 1
pragma: no-cache
sec-fetch-mode: navigate
sec-fetch-site: none
upgrade-insecure-requests: 1
user-agent: <Removed>

@aaron @brion given nobody on our team has much expertise with the areas this bug may touch, I wondered if you might chime in with any thoughts?

brion added a comment. Aug 14 2019, 6:08 PM

Hmm, I notice this in SwiftFileBackend.php:

	/**
	 * Sanitize and filter the custom headers from a $params array.
	 * Only allows certain "standard" Content- and X-Content- headers.
	 *
	 * When POSTing data, libcurl adds Content-Type: application/x-www-form-urlencoded
	 * if Content-Type is not set, which overwrites the stored Content-Type header
	 * in Swift - therefore for POSTing data do not strip the Content-Type header (the
	 * previously-stored header that has been already read back from swift is sent)
	 *
	 * @param array $params
	 * @return array Sanitized value of 'headers' field in $params
	 */
	protected function sanitizeHdrs( array $params ) {
		return isset( $params['headers'] )
			? $this->getCustomHeaders( $params['headers'] )
			: [];
	}

Could the correct Content-Type be missing for some reason sometimes?
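
To make the hazard in that docblock concrete, here is a minimal sketch of the "carry the stored Content-Type along" rule in Python. It is not the MediaWiki implementation: `build_post_headers` is a hypothetical helper, and refusing to POST when no stored type is known is an assumption added for illustration, not something the code above does.

```python
def build_post_headers(stored_headers: dict, updates: dict) -> dict:
    """Build headers for a metadata-updating POST without losing Content-Type.

    stored_headers: headers previously read back from the object store.
    updates: headers the caller wants to set (e.g. X-Object-Meta-*).
    """
    # Normalise names to lowercase so lookups are case-insensitive.
    headers = {name.lower(): value for name, value in updates.items()}
    if "content-type" not in headers:
        stored = {name.lower(): value for name, value in stored_headers.items()}
        content_type = stored.get("content-type")
        if not content_type:
            # Assumption for this sketch: fail loudly instead of letting the
            # HTTP client fall back to application/x-www-form-urlencoded.
            raise ValueError("refusing to POST without a known Content-Type")
        headers["content-type"] = content_type
    return headers
```

If the stored type were ever missing or empty, a POST built without this kind of guard would be sent with no explicit Content-Type, which is the case the docblock warns about.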

brion added a comment. Aug 14 2019, 6:12 PM

The only POSTs I see in there are in setContainerAccess, addMissingHashMetadata, and doDescribeInternal, which all update existing files. Could be there's a hole in the logic of one of them that's dropping Content-Type, or that some files were missing it to begin with but it was being filled in on high-level fetches, or something, so they worked until something updated the data?

Removed a project tag accidentally

Change 530338 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] VCL: workaround for images delivered with CT:x-www-form-urlencoded

https://gerrit.wikimedia.org/r/530338

Change 530338 merged by Ema:
[operations/puppet@production] VCL: workaround for images delivered with CT:x-www-form-urlencoded

https://gerrit.wikimedia.org/r/530338

ema added a comment. Aug 15 2019, 10:46 AM

@Ciencia_Al_Poder, @Wang_Qiliang: I have added a workaround at the CDN level which replaces the wrong Content-Type based on file extension. Can you please check if the issue is still reproducible on your side?
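
The actual workaround is VCL in the Gerrit change linked above. As a rough illustration of the idea only, the following Python sketch rewrites the header when, and only when, the bogus type is seen. The extension table and the name `fix_content_type` are assumptions made for this example; the real patch may cover a different set of extensions.

```python
import posixpath
from urllib.parse import urlsplit

BOGUS_TYPE = "application/x-www-form-urlencoded"

# Assumed mapping for illustration; not taken from the VCL patch.
EXTENSION_TO_TYPE = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".svg": "image/svg+xml",
    ".webp": "image/webp",
}


def fix_content_type(url: str, content_type: str) -> str:
    """Return a corrected Content-Type for a response to `url`.

    Responses that already carry some other type are left untouched.
    """
    if content_type.split(";")[0].strip().lower() != BOGUS_TYPE:
        return content_type
    # Use only the URL path, so a query string cannot hide the extension.
    extension = posixpath.splitext(urlsplit(url).path)[1].lower()
    # Unknown extensions keep the original header rather than guessing.
    return EXTENSION_TO_TYPE.get(extension, content_type)
```

Note that a thumbnail such as `150px-Logo_European_Central_Bank.svg.png` maps to `image/png`, because only the final extension is considered.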

Restricted Application added a project: Operations. Aug 15 2019, 10:47 AM
ema moved this task from Triage to Caching on the Traffic board. Aug 15 2019, 10:48 AM

Issue solved for the examples provided

MBH removed a subscriber: MBH. Aug 15 2019, 11:50 AM
Vort added a comment. Aug 15 2019, 1:18 PM

@Wang_Qiliang I don't see application/x-www-form-urlencoded there.
The only noticeable thing is that the .png is delivered as image/webp.

It returns content-type: image/png here (using both curl and browser).

The only POSTs I see in there are in setContainerAccess, addMissingHashMetadata, and doDescribeInternal, which all update existing files. Could be there's a hole in the logic of one of them that's dropping Content-Type, or that some files were missing it to begin with but it was being filled in on high-level fetches, or something, so they worked until something updated the data?

I think that's likely what's happening yes.

From swift's perspective here's the status as I understand it:

  • we're running with the default post_as_copy = true option (the default in 2.10, which we're running; changed to false in swift 2.13), so POST does allow changing c-t, at the cost of a full object copy. post_as_copy defaults to false in 2.13 because the previous buggy behavior has been fixed, so POSTing to change c-t on an existing object no longer incurs a COPY and does the right thing. See also https://specs.openstack.org/openstack/swift-specs/specs/in_progress/fastpostupdates.html and https://wiki.openstack.org/wiki/Swift/FastPost
  • third party folks might be running with post_as_copy = false and swift releases < 2.13, and we should be mindful of those and provide instructions/documentation as needed when changing swiftfilebackend, see also two comments from @aaron at https://phabricator.wikimedia.org/T178849#3768032
  • Thumbor also uploads to swift nowadays, although TTBOMK content-type isn't changed and gets copied from the original at thumbnail generation time
  • swift has custom middleware to e.g. send thumbnail requests to the inactive datacenter, though that shouldn't be involved in changing c-t

I won't have a lot of time to further dig into this but happy to help as I can and field questions!

Vort removed a subscriber: Vort. Aug 22 2019, 11:07 AM