
Thumbor doesn't save Content-Disposition: inline headers to Swift for webp thumbnails
Closed, Resolved · Public

Description

It seems that Firefox's "Save image as" saves our .webp transforms of .jpg files with the .jpg file extension (unlike Chrome, which automatically changes the extension to .webp). The saved file then holds WebP data under a .jpg name, which leads to 'broken' files for some people.

File an upstream bug report?
Maybe try automatically adding Content-Disposition: inline; filename*="filename.webp" to such files. It seems we strip that header right now (unintentionally?).
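
To make the suggestion concrete, here is a minimal sketch of how such a header value could be built. It is not the Thumbor code; the function name `content_disposition` and its parameters are hypothetical. It follows the `inline;filename*=UTF-8''...` shape seen in the responses on this task and percent-encodes the filename conservatively, so non-ASCII names stay valid.

```python
from urllib.parse import quote


def content_disposition(original_name, thumb_format=None):
    """Build an inline Content-Disposition value for a thumbnail.

    original_name: file name of the original, e.g. "Bonette.jpg".
    thumb_format:  extension of the served format, e.g. "webp"
                   (hypothetical parameter, for illustration only).
    """
    filename = original_name
    # Append the served format when it differs from the original's
    # extension, so a WebP rendition of a JPEG is saved as "x.jpg.webp".
    if thumb_format and not original_name.lower().endswith("." + thumb_format.lower()):
        filename = f"{original_name}.{thumb_format}"
    # safe="" percent-encodes everything except letters, digits and "_.-~",
    # which is stricter than required but always valid for filename*.
    return "inline;filename*=UTF-8''" + quote(filename, safe="")


if __name__ == "__main__":
    print(content_disposition("Bonette.jpg"))          # inline;filename*=UTF-8''Bonette.jpg
    print(content_disposition("Bonette.jpg", "webp"))  # inline;filename*=UTF-8''Bonette.jpg.webp
```

With a header like the second one, a browser that honours `filename*` has enough information to save the file with a .webp extension.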

curl -v -O -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dc/Bonette.jpg/320px-Bonette.jpg"

< HTTP/2 200 
< date: Thu, 04 Jun 2020 12:48:50 GMT
< content-type: image/jpeg
< content-length: 10814
< x-object-meta-sha1base36: ahk541f5c9nejdk8fj1uofq9r085p7q
< content-disposition: inline;filename*=UTF-8''Bonette.jpg
< last-modified: Thu, 30 Mar 2017 18:48:33 GMT
< etag: 13b9ebe4294ac964665f5eff7391d4b8
< x-timestamp: 1490899712.93189
< server: ATS/8.0.7
< age: 74344
< x-cache: cp3059 hit, cp3065 hit/2543
< x-cache-status: hit-front
< server-timing: cache;desc="hit-front"
< strict-transport-security: max-age=106384710; includeSubDomains; preload
< access-control-allow-origin: *
< access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
< timing-allow-origin: *
< accept-ranges: bytes

curl -v -O -H "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dc/Bonette.jpg/320px-Bonette.jpg"

< HTTP/2 200 
< date: Fri, 05 Jun 2020 00:28:05 GMT
< content-type: image/webp
< content-length: 11832
< last-modified: Fri, 26 Jul 2019 12:20:19 GMT
< etag: 64858bb14ad22a72049c432862316152
< x-timestamp: 1564143618.12390
< server: ATS/8.0.7
< age: 32455
< x-cache: cp3057 miss, cp3065 hit/601
< x-cache-status: hit-front
< server-timing: cache;desc="hit-front"
< strict-transport-security: max-age=106384710; includeSubDomains; preload
< access-control-allow-origin: *
< access-control-expose-headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache
< timing-allow-origin: *
< accept-ranges: bytes

Event Timeline

Restricted Application added a subscriber: Aklapper.
TheDJ updated the task description.

I suspect this is a regression due to the ATS migration, since this used to work properly. I've just verified that Thumbor does issue the header:

gilles@thumbor1001:~$ curl -I http://localhost:8800/wikipedia/commons/thumb/1/13/A._Philip_Randolph_1963_NYWTS.jpg/800px-A._Philip_Randolph_1963_NYWTS.webp

Content-Disposition: inline;filename*=UTF-8''A._Philip_Randolph_1963_NYWTS.jpg.webp

It's being incorrectly stripped by a layer in front of it.

Gilles renamed this task from Firefox's "save image as" of a webp version of one of our thumbnails, saves as a .jpg file to ATS or Varnish incorrectly strips Content-Disposition header for webp thumbnails. · Jun 8 2020, 8:22 AM
Gilles assigned this task to ema.
Gilles added a project: Traffic.
Gilles renamed this task from ATS or Varnish incorrectly strips Content-Disposition header for webp thumbnails to Swift doesn't save or regenerate Content-Disposition: inline for thumbnails. · Jun 8 2020, 8:32 AM
Gilles claimed this task.
Gilles triaged this task as Medium priority.
Gilles removed a project: Traffic.
Gilles added a subscriber: ema.
Gilles renamed this task from Swift doesn't save or regenerate Content-Disposition: inline for thumbnails to Swift doesn't save Content-Disposition: inline for webp thumbnails. · Edited · Jun 8 2020, 8:37 AM

By poking at objects stored in Swift, I've been able to establish that JPG thumbnails have the Content-Disposition header saved in the Swift object, but WebP thumbnails don't. This has been hard to see because the first time a WebP thumbnail is generated, the header is present in the response, and that response gets cached by the backend and frontend caches in front of Swift. The issue only becomes visible once a WebP thumbnail falls out of cache and is re-fetched from Swift.
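
The masking effect described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `ToyStack` class, its single cache layer, and its `store_disposition` flag are all hypothetical and stand in for the real Thumbor, Swift and cache behaviour.

```python
class ToyStack:
    """Toy model: one cache in front of an object store and a renderer."""

    def __init__(self, store_disposition):
        # store_disposition=False models the bug: the header is generated
        # on first render but not persisted with the stored object.
        self.store_disposition = store_disposition
        self.swift = {}   # object name -> headers saved with the object
        self.cache = {}   # object name -> headers of the cached response

    def _render(self, name):
        # A freshly generated thumbnail always carries the header.
        return {
            "Content-Type": "image/webp",
            "Content-Disposition": f"inline;filename*=UTF-8''{name}",
        }

    def get(self, name):
        if name in self.cache:            # cache hit
            return self.cache[name]
        if name in self.swift:            # cache miss, object already stored
            headers = dict(self.swift[name])
        else:                             # first request: render and store
            headers = self._render(name)
            stored = dict(headers)
            if not self.store_disposition:
                stored.pop("Content-Disposition")
            self.swift[name] = stored
        self.cache[name] = headers
        return headers

    def evict(self, name):
        # The object falls out of cache; the stored copy remains.
        self.cache.pop(name, None)


if __name__ == "__main__":
    stack = ToyStack(store_disposition=False)
    first = stack.get("Bonette.jpg.webp")
    stack.evict("Bonette.jpg.webp")
    second = stack.get("Bonette.jpg.webp")
    print("Content-Disposition" in first)   # True
    print("Content-Disposition" in second)  # False
```

In this model the first response, and every cache hit that follows it, looks correct; the missing header only shows up after an eviction, which matches why the problem was hard to spot.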

Gilles renamed this task from Swift doesn't save Content-Disposition: inline for webp thumbnails to Thumbor doesn't save Content-Disposition: inline headers to Swift for webp thumbnails. · Jun 8 2020, 8:39 AM

The same might actually be true for all thumbnails, but might be masked by the fact that JPGs are much more likely to have been generated in the past by MediaWiki, which stored that header in Swift, and also more likely to stay warm in cache.

Change 603386 had a related patch set uploaded (by Gilles; owner: Gilles):
[operations/software/thumbor-plugins@master] Store Content-Disposition header in Swift

https://gerrit.wikimedia.org/r/603386

Change 603386 merged by Gilles:
[operations/software/thumbor-plugins@master] Store Content-Disposition header in Swift

https://gerrit.wikimedia.org/r/603386

Change 603876 had a related patch set uploaded (by Gilles; owner: Gilles):
[operations/debs/python-thumbor-wikimedia@master] Upgrade to 2.9

https://gerrit.wikimedia.org/r/603876

Change 603876 merged by Effie Mouzeli:
[operations/debs/python-thumbor-wikimedia@master] Upgrade to 2.9

https://gerrit.wikimedia.org/r/603876

Confirming that this header now gets saved in Swift for new files:

gilles@ms-fe1005:~$ curl -I http://localhost:80/wikipedia/commons/thumb/9/95/Chloe_Zhao_by_Gage_Skidmore.jpg/119px-Chloe_Zhao_by_Gage_Skidmore.jpg.webp
HTTP/1.1 200 OK
Content-Length: 7092
Content-Disposition: inline;filename*=UTF-8''Chloe_Zhao_by_Gage_Skidmore.jpg.webp
Accept-Ranges: bytes
Last-Modified: Mon, 12 Apr 2021 15:06:20 GMT
Etag: 3b31fadb6c9ff72ce5d84abbaac66a4c
X-Timestamp: 1618239979.46475
Access-Control-Allow-Origin: *
Content-Type: image/webp
X-Trans-Id: tx24598e22ac1d42268484c-006086ca79
Date: Mon, 26 Apr 2021 14:13:13 GMT

Purging older affected files fixes them.

Before purge:

gilles@ms-fe1005:~$ curl -I http://localhost:80/wikipedia/commons/thumb/d/dc/Bonette.jpg/320px-Bonette.jpg.webp
HTTP/1.1 200 OK
Content-Length: 11832
Accept-Ranges: bytes
Last-Modified: Fri, 26 Jul 2019 12:20:19 GMT
Etag: 64858bb14ad22a72049c432862316152
X-Timestamp: 1564143618.12390
Access-Control-Allow-Origin: *
Content-Type: image/webp
X-Trans-Id: tx888d669bf40442ec919e4-006086cab3
Date: Mon, 26 Apr 2021 14:14:11 GMT

After purge:

gilles@ms-fe1005:~$ curl -I http://localhost:80/wikipedia/commons/thumb/d/dc/Bonette.jpg/320px-Bonette.jpg.webp
HTTP/1.1 200 OK
Content-Length: 11832
Content-Disposition: inline;filename*=UTF-8''Bonette.jpg.webp
Accept-Ranges: bytes
Last-Modified: Mon, 26 Apr 2021 14:16:02 GMT
Etag: 64858bb14ad22a72049c432862316152
X-Timestamp: 1619446561.56965
Access-Control-Allow-Origin: *
Content-Type: image/webp
X-Trans-Id: tx8de86ab5159741c4a0d67-006086cb24
Date: Mon, 26 Apr 2021 14:16:04 GMT
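
For anyone who wants to check more files the same way, here is a small sketch that reads the text of a `curl -I` response and reports the filename from the Content-Disposition header, if any. The helper names `parse_head_response` and `saved_filename` are hypothetical, and the sample texts are shortened copies of the before/after responses above.

```python
from urllib.parse import unquote


def parse_head_response(raw):
    """Split the text of a `curl -I` response into (status_line, headers)."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    status, header_lines = lines[0], lines[1:]
    headers = {}
    for ln in header_lines:
        name, _, value = ln.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return status, headers


def saved_filename(headers):
    """Return the decoded filename* value, or None if the header is absent."""
    value = headers.get("content-disposition")
    if value is None:
        return None
    marker = "filename*=UTF-8''"
    idx = value.find(marker)
    if idx == -1:
        return None
    encoded = value[idx + len(marker):].split(";")[0].strip()
    return unquote(encoded)


BEFORE_PURGE = """\
HTTP/1.1 200 OK
Content-Length: 11832
Last-Modified: Fri, 26 Jul 2019 12:20:19 GMT
Content-Type: image/webp
"""

AFTER_PURGE = """\
HTTP/1.1 200 OK
Content-Length: 11832
Content-Disposition: inline;filename*=UTF-8''Bonette.jpg.webp
Last-Modified: Mon, 26 Apr 2021 14:16:02 GMT
Content-Type: image/webp
"""

if __name__ == "__main__":
    for label, raw in (("before", BEFORE_PURGE), ("after", AFTER_PURGE)):
        _, headers = parse_head_response(raw)
        print(label, saved_filename(headers))
```

A `None` result for an object fetched directly from Swift would suggest the thumbnail predates the fix and still needs a purge.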