Loading full versions of larger images from Commons stucks / repeatedly gets interrupted after a few MBs
Closed, Resolved, Public

Description

Reported at https://commons.wikimedia.org/wiki/Commons:Forum#Bilder_werden_nicht_mehr_komplett_dargestellt.

Since today, loading the full version of an image from Commons often gets stuck. Only part of the image is loaded and shown, and this seems to happen in every browser. See some examples at the link above.

Event Timeline

(Removing Wikimedia-production-error tag as we don't have a stacktrace of a crash, but not sure about the correct tags here either.)

At first I wondered whether this is T190988: "Recently more broken files (premature end of file) that were cross-wiki uploaded to Commons by several users". However, I get the correct, complete image file once wget has automatically retried the download. I therefore assume the files are stored correctly on the server, but the connection to the server gets reset for unknown reasons (I'm testing from a train, though):

$:acko\> wget "https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg"
--2018-12-01 18:37:49--  https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg
Resolving upload.wikimedia.org (upload.wikimedia.org)... 91.198.174.208, 2620:0:862:ed1a::2:b
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14790582 (14M) [image/jpeg]
Saving to: ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’

Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_i  20%[=======================>                                                                                              ]   2.91M   377KB/s    in 8.8s    

2018-12-01 18:37:58 (340 KB/s) - Connection closed at byte 3053243. Retrying.

--2018-12-01 18:37:59--  (try: 2)  https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 14790582 (14M), 11737339 (11M) remaining [image/jpeg]
Saving to: ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’

Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_i 100%[++++++++++++++++++++++++=============================================================================================>]  14.10M   241KB/s    in 41s     

2018-12-01 18:38:40 (283 KB/s) - ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’ saved [14790582/14790582]

$:acko\> rm Berlin\,_Museum_Europäischer_Kulturen\,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_\(2018\)_NIK_6068.jpg 
$:acko\> wget -v "https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg"
--2018-12-01 18:42:05--  https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg
Resolving upload.wikimedia.org (upload.wikimedia.org)... 91.198.174.208, 2620:0:862:ed1a::2:b
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14790582 (14M) [image/jpeg]
Saving to: ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’

Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_i  24%[===========================>                                                                                          ]   3.46M   119KB/s    in 22s     

2018-12-01 18:42:27 (163 KB/s) - Connection closed at byte 3633891. Retrying.

--2018-12-01 18:42:28--  (try: 2)  https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 14790582 (14M), 11156691 (11M) remaining [image/jpeg]
Saving to: ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’

Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_i 100%[++++++++++++++++++++++++++++=========================================================================================>]  14.10M  92.3KB/s    in 85s     

2018-12-01 18:43:54 (128 KB/s) - ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’ saved [14790582/14790582]

$:acko\> rm Berlin\,_Museum_Europäischer_Kulturen\,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_\(2018\)_NIK_6068.jpg 
$:acko\> wget -v "https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg"
--2018-12-01 18:45:22--  https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg
Resolving upload.wikimedia.org (upload.wikimedia.org)... 91.198.174.208, 2620:0:862:ed1a::2:b
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14790582 (14M) [image/jpeg]
Saving to: ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’

Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_i  16%[==================>                                                                                                   ]   2.38M  20.8KB/s    in 37s     

2018-12-01 18:45:59 (65.6 KB/s) - Connection closed at byte 2494314. Retrying.

--2018-12-01 18:46:00--  (try: 2)  https://upload.wikimedia.org/wikipedia/commons/9/9c/Berlin%2C_Museum_Europ%C3%A4ischer_Kulturen%2C_GLAM_on_Tour_im_Museum_Europ%C3%A4ischer_Kulturen_%282018%29_NIK_6068.jpg
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 14790582 (14M), 12296268 (12M) remaining [image/jpeg]
Saving to: ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’

Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_i 100%[+++++++++++++++++++==================================================================================================>]  14.10M   214KB/s    in 51s     

2018-12-01 18:46:52 (234 KB/s) - ‘Berlin,_Museum_Europäischer_Kulturen,_GLAM_on_Tour_im_Museum_Europäischer_Kulturen_(2018)_NIK_6068.jpg’ saved [14790582/14790582]
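The retries in the transcript above resume from the byte at which the connection was closed, and the server answers with 206 Partial Content. As a minimal sketch of that arithmetic, the following Python snippet recomputes the "remaining" figures wget printed. The function name `resume_request` is hypothetical, and the `bytes=N-` header form is an assumption about what the client sends when resuming; it is not shown in the log itself.

```python
def resume_request(total_length: int, bytes_received: int) -> tuple[str, int]:
    """Return an assumed Range header value and the number of bytes still missing."""
    if not 0 <= bytes_received <= total_length:
        raise ValueError("bytes_received must lie between 0 and total_length")
    remaining = total_length - bytes_received
    # Byte offsets are zero-based, so the first missing byte is at offset bytes_received.
    return f"bytes={bytes_received}-", remaining


# Figures copied from the three wget runs above: (Length, "Connection closed at byte").
runs = [(14790582, 3053243), (14790582, 3633891), (14790582, 2494314)]
for total, received in runs:
    header, remaining = resume_request(total, received)
    print(header, remaining)
```

The three results (11737339, 11156691 and 12296268 bytes) match the "remaining" values in the 206 responses, which is consistent with the file being complete on the server and only the transfer being cut short.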

Aklapper renamed this task from "Loading of the full version of images from Commons stucks" to "Loading full versions of larger images from Commons stucks / repeatedly gets interrupted after a few MBs". Dec 1 2018, 7:51 PM

We got a report from Canada on #wikimedia-tech and I confirm I see the connection closed here in Finland as well, so it doesn't depend on the ISP.

$ LANG=en wget https://upload.wikimedia.org/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
--2018-12-02 22:59:28--  https://upload.wikimedia.org/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
Resolving upload.wikimedia.org (upload.wikimedia.org)... 91.198.174.208, 2620:0:862:ed1a::2:b
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 41624009 (40M) [image/jpeg]
Saving to: 'Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg'

Sunset_Toronto_Skyline_Panorama_from_Snake_I  16%[=============>                                                                              ]   6.36M  4.88MB/s    in 1.3s

2018-12-02 22:59:30 (4.88 MB/s) - Connection closed at byte 6671794. Retrying.

--2018-12-02 22:59:31--  (try: 2)  https://upload.wikimedia.org/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 41624009 (40M), 34952215 (33M) remaining [image/jpeg]
Saving to: 'Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg'

Sunset_Toronto_Skyline_Panorama_from_Snake_I 100%[++++++++++++++=============================================================================>]  39.70M  4.44MB/s    in 7.4s

2018-12-02 22:59:38 (4.48 MB/s) - 'Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg' saved [41624009/41624009]

$ LANG=en wget -S https://upload.wikimedia.org/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
--2018-12-02 23:03:14--  https://upload.wikimedia.org/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
Resolving upload.wikimedia.org (upload.wikimedia.org)... 91.198.174.208, 2620:0:862:ed1a::2:b
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response... 
  HTTP/1.1 200 OK
  Date: Sun, 02 Dec 2018 21:03:19 GMT
  Content-Type: image/jpeg
  Content-Length: 41624009
  Connection: keep-alive
  X-Object-Meta-Sha1Base36: l9pdky84pp6egzr37hchqf52mos2ui9
  Last-Modified: Sun, 25 Nov 2018 21:13:20 GMT
  Etag: 2a071c2d3446033b2c00a56506cb72ea
  X-Timestamp: 1543180399.61227
  X-Trans-Id: tx1752dbe0ae18411988a9d-005c0440ac
  X-Varnish: 176282865 171899500, 157720489 156379071, 554036984
  Via: 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)
  Age: 2026
  X-Cache: cp1076 hit/5, cp3034 hit/8, cp3035 miss
  X-Cache-Status: hit-local
  Server-Timing: cache;desc="hit-local"
  Strict-Transport-Security: max-age=106384710; includeSubDomains; preload
  X-Analytics: https=1;nocookies=1
  Access-Control-Allow-Origin: *
  Access-Control-Expose-Headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
  Timing-Allow-Origin: *
  Accept-Ranges: bytes
Length: 41624009 (40M) [image/jpeg]
Saving to: 'Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg.1'

Sunset_Toronto_Skyline_Panorama_from_Snake_I   6%[=====>                                                                                      ]   2.61M  3.08MB/s    in 0.8s

2018-12-02 23:03:15 (3.08 MB/s) - Connection closed at byte 2742090. Retrying.

--2018-12-02 23:03:16--  (try: 2)  https://upload.wikimedia.org/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
Connecting to upload.wikimedia.org (upload.wikimedia.org)|91.198.174.208|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 206 Partial Content
  Date: Sun, 02 Dec 2018 21:03:21 GMT
  Content-Type: image/jpeg
  Content-Length: 38881919
  Connection: keep-alive
  X-Object-Meta-Sha1Base36: l9pdky84pp6egzr37hchqf52mos2ui9
  Last-Modified: Sun, 25 Nov 2018 21:13:20 GMT
  Etag: 2a071c2d3446033b2c00a56506cb72ea
  X-Timestamp: 1543180399.61227
  X-Trans-Id: tx1752dbe0ae18411988a9d-005c0440ac
  X-Varnish: 176282865 171899500, 149830047 156379071, 563647982
  Via: 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1), 1.1 varnish (Varnish/5.1)
  Accept-Ranges: bytes
  Age: 0
  X-Cache: cp1076 hit/5, cp3034 hit/9, cp3035 pass
  X-Cache-Status: hit-local
  Server-Timing: cache;desc="hit-local"
  Strict-Transport-Security: max-age=106384710; includeSubDomains; preload
  X-Analytics: https=1;nocookies=1
  Content-Range: bytes 2742090-41624008/41624009
  Access-Control-Allow-Origin: *
  Access-Control-Expose-Headers: Age, Date, Content-Length, Content-Range, X-Content-Duration, X-Cache, X-Varnish
  Timing-Allow-Origin: *
Length: 41624009 (40M), 38881919 (37M) remaining [image/jpeg]
Saving to: 'Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg.1'

Sunset_Toronto_Skyline_Panorama_from_Snake_I 100%[++++++=====================================================================================>]  39.70M  2.15MB/s    in 13s

2018-12-02 23:03:30 (2.80 MB/s) - 'Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg.1' saved [41624009/41624009]
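The 206 response headers above can be checked for internal consistency: the span named in Content-Range should cover exactly Content-Length bytes and start at the byte where the first attempt was cut off. The snippet below is a small sketch of that check in Python. `check_partial_response` is a hypothetical helper written for this illustration, and it only handles the simple `bytes first-last/total` form seen in the log.

```python
import re

# Matches the form seen in the log, e.g. "bytes 2742090-41624008/41624009".
_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+)")


def check_partial_response(content_range: str, content_length: int) -> tuple[int, int, int]:
    """Parse Content-Range and verify that it agrees with Content-Length."""
    match = _CONTENT_RANGE.fullmatch(content_range.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {content_range!r}")
    first, last, total = (int(group) for group in match.groups())
    if not first <= last < total:
        raise ValueError("range bounds are inconsistent with the total size")
    # Both ends of the range are inclusive, hence the +1.
    if last - first + 1 != content_length:
        raise ValueError("Content-Length does not match the Content-Range span")
    return first, last, total


# Values copied from the second (206) response above.
first, last, total = check_partial_response("bytes 2742090-41624008/41624009", 38881919)
print(first, last, total)
```

For the logged response the check passes: the range starts at byte 2742090, where the first attempt was closed, and ends at the last byte of the 41624009-byte file.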

I can indeed reproduce the problem when fetching e.g. https://upload.wikimedia.org/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg

Trying to download the file internally seems to work:

deploy1001:~$ wget https://ms-fe.svc.eqiad.wmnet/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
--2018-12-03 11:20:14--  https://ms-fe.svc.eqiad.wmnet/wikipedia/commons/8/8e/Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg
Resolving ms-fe.svc.eqiad.wmnet (ms-fe.svc.eqiad.wmnet)... 10.2.2.27
Connecting to ms-fe.svc.eqiad.wmnet (ms-fe.svc.eqiad.wmnet)|10.2.2.27|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 41624009 (40M) [image/jpeg]
Saving to: ‘Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg’

Sunset_Toronto_Skyline_Panora 100%[==============================================>]  39.70M  50.9MB/s    in 0.8s    

2018-12-03 11:20:15 (50.9 MB/s) - ‘Sunset_Toronto_Skyline_Panorama_from_Snake_Island.jpg’ saved [41624009/41624009]

I can reproduce it as well. Received sizes and execution times are not consistent, ranging from a few hundred bytes to a couple of megabytes, and taking a few seconds. This, and more importantly the test done above by @fgiunchedi, indicates that something is going awry in the communication between Varnish and Swift.
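When comparing failing requests, it can help to see which cache hosts handled each response. The X-Cache header in the `wget -S` output earlier in this task lists one entry per host. Below is a minimal Python sketch that splits it into structured records. `CacheHop` and `parse_x_cache` are hypothetical names, and reading the number after the slash as a hit counter is an assumption not confirmed in this task.

```python
from typing import NamedTuple, Optional


class CacheHop(NamedTuple):
    host: str
    status: str          # e.g. "hit", "miss" or "pass"
    hits: Optional[int]  # number after the slash, if present (assumed to be a hit counter)


def parse_x_cache(value: str) -> list[CacheHop]:
    """Split an X-Cache header value into one record per cache host."""
    hops = []
    for entry in value.split(","):
        host, _, result = entry.strip().partition(" ")
        status, _, hits = result.partition("/")
        hops.append(CacheHop(host, status, int(hits) if hits else None))
    return hops


# Header values copied from the 200 and 206 responses logged above.
for header in ("cp1076 hit/5, cp3034 hit/8, cp3035 miss",
               "cp1076 hit/5, cp3034 hit/9, cp3035 pass"):
    print(parse_x_cache(header))
```

In both logged responses the same three hosts appear, and only the status of the last entry differs (miss on the first attempt, pass on the retry).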

jijiki triaged this task as High priority. Dec 3 2018, 1:40 PM
jijiki added a subscriber: jijiki.

Should we merge this with T190988 or vice versa?

They seem different: T190988 is about faulty uploads (which I presume would still look broken when fetched directly from Swift), whereas this task is about files that are correct on Swift but have issues when fetched through Varnish?

This also affects CropTool (hosted on Tool Labs); I'm getting reports about problems at https://github.com/danmichaelo/croptool/issues/123

Change 477424 had a related patch set uploaded (by BBlack; owner: BBlack):
[operations/puppet@production] Revert "cache: stop using nhw admission policy"

https://gerrit.wikimedia.org/r/477424

Change 477424 merged by BBlack:
[operations/puppet@production] Revert "cache: stop using nhw admission policy"

https://gerrit.wikimedia.org/r/477424

I think the patch reverted above was at fault. What I can't be sure of is whether the reversion will help immediately, or will take some time. I suspect it will have a positive effect fairly quickly (as each failed ExpKill is going to nuke quite a few objects before it ultimately fails).

BBlack claimed this task.

I can't reproduce this anymore in my own testing. I'm assuming it's fixed for now, barring further reports of continuing breakage showing up.

Change 477573 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] cache_upload: hfp on frontends for large objects except for exp

https://gerrit.wikimedia.org/r/477573

Change 477574 had a related patch set uploaded (by Ema; owner: Ema):
[operations/puppet@production] cache: stop using nhw admission policy

https://gerrit.wikimedia.org/r/477574

Change 477573 merged by Ema:
[operations/puppet@production] cache_upload: hfp on frontends for large objects except for exp

https://gerrit.wikimedia.org/r/477573

Change 477574 merged by Ema:
[operations/puppet@production] cache: stop using nhw admission policy

https://gerrit.wikimedia.org/r/477574