
upload.wikimedia.org returns HTTP 501 instead of 416 for non-satisfiable byte ranges
Closed, Duplicate · Public

Description

Per the spec, the server should return status code 416 'Requested Range Not Satisfiable' when a request's Range header doesn't overlap the available file data:
https://tools.ietf.org/html/rfc2616#section-10.4.17

Currently I'm seeing HTTP 501 responses instead for this case. Examples:

# In range, returns HTTP 206
curl -v  'https://upload.wikimedia.org/wikipedia/commons/transcoded/7/7c/Caminandes_-_Gran_Dillama_-_Blender_Foundation%27s_new_Open_Movie.webm/Caminandes_-_Gran_Dillama_-_Blender_Foundation%27s_new_Open_Movie.webm.480p.webm' \
-XGET \
-H 'Range: bytes=20000000-20076624' > /dev/null
# Out of range, returns HTTP 501
curl -v  'https://upload.wikimedia.org/wikipedia/commons/transcoded/7/7c/Caminandes_-_Gran_Dillama_-_Blender_Foundation%27s_new_Open_Movie.webm/Caminandes_-_Gran_Dillama_-_Blender_Foundation%27s_new_Open_Movie.webm.480p.webm' \
-XGET \
-H 'Range: bytes=20076624-21125199' > /dev/null

This is probably fairly harmless, but returning a 416 would be better, since that response can carry a Content-Range header stating the file's actual length.

(Note that a range that extends from within the data to beyond the end of the file will return the part that is valid. This only affects requests for a range that lies entirely outside the file.)
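
The expected behaviour described above can be sketched in a few lines of Python. This is only an illustration of the spec's rules for a single byte range, not Swift's or Varnish's actual code; the function name evaluate_range is hypothetical, and the file length 20076624 is inferred from the curl examples rather than confirmed.

```python
import re


def evaluate_range(range_header, content_length):
    """Return (status, content_range) for a single 'bytes=' range.

    Simplified sketch: one range only, no multi-range or If-Range handling.
    """
    match = re.fullmatch(r"bytes=(\d*)-(\d*)", range_header.strip())
    if not match or match.group(0) == "bytes=-":
        # A malformed Range header is ignored and the full body is served.
        return 200, None
    first, last = match.groups()
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        suffix = int(last)
        if suffix == 0 or content_length == 0:
            return 416, f"bytes */{content_length}"
        start = max(content_length - suffix, 0)
        end = content_length - 1
    else:
        start = int(first)
        if last and int(last) < start:
            # An inverted range is invalid, so the header is ignored.
            return 200, None
        if start >= content_length:
            # Entirely past the end of the file: not satisfiable.
            return 416, f"bytes */{content_length}"
        # A range that runs past the end is clipped to the last byte.
        end = min(int(last), content_length - 1) if last else content_length - 1
    return 206, f"bytes {start}-{end}/{content_length}"


# Assumed file length, inferred from the examples in this task.
LENGTH = 20076624
print(evaluate_range("bytes=20000000-20076624", LENGTH))
print(evaluate_range("bytes=20076624-21125199", LENGTH))
```

Under that assumed length, the first request overlaps the file and is clipped to its last byte, while the second starts at the first offset past the end and should yield 416 with a Content-Range of the form bytes */LENGTH.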

Event Timeline

brion created this task. Oct 2 2016, 8:59 PM
Restricted Application added a subscriber: Aklapper. Oct 2 2016, 8:59 PM
Krenair added a subscriber: Krenair. Oct 2 2016, 9:07 PM
Ankry added a subscriber: Ankry. Oct 2 2016, 9:07 PM
fgiunchedi added a subscriber: fgiunchedi.

@brion it seems this is fixed in newer Swift versions (I get a 416 from curl -v http://ms-fe.esams.wmnet/monitoring/backend -H 'Range: bytes=42-43' -o /dev/null)

brion closed this task as Resolved. Aug 13 2017, 8:52 PM

Per the above, this has presumably been fixed by updates to Swift. Closing; thanks!

ema reopened this task as Open. Mar 29 2018, 7:02 AM
ema triaged this task as Medium priority.
ema added a project: Traffic.
Restricted Application added a project: Operations. Mar 29 2018, 7:03 AM
ema added a subscriber: ema. Mar 29 2018, 7:16 AM

Reopening this bug as swift still returns 501 when it should return 416.

I noticed a sudden surge of 501 responses on cache upload this morning. All requests resulting in 501s that I have been able to capture with varnishlog were sent by User-Agent: Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98), and all of them were Range requests for bytes=${CONTENT_LENGTH}-, which are unsatisfiable.

For example:

$ curl -s -I http://ms-fe.svc.eqiad.wmnet/wikipedia/commons/thumb/8/8e/%22El_Padre_Posadas_dando_las_reglas_y_constituciones_a_los_servitas_de_San_Jacinto_en_1707%22._Lienzo_del_siglo_XIX._Iglesia_de_los_Dolores_de_C%C3%B3rdoba.JPG/576px-%22El_Padre_Posadas_dando_las_reglas_y_constituciones_a_los_servitas_de_San_Jacinto_en_1707%22._Lienzo_del_siglo_XIX._Iglesia_de_los_Dolores_de_C%C3%B3rdoba.JPG | grep Content-Length
Content-Length: 71157

Trying to request bytes 71157-:

$ curl -s http://ms-fe.svc.eqiad.wmnet/wikipedia/commons/thumb/8/8e/%22El_Padre_Posadas_dando_las_reglas_y_constituciones_a_los_servitas_de_San_Jacinto_en_1707%22._Lienzo_del_siglo_XIX._Iglesia_de_los_Dolores_de_C%C3%B3rdoba.JPG/576px-%22El_Padre_Posadas_dando_las_reglas_y_constituciones_a_los_servitas_de_San_Jacinto_en_1707%22._Lienzo_del_siglo_XIX._Iglesia_de_los_Dolores_de_C%C3%B3rdoba.JPG -H 'Range: bytes=71157-' ; echo                                                 
501 Not Implemented                                                                                                                                           

 The request method GET is not implemented for this server. 

 Unknown Status: 416

The response is confused in two ways: the error message claims that GET is not implemented (!), and the proper status code, 416, appears only in the response body. Perhaps the Swift backend does the right thing and the Swift frontend gets confused?
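
The reproduction above (read Content-Length, then request bytes=${CONTENT_LENGTH}-) could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the helper names unsatisfiable_range and probe are hypothetical, and it assumes the server answers HEAD requests with a Content-Length header.

```python
import urllib.error
import urllib.request


def unsatisfiable_range(content_length):
    """Build a Range value starting at the first offset past the end of the file."""
    return f"bytes={content_length}-"


def probe(url):
    """Return the HTTP status a server gives for an entirely out-of-range request."""
    # Step 1: learn the object's size, as with `curl -s -I ... | grep Content-Length`.
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as response:
        content_length = int(response.headers["Content-Length"])

    # Step 2: ask for bytes=<length>-, which no server can satisfy.
    get = urllib.request.Request(
        url, headers={"Range": unsatisfiable_range(content_length)}
    )
    try:
        with urllib.request.urlopen(get) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is what we want.
        return error.code
```

Calling probe() with one of the URLs above should return 416 from a server that follows the spec; in the case reported here it would return 501.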

ema moved this task from Triage to Watching on the Traffic board. Mar 29 2018, 7:40 AM
fgiunchedi removed fgiunchedi as the assignee of this task. Mar 29 2018, 10:14 AM
fgiunchedi added a project: User-fgiunchedi.

Quite possible! Initially I thought it might have to do with thumbnails (and thus thumbor in the pipeline) but originals have the same problem:

tin:~$ curl -v -H 'Range: bytes=1537776-' http://ms-fe.svc/wikipedia/commons/8/8e/%22El_Padre_Posadas_dando_las_reglas_y_constituciones_a_los_servitas_de_San_Jacinto_en_1707%22._Lienzo_del_siglo_XIX._Iglesia_de_los_Dolores_de_C%C3%B3rdoba.JPG 
* Hostname was NOT found in DNS cache
*   Trying 10.2.2.27...
* Connected to ms-fe.svc (10.2.2.27) port 80 (#0)
> GET /wikipedia/commons/8/8e/%22El_Padre_Posadas_dando_las_reglas_y_constituciones_a_los_servitas_de_San_Jacinto_en_1707%22._Lienzo_del_siglo_XIX._Iglesia_de_los_Dolores_de_C%C3%B3rdoba.JPG HTTP/1.1
> User-Agent: curl/7.38.0
> Host: ms-fe.svc
> Accept: */*
> Range: bytes=1537776-
> 
< HTTP/1.1 501 Not Implemented
< Content-Length: 103
< Content-Type: text/plain; charset=UTF-8
< X-Trans-Id: txc2c42fa7eca0445c910eb-005abcbaa7
< Date: Thu, 29 Mar 2018 10:06:31 GMT
< 
501 Not Implemented

 The request method GET is not implemented for this server. 

* Connection #0 to host ms-fe.svc left intact
 Unknown Status: 416tin:~$

Interestingly, that error message comes from webob, which Swift itself has stopped using (it uses its own bundled swob instead). Our rewrite middleware does still use webob, though, so I'm assuming it is the culprit here.

I'm removing myself as assignee, since I don't think I'll be able to prioritize this soon, but I'm adding it to my workboard.