MWHttpRequest's redirect behavior is terrible
Open, NormalPublic

Description

Per T102566#1449713 and beyond, and T103043#1381527 and the reply...our current behavior of $followRedirects in MWHttpRequest and children is problematic and renders it basically useless. It's an all-or-none proposition, with no sorts of verifications along the way to make sure we're doing safe things.

We should do the following:

  1. Always allow redirects from HTTP -> HTTPS versions of the same URL (or domain?)
  2. Never allow redirects from HTTPS -> HTTP unless the URL (or domain?) matches.
  3. Remove the flag for enabling/disabling redirection after (1) and (2) are done.

Allowing us to always redirect in safe cases and never redirect in unsafe cases allows this to behave in a way that will actually make it useful and help things like HTTP -> HTTPS transitions much easier on our users.

demon created this task.Jul 14 2015, 1:19 AM
demon added a subscriber: demon.
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJul 14 2015, 1:19 AM
BBlack added a subscriber: BBlack.EditedJul 14 2015, 1:30 AM

I really don't know what all the fallout implications are for (3). Are there cases where we do want to follow general redirects?

Re: domain-vs-hostname-vs-URL for the redirect-safety distinction: for the most-common HTTPS case, allowing a switch of protocol only (everything matches except for HTTP->HTTPS) would be sufficient. You could argue for same-domain too, but that gets tricky with defining authority boundaries for security: should the boundary be at the first, second, third, or beyond level of domainname (e.g. tesco.co.uk vs wikimedia.org)?

I'd say for the (1)+(2) part: always allow a switch from HTTP to HTTPS iff the rest of the URL is identical (clear protocol-upgrade redirect case), regardless of whether the flag can be killed in (3) or what it's set to if it isn't. And don't bother implementing a downgrade option for HTTPS->HTTP.

demon added a subscriber: csteipp.Jul 14 2015, 1:41 AM

I really don't know what all the fallout implications are for (3). Are there cases where we do want to follow general redirects?

Wherps, sorry if I was unclear. I was mainly writing this from the approach that redirects are bad unless we decide they're ok. If we removed the flag, we'd make the decision consciously that redirects are inherently bad unless they match one of the domain/URL/protocol match cases.

If we don't remove it, I assume it would be repurposed into a "take off the safety and blow your feet off" switch.

Re: domain-vs-hostname-vs-URL for the redirect-safety distinction: for the most-common HTTPS case, allowing a switch of protocol only (everything matches except for HTTP->HTTPS) would be sufficient. You could argue for same-domain too, but that gets tricky with defining authority boundaries for security: should the boundary be at the first, second, third, or beyond level of domainname (e.g. tesco.co.uk vs wikimedia.org)?

Yeah, I think this part is debatable. The only reason I asked was because it came up in an IRC convo between myself and @csteipp about it. His input here would be good :)

I'd say for the (1)+(2) part: always allow a switch from HTTP to HTTPS iff the rest of the URL is identical (clear protocol-upgrade redirect case), regardless of whether the flag can be killed in (3) or what it's set to if it isn't. And don't bother implementing a downgrade option for HTTPS->HTTP.

Yeah, so I think it'll be (1) + do we decide about repurposing the flag per earlier in this reply.

Tgr added a subscriber: Tgr.Jul 14 2015, 4:13 AM

Sometimes redirects are intentional, e.g. Special:Random. I don't think there is anything to gain by getting rid of them completely.

demon added a comment.Jul 14 2015, 1:00 PM

Sometimes redirects are intentional, e.g. Special:Random. I don't think there is anything to gain by getting rid of them completely.

That has nothing to do with how MW fetches HTTP content though (which is what we're talking about here), that's just a redirect header after we load the initial page and find out it's a redirect.

I like giving control with sane defaults, so I'd opt for making followRedirects a flag that indicates the behavior.

  • false - don't follow
  • 'ssl' (make this the default) - only allow http->https
  • 'samedomain' - allow redirects anywhere in the same fqdn
  • true - follow all redirects
hashar added a subscriber: hashar.Jul 15 2015, 8:49 AM

MWHttpRequest refuses to follow a HTTP->HTTPS redirect

Our Http class has followRedirects defaulting to false with the comment:

Note: this should only be used when the target URL is trusted, to avoid attacks on intranet services accessible by HTTP.

And indeed we don't set followRedirects when setting up the wikimediacommons wgForeignFileRepos.

@demon follow up by filling T105765: MWHttpRequest's redirect behavior is terrible