Page MenuHomePhabricator

Special characters in URL lead to redirect loop under IIS 7.5
Closed, DuplicatePublic

Description

While testing an upgrade from 1.17 to 1.26.2, I found that using a Short URL to access any of the special pages resulted in Chrome complaining that the page was looping its redirects.
Further investigation showed that this occurs for a number of special characters (at least colon and ampersand).

I had previously (when installing 1.17) resolved this issue on IIS 7.5 by ensuring that the value of the title parameter generated by the Short URL handling's redirect was url-encoded. In this instance, that fix (which was still in place) made no difference. I also tried a variety of values for $wgUsePathInfo and $wgArticlePath, with no effect.

Looking through the source code, I found that the issue appears to be in the tryNormaliseRedirect method within includes/MediaWiki.php, where the value of the requested url is now compared to what appears to be believed to be the canonical url (changed in 155d555b83eca6403e07d2094b074a8ed2f301ae):

		$targetUrl = wfExpandUrl( $title->getFullURL(), PROTO_CURRENT );

		if ( $targetUrl != $request->getFullRequestURL() ) {
			$output->setSquidMaxage( 1200 );
			$output->redirect( $targetUrl, '301' );
			return true;
		}

Due to what appears to be an IIS/PHP bug (detail below), I believe that this comparison will always fail under IIS for urls which contain special characters. During debugging I found that the value of $targetUrl will contain the url-encoded version of the url, while getFullRequestURL returns it unencoded.

I believe that this in turn is a result of PHP's `$_SERVER['REQUEST_URI'] behaving differently under Apache than IIS, as follows:

Apache 2.2.14, Ubuntu 10.04.4 LTS, PHP 5.3.2
 Postfix                         $_SERVER['REQUEST_URI']

'?title=Special%3ASpecialPages': 'string(40) "/rq_uri.php?title=Special%3ASpecialPages"'
'?title=Special:SpecialPages':   'string(38) "/rq_uri.php?title=Special:SpecialPages"'
'/Special%3ASpecialPages':       'string(34) "/rq_uri.php/Special%3ASpecialPages"'
'/Special:SpecialPages':         'string(32) "/rq_uri.php/Special:SpecialPages"'

IIS 7.5.7600.16385, Windows Server 2008 R2, PHP 5.3.28 (also tested under 5.6)
 Postfix                         $_SERVER['REQUEST_URI']

'?title=Special%3ASpecialPages': 'string(40) "/rq_uri.php?title=Special%3ASpecialPages"'
'?title=Special:SpecialPages':   'string(38) "/rq_uri.php?title=Special:SpecialPages"'
'/Special%3ASpecialPages':       'string(32) "/rq_uri.php/Special:SpecialPages"'
'/Special:SpecialPages':         'string(32) "/rq_uri.php/Special:SpecialPages"'

Note the difference in the third line of each set of samples -- where the colon is encoded in the original request, but not in the value reported.
These samples were generated by querying a simple PHP script which does var_dump($_SERVER['REQUEST_URI']) with a variety of urls. I can provide the calling script if needed.

For reference, the value of $_SERVER['PHP_SELF'] under IIS also shows the above issue for me (I considered using it to build a replacement $_SERVER['REQUEST_URI'], but couldn't).

I appreciate that the underlying issue isn't in MediaWiki, though it would be great if this could be fixed as it is a regression over previously working behaviour.

Event Timeline

I guess we will need to revert it now.

Change 309575 had a related patch set uploaded (by Paladox):
Revert "MediaWiki.php: Redirect non-standard title urls to canonical"

https://gerrit.wikimedia.org/r/309575

Kghbln triaged this task as Unbreak Now! priority.EditedNov 2 2016, 1:10 PM

To cut it short. The suggestion from Ryan of ryadel.com is to replace

if ( $targetUrl != $request->getFullRequestURL() ) {
  $output->setCdnMaxage( 1200 );
  $output->redirect( $targetUrl, '301' );
  return true;
}

with

if (strpos(strtolower($_SERVER["SERVER_SOFTWARE"]), "microsoft-iis") !== false) {
  // The script is running under IIS: urldecode everything to fix $_SERVER['REQUEST_URI'] wrong return value.
  // Ref.: https://phabricator.wikimedia.org/T127734
  if ( urldecode($targetUrl) != urldecode($request->getFullRequestURL()) )  {
    $output->setCdnMaxage( 1200 );
    $output->redirect( $targetUrl, '301' );
    return true;
  }
} else {
  // The script is not running under IIS: use the default behaviour
  if ( $targetUrl != $request->getFullRequestURL() ) {
    $output->setCdnMaxage( 1200 );
    $output->redirect( $targetUrl, '301' );
    return true;
  }
}

Perhaps this also solves the issues for Apache too which are also reported with an increasing frequency as people finally move on beyond MW 1.25. Changing priority to "Unbreak Now!" since this appears to be the only status that triggers reactions if at all. Thanks for your involvement.

Aklapper lowered the priority of this task from Unbreak Now! to Medium.Nov 2 2016, 2:40 PM

Please don't abuse the priority field to gain attention - there are e.g. IRC, emails, mailing lists to make a problem known to a bigger audience. Thanks.

Kghbln raised the priority of this task from Medium to High.Nov 2 2016, 2:43 PM

That's no abuse. That's last resort. This was on the mailinglist, this rolls in on support desk, there are related tasks, people have been pinged, a revert has be authored. Please make another suggestion to avoid this from happening next week again. Thanks.