Page MenuHomePhabricator

Redirect with a question mark '?' in the title treats everything following it as URL query part when updating the URL
Closed, ResolvedPublic

Event Timeline

The actual URL is produced by the PHP code in Article::showRedirectedFromHeader() (search for 'wgInternalRedirectTargetUrl') and it's wrong there. The JavaScript code in mediawiki.action.view.redirect.js just blindly uses it.

I can't reproduce this issue locally. It must be some kind of a problem with Wikimedia wikis' configuration (messed up short URLs? :P).

This probably doesn't have to do with the apache config.

Requesting this url to an appserver directly I get a '200 OK' response, so the redirect is most definitely done within mediawiki (via javascript?) and not via any fancy apache rewrite.

I suspect some difference can be made by the apache mod_proxy_fcgi that could just pass the url wrongly divided to HHVM.

Mentioned in SAL [2016-02-29T16:02:21Z] <_joe_> stopping hhvm on mw1050 (depooled) for testing bug T128380

Testing this on one machine after turning off HHVM and intercepting the data with nc -l 9000. I found that the data passed to HHVM is a mixed bag of already-decoded and non-decoded nonsense:

SCRIPT_URL=/wiki/Why_Me?_(Daniel_Johnston_Album) (decoded)
SCRIPT_FILENAME=proxy:fcgi://127.0.0.1:9000/srv/mediawiki/docroot/wikipedia.org/w/index.php/Why_Me%3F_(Daniel_Johnston_Album) (encoded)
SCRIPT_NAME=/wiki/Why_Me?_(Daniel_Johnston_Album) 
REQUEST_URI=/wiki/Why_Me%3F_(Daniel_Johnston_Album)

I am pretty sure this is the root of the problem.

@matmarex what is the relevant part of mediawiki's code? does it uses any of said variables?

MediaWiki code in Article::showRedirectedFromHeader() just does this:

// Construct a URL for the current page view, but with the target title
$query = $request->getValues();
...
$redirectTargetUrl = $this->getTitle()->getLinkURL( $query );

And then this:

// Add the script to update the displayed URL and
// set the fragment if one was specified in the redirect
$outputPage->addJsConfigVars( [
	'wgInternalRedirectTargetUrl' => $redirectTargetUrl,
] );

$request is a WebRequest:

public function __construct() {
	$this->requestTime = isset( $_SERVER['REQUEST_TIME_FLOAT'] )
		? $_SERVER['REQUEST_TIME_FLOAT'] : microtime( true );

	// POST overrides GET data
	// We don't use $_REQUEST here to avoid interference from cookies...
	$this->data = $_POST + $_GET;
}
public function getValues() {
	$names = func_get_args();
	if ( count( $names ) == 0 ) {
		$names = array_keys( $this->data );
	}

	$retVal = [];
	foreach ( $names as $name ) {
		$value = $this->getGPCVal( $this->data, $name, null );
		if ( !is_null( $value ) ) {
			$retVal[$name] = $value;
		}
	}
	return $retVal;
}

So as far as I can tell, MediaWiki just takes everything from $_GET directly ($_POST is probably not relevant here), and appends it to the new "redirected" URL. I don't really know how PHP/HHVM produces the $_GET values.

I think the same is happening on the image statement on this Wikidata-item, but it doesn't happen every time.

I think the same is happening on the image statement on this Wikidata-item, but it doesn't happen every time.

That looks like a different issue. Wikidata is rendering the link incorrectly (the '?' should be encoded as '%3F', but it's not). This apparently only affects JS code, the output from PHP is fine.

I think the same is happening on the image statement on this Wikidata-item, but it doesn't happen every time.

That looks like a different issue. Wikidata is rendering the link incorrectly (the '?' should be encoded as '%3F', but it's not). This apparently only affects JS code, the output from PHP is fine.

Sorry, could be T128078 then. It's always annoying that tasks get closed before the fix is deployed. Was only searching for open tasks.

matmarex added a subscriber: MaxSem.

I can't reproduce this problem any more. I assume this was fixed by migrating to PHP 7 (T176370).