
PHP Warning: preg_match(): Compilation failed: two named subpatterns have the same name at offset 62
Open, High, Public

Description

Error

MediaWiki version: 1.35.0-wmf.1

message
PHP Warning: preg_match(): Compilation failed: two named subpatterns have the same name at offset 62

Impact

Notes

Details

Request ID
XaJqMwpAICoAAHe-eQwAAAAH
Request URL
/wiki/Wikimedia_Taiwan/wiki/index.php5/$1
Stack Trace
exception.trace
#0 [internal function]: MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 /includes/PathRouter.php(311): preg_match(string, string, NULL)
#2 /includes/PathRouter.php(285): PathRouter::extractTitle(string, stdClass)
#3 /includes/PathRouter.php(260): PathRouter->internalParse(string)
#4 /includes/WebRequest.php(205): PathRouter->parse(string)
#5 /includes/WebRequest.php(361): WebRequest::getPathInfo(string)
#6 /includes/Setup.php(780): WebRequest->interpolateTitle()
#7 /includes/WebStart.php(81): require_once(string)
#8 /index.php(41): require(string)
#9 /srv/mediawiki/w/index.php(3): require(string)
#10 {main}

Event Timeline

Krinkle created this task. Oct 13 2019, 12:29 AM

Quick glance:

  • wgArticlePath is configured as /wiki/$1, which means anything after /wiki/ that isn't a query string should be treated as the page title. Titles like "My_$1_coin" should (and generally do) work fine.
  • The PathRouterTest cases also cover this case.
  • In PathRouter::extractTitle the code correctly uses preg_quote() to escape the configured router path, and never embeds or substitutes the user input URL into the regex pattern (Good).

Some ad-hoc debugging on mwdebug1002 from this function reveals it tries three configured router paths:

PathRouter::extractTitle:
 path = /wiki/Wikimedia_Taiwan/wiki/index.php5/$1  pattern=/wiki/$1  SCRIPT_NAME=/wiki/Wikimedia_Taiwan/wiki/index.php5/$1

PathRouter::extractTitle:
 path = /wiki/Wikimedia_Taiwan/wiki/index.php5/$1  pattern=/w/index.php/$1  SCRIPT_NAME=/wiki/Wikimedia_Taiwan/wiki/index.php5/$1

PathRouter::extractTitle:
 path = /wiki/Wikimedia_Taiwan/wiki/index.php5/$1  pattern=/wiki/Wikimedia_Taiwan/wiki/index.php5/$1/$1  SCRIPT_NAME=/wiki/Wikimedia_Taiwan/wiki/index.php5/$1

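This last one is the likely cause. Something appears to also be feeding the request URL in as a router path, which seems bad. Worse, it isn't escaped in any way, so the literal $1 already in the URL ends up next to the appended /$1, presumably yielding the two identically named subpatterns from the warning.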
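To make the failure mode concrete, here is a minimal sketch of how a route with two $1 placeholders can trip PCRE. It is an approximation, not the actual PathRouter code: the function names are hypothetical, and the (?P<par1>...) group name is an assumption inferred from the "named subpatterns" wording of the warning.

```php
<?php
// Sketch only: approximates turning a router path into a regex.
// buildRouteRegex() and extractTitleSketch() are hypothetical names.
function buildRouteRegex(string $routePath): string
{
    // Escape the configured route so it is matched literally.
    $quoted = preg_quote($routePath, '#');
    // preg_quote() turns "$1" into "\$1"; replace every occurrence
    // with a named capture group (group name is an assumption).
    $withGroups = str_replace('\\$1', '(?P<par1>.*)', $quoted);
    return '#^' . $withGroups . '$#';
}

// Returns the captured title on a match, null when the route does not
// match, and false when the pattern itself fails to compile.
function extractTitleSketch(string $routePath, string $requestPath)
{
    $matches = [];
    // "@" hides the compile warning so the return value can be inspected.
    $result = @preg_match(buildRouteRegex($routePath), $requestPath, $matches);
    if ($result === false) {
        return false;
    }
    return $result === 1 ? $matches['par1'] : null;
}

$path = '/wiki/Wikimedia_Taiwan/wiki/index.php5/$1';

// A "$1" inside the requested title is harmless for a normal route.
var_dump(extractTitleSketch('/wiki/$1', '/wiki/My_$1_coin'));

// A route that itself contains "$1" twice defines the same group name
// twice, so compilation fails and preg_match() returns false.
var_dump(extractTitleSketch($path . '/$1', $path));
```

Under these assumptions the first call returns the title and the second returns false, which is consistent with only the third router path in the debug output being a problem.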

I imagine this might also explain why requests for static.php are calling this path and, moreover, trying to match against a /w/static.php/$1 path, which definitely isn't meant to happen, e.g. from https://meta.wikimedia.org/w/skins/Vector/images/arrow-down.svg

PathRouter::extractTitle:
 path = /w/skins/Vector/images/arrow-down.svg  pattern=/wiki/$1  SCRIPT_NAME=/w/static.php

PathRouter::extractTitle:
 path = /w/skins/Vector/images/arrow-down.svg  pattern=/w/index.php/$1  SCRIPT_NAME=/w/static.php

PathRouter::extractTitle:
 path = /w/skins/Vector/images/arrow-down.svg  pattern=/w/static.php/$1  SCRIPT_NAME=/w/static.php
Anomie added a subscriber: Anomie. Oct 15 2019, 3:01 PM
> path = /wiki/Wikimedia_Taiwan/wiki/index.php5/$1  pattern=/wiki/$1  SCRIPT_NAME=/wiki/Wikimedia_Taiwan/wiki/index.php5/$1

That seems wrong. $_SERVER['SCRIPT_NAME'] is supposed to be the current script's path, without the pathinfo part.

> I imagine this might also explain why requests for static.php are calling this path and, moreover, trying to match against a /w/static.php/$1 path, which definitely isn't meant to happen, e.g. from https://meta.wikimedia.org/w/skins/Vector/images/arrow-down.svg

It seems that's intentional in WebRequest's code.

if ( isset( $_SERVER['SCRIPT_NAME'] )
    && strpos( $_SERVER['SCRIPT_NAME'], '.php' ) !== false
) {
    // Check for SCRIPT_NAME, we handle index.php explicitly
    // But we do have some other .php files such as img_auth.php
    // Don't let root article paths clobber the parsing for them
    $router->add( $_SERVER['SCRIPT_NAME'] . "/$1" );
}

That was added in rMWae1d5aefbf45: Update img_auth.php and WebRequest code to handle non index.php scripts like…, which mentions T34486: WebRequest::getPathInfo() broken in img_auth.php on DreamHost.
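
As a rough illustration of how that condition behaves for the SCRIPT_NAME values seen in the debug output, the check can be mirrored as a pure function. routeForScript() is a hypothetical name used only for this sketch; the real code calls $router->add() directly.

```php
<?php
// Sketch only: mirrors the quoted condition so it can be run in isolation.
// routeForScript() is hypothetical and not part of WebRequest.
function routeForScript(array $server): ?string
{
    if (isset($server['SCRIPT_NAME'])
        && strpos($server['SCRIPT_NAME'], '.php') !== false
    ) {
        // Same concatenation as the quoted code: append a "/$1" placeholder.
        return $server['SCRIPT_NAME'] . '/$1';
    }
    // No ".php" in SCRIPT_NAME: no extra route would be added.
    return null;
}

// Ordinary case from the debug output: static.php gets its own route.
$static = routeForScript(['SCRIPT_NAME' => '/w/static.php']);

// Problem case: SCRIPT_NAME already ends in "/$1", so the derived
// route carries the placeholder twice.
$odd = routeForScript(['SCRIPT_NAME' => '/wiki/Wikimedia_Taiwan/wiki/index.php5/$1']);

var_dump($static, $odd);
```

With these inputs the function yields /w/static.php/$1 and /wiki/Wikimedia_Taiwan/wiki/index.php5/$1/$1, matching the third pattern in each group of debug lines.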

daniel triaged this task as High priority. Nov 11 2019, 8:06 PM