Page MenuHomePhabricator

class Parser: senseless use of non-existing regexp back reference
Closed, ResolvedPublic

Description

There's a small error in line

'/(.) (?=\\?|:|;|!|%|\\302\\273)/' => '\\1 \\2',

in file

http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/parser/Parser.php?view=markup

$2 (or \2 respectively) is not defined here, because zero-width look-ahead assertions (?=...) are non-capturing patterns.

I suggest one of the following solutions

'/(.) (\\?|:|;|!|%|\\302\\273)/' => '\\1 \\2',
'/(.) (?=\\?|:|;|!|%|\\302\\273)/' => '\\1 ',
'/(?<=.) (?=\\?|:|;|!|%|\\302\\273)/' => ' ',

I don't know which of those is the best. I guess the first one is the slowest.

Btw. for uniformity you could modify the lines

'/(\\302\\253) /' => '\\1&nbsp;',
'/&nbsp;(!\s*important)/' => ' \\1', #Beware of CSS magic word !important, bug #11874.

analogously.


Version: unspecified
Severity: minor
URL: http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/parser/Parser.php?view=markup

Details

Reference
bz17119

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 10:24 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz17119.
  1. Oh, &amp;nbsp; is parsed here in bugzilla..., sorry I did not know that.
  2. and additional to that you could do a little speed-up by grouping the singe-char alternatives by char classes. So the possible solutions would be

    '/(.) ([?:;!%]|\\302\\273)/' => '\\1&amp;nbsp;\\2',

or

'/(.) (?=[?:;!%]|\\302\\273)/' => '\\1&amp;nbsp;',

or

'/(?<=.) (?=[?:;!%]|\\302\\273)/' => '&amp;nbsp;',

Oops, sorry again, &nbsp; is not parsed, it was just a copy&paste error.
Third try:

A solution would be

'/(.) ([?:;!%]|\\302\\273)/' => '\\1&nbsp;\\2',

or

'/(.) (?=[?:;!%]|\\302\\273)/' => '\\1&nbsp;',

or

'/(?<=.) (?=[?:;!%]|\\302\\273)/' => '&nbsp;',

EN.WP.ST47 wrote:

simple patch that applies the suggested change

Attached patch is incredibly simple and removes the redundant \\2. I checked the rest of Parser.php, there are no other incidences of this bug. This changes no behaviour whatsoever. I tested it anyway and all parsertests still pass.

Attached:

EN.WP.ST47 wrote:

Fixed in r93925.