
Username Blacklist creates huge regexes, which could unintentionally crash PCRE
Closed, Declined (Public)

Description

Author: fran

Description:
Replaces the 'mega-regex' approach with one regex for each blacklist entry

Recently the sysadmins disabled the UsernameBlacklist extension on WMF wikis because it was causing account creation to crash the servers, due to some complicated regexes in place on en.wiki. Someone pointed out on wikitech-l that it was odd that TitleBlacklist, with even more complicated entries on en.wiki, didn't cause the same problems, and noted that the only real difference in how they work is that UsernameBlacklist concatenates all the blacklist entries into one "mega-regex."

Looking at the code, I think this 'mega-regex' is the root of the problem. The function UsernameBlacklist::safeBlacklist() in UsernameBlacklist.php concatenates every single entry in the blacklist into one regular expression, which it then passes to preg_match(). However, PCRE's documentation notes that regexes are processed by a recursive function in the C library, which can cause a stack overflow and crash the process - in this case, Apache/PHP. I surmise this is the problem that we've been having: multiple complicated blacklist entries are combined by UsernameBlacklist into one huge regex, which causes PCRE to overflow its stack when processing it.
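To make the 'mega-regex' idea concrete, here is a minimal sketch in Python rather than the extension's PHP. The function name build_mega_regex, the sample entries, and the exact joining syntax are all assumptions for illustration; the ticket does not show how safeBlacklist() actually assembles its pattern.

```python
import re

def build_mega_regex(entries):
    """Join every blacklist entry into one alternation (hypothetical helper)."""
    # Each entry is wrapped in a non-capturing group so that an alternation
    # inside one entry cannot leak into its neighbours.
    return "(?:" + ")|(?:".join(entries) + ")"

# Hypothetical blacklist entries, invented for this example.
entries = [r"admin.*", r".*sysop.*", r"w+i+k+i+.*bot"]

mega = build_mega_regex(entries)
print(mega)       # one pattern whose size grows with every entry added
print(len(mega))  # the engine has to handle all entries as a single regex
print(bool(re.search(mega, "WikiBot", re.IGNORECASE)))
```

With a combined pattern like this, the regex engine sees the whole list as one expression, so the cost and nesting of every entry add up in a single call.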

I've attached a patch for UsernameBlacklist that replaces the "mega-regex" approach with one regex for each blacklist entry, the same approach taken by TitleBlacklist.
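The per-entry approach can be sketched as follows, again in Python rather than PHP. This is not the attached patch: is_blacklisted and the sample entries are hypothetical, and skipping malformed entries is a design choice of this sketch, not something the ticket states about TitleBlacklist.

```python
import re

def is_blacklisted(username, entries):
    """Return the first entry that matches username, or None (hypothetical helper)."""
    for entry in entries:
        try:
            # Each entry is compiled and matched as its own small regex.
            pattern = re.compile(entry, re.IGNORECASE)
        except re.error:
            # Assumption: a malformed entry is skipped instead of
            # invalidating the whole list.
            continue
        if pattern.search(username):
            return entry
    return None

# Hypothetical entries; the second one is deliberately malformed.
entries = [r"admin.*", r"(unclosed", r".*sysop.*"]

print(is_blacklisted("TheSysop", entries))
print(is_blacklisted("Alice", entries))
```

Because each entry is handled separately, the engine never sees more than one entry's worth of pattern at a time, and a single bad or expensive entry can be identified by name.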


Version: unspecified
Severity: major

Attached:

Details

Reference
bz14941

Event Timeline

bzimport raised the priority of this task to Lowest. (Nov 21 2014, 10:09 PM)
bzimport set Reference to bz14941.
bzimport added a subscriber: Unknown Object (MLST).

lilewyn wrote:

Fix included in patch submitted in 15010

fran wrote:

I'm working on replacing UsernameBlacklist with an extended TitleBlacklist, so marking this WONTFIX.