
Parser::doBlockLevels performs poorly under HHVM
Closed, Resolved
Public

Description

Parser::doBlockLevels() constructs regular expressions that embed unique strip markers. HHVM turns each regular expression pattern into a StaticString and uses it as the lookup key for its table of cached PCRE patterns. Since patterns with strip markers are unique by design, every lookup is a cache miss, so each pattern gets compiled and cached anew.

The results are:

  • The PCRE table fills up with garbage (patterns that are identical save for the strip markers).
  • Memory bloats with pattern StaticStrings.
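The effect can be reproduced in miniature. The following is only a sketch in Python, not HHVM or MediaWiki code: the `cached_compile` helper, the marker format, and `FIXED_PATTERN` are all assumptions made for illustration. It compares an unbounded pattern table fed with per-parse unique patterns against the same table fed with one fixed pattern that matches any marker of that shape.

```python
import re
import secrets


def cached_compile(table, pattern):
    """Compile on a miss and keep the result forever (no eviction)."""
    if pattern not in table:
        table[pattern] = re.compile(pattern)
    return table[pattern]


def unique_pattern():
    """Return a pattern embedding a fresh random marker prefix."""
    # Assumed marker format, for illustration only.
    marker = "UNIQ" + secrets.token_hex(8)
    return re.escape(marker) + r"-\w+-QINU"


# Assumed fixed alternative: one pattern matching any marker of that shape.
FIXED_PATTERN = r"UNIQ[0-9a-f]{16}-\w+-QINU"

unique_table = {}
fixed_table = {}
for _ in range(1000):
    cached_compile(unique_table, unique_pattern())
    cached_compile(fixed_table, FIXED_PATTERN)

# The unique-pattern table grows with every simulated parse,
# while the fixed-pattern table holds a single entry.
print(len(unique_table), len(fixed_table))
```

With unique patterns the table never gets a hit, so its size tracks the number of parses; with a fixed pattern every lookup after the first is a hit.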

Version: unspecified
Severity: normal

Details

Reference
bz72205

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 3:47 AM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz72205.
ori created this task. Oct 18 2014, 12:53 AM

Change 167411 had a related patch set uploaded by Ori.livneh:
Use a fixed regex for StripState

https://gerrit.wikimedia.org/r/167411

Change 167411 merged by jenkins-bot:
Use a fixed regex for StripState

https://gerrit.wikimedia.org/r/167411

Change 167530 had a related patch set uploaded by Ori.livneh:
Re-use marker strings across requests

https://gerrit.wikimedia.org/r/167530

(In reply to Gerrit Notification Bot from comment #2)

Change 167411 merged by jenkins-bot:
Use a fixed regex for StripState
https://gerrit.wikimedia.org/r/167411

This patch was reverted in change Ic193abcff8c72b0c8b.

hashar set Security to None.
tstarling closed this task as Resolved. Mar 9 2015, 3:59 AM

This was fixed by implementing an LRU cache for compiled PCRE patterns in HHVM.
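For readers unfamiliar with the technique, here is a minimal sketch of an LRU cache for compiled patterns, written in Python. It is not HHVM's implementation; the `LRUPatternCache` class and its capacity parameter are hypothetical names chosen for illustration. The point is that the cache has a fixed upper bound, so one-off patterns are eventually evicted instead of accumulating.

```python
import re
from collections import OrderedDict


class LRUPatternCache:
    """Bounded cache of compiled patterns with least-recently-used eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last

    def compile(self, pattern):
        if pattern in self._entries:
            # Hit: mark the entry as most recently used.
            self._entries.move_to_end(pattern)
            return self._entries[pattern]
        # Miss: compile, store, and evict the oldest entry if over capacity.
        compiled = re.compile(pattern)
        self._entries[pattern] = compiled
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)
        return compiled

    def __len__(self):
        return len(self._entries)

    def __contains__(self, pattern):
        return pattern in self._entries


cache = LRUPatternCache(capacity=3)
for p in ("a+", "b+", "c+"):
    cache.compile(p)
cache.compile("a+")   # refreshes "a+", so "b+" is now the oldest entry
cache.compile("d+")   # over capacity: "b+" is evicted
print(len(cache), "b+" in cache, "a+" in cache)
```

Under this scheme, patterns containing unique strip markers still miss on first use, but they can no longer grow the table without bound.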