
Update to pegjs 0.8
Closed, Resolved (Public)

Description

pegjs 0.8 promises better performance, along with a number of other improvements:

https://github.com/gwicke/pegjs/blob/master/CHANGELOG.md

At a minimum, we'll need to change all references to pos and pos0 to peg$pos etc., and rework the cache key patch regexp in mediawiki.tokenizer.peg.js.
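As a rough illustration of the rename, a rewrite that only touches whole identifiers might look like the sketch below. The `renameIdentifiers` helper and the rename map passed to it are hypothetical; the correct 0.8 replacement for each old variable would still need to be checked against the generated parser source.

```javascript
// Hypothetical helper: rename bare identifiers inside a source string.
// `renames` maps old names to new names; the mapping is supplied by the
// caller and is an assumption, not something taken from pegjs itself.
function renameIdentifiers(source, renames) {
  // Match whole identifiers. `$` is legal in JS identifiers but is not a
  // "word" character for \b or \w, so the character classes are explicit.
  return source.replace(/[A-Za-z_$][A-Za-z0-9_$]*/g, function (name, offset) {
    // Leave property accesses such as `foo.pos` alone.
    if (offset > 0 && source[offset - 1] === ".") {
      return name;
    }
    // A function replacer returns its value literally, so the `$` in names
    // like peg$pos is never treated as a replacement pattern.
    return Object.prototype.hasOwnProperty.call(renames, name)
      ? renames[name]
      : name;
  });
}

// Usage: only `pos` is mapped here; `pos0`, `b.pos` and `position` are left
// untouched, which a naive /pos/g replacement would not guarantee.
const rewritten = renameIdentifiers(
  "var a = pos - pos0; b.pos = 1; position++;",
  { pos: "peg$pos" }
);
```

Note that this operates on raw text, so identifiers inside string literals and comments are rewritten as well; that may or may not be acceptable for the grammar's action code.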

Also relevant:

"Removed the toSource method of generated parsers and introduced a new output option of the PEG.buildParser method. It allows callers to specify whether they want to get back the parser object or its source code."
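For call sites that currently build a parser and then serialize it, the change quoted above would mean passing the `output` option instead. A minimal sketch, assuming the pegjs module is handed in by the caller; the `getParserSource` name and the injection are illustrative and not taken from our code:

```javascript
// Illustrative wrapper. `PEG` is the loaded pegjs 0.8 module, passed in by
// the caller so that this snippet has no hard dependency on the package.
function getParserSource(PEG, grammar) {
  // Pre-0.8 style was: PEG.buildParser(grammar).toSource()
  // 0.8 style: ask buildParser for the source string directly.
  return PEG.buildParser(grammar, { output: "source" });
}
```

With the real library this would be called as `getParserSource(require('pegjs'), grammarText)`. Leaving `output` at its default, `"parser"`, returns the parser object rather than its source.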


Version: unspecified
Severity: enhancement

Details

Reference
bz60517

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 3:08 AM)
bzimport set Reference to bz60517.

Rather than using peg$pos, we should probably use the public offset() method now available in 0.8.

The current tokenizer (built with pegjs 0.6) accounts for 24% of our CPU time when parsing [[en:Barack Obama]], so there is a good amount of potential here.

Change 130561 had a related patch set uploaded by Arlolra:
WIP: Upgrade to pegjs v0.8

https://gerrit.wikimedia.org/r/130561