Use one ownerDocument for the entire parse
Open, Normal · Public

Description

Currently, each pipeline creates its own document and, at various stages, the results of those parses need to be adopted by the main document.

We expect some performance gains by eliminating that work.
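For illustration, the adoption step being eliminated might look roughly like the sketch below. The helper name `adoptChildren` and its parameters are hypothetical and not taken from the Parsoid source; it only assumes the standard DOM `Document.adoptNode` and `Node.appendChild` calls.

```javascript
// Hypothetical helper, not Parsoid code: move every child of `srcBody`
// (parsed in a separate document) under `target`, which belongs to `mainDoc`.
function adoptChildren(mainDoc, srcBody, target) {
  while (srcBody.firstChild) {
    // adoptNode detaches the node from its old parent and re-homes the
    // whole subtree in mainDoc; this is the per-node work a single shared
    // ownerDocument would avoid.
    var adopted = mainDoc.adoptNode(srcBody.firstChild);
    target.appendChild(adopted);
  }
  return target;
}
```

With one ownerDocument for the entire parse, every node is already owned by the main document, so a loop like this becomes unnecessary.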

See https://gerrit.wikimedia.org/r/#/c/385312/

Also, there are a few places where we create a dummyDoc to avoid string concatenation; those should use the shared document as well.

Event Timeline

Arlolra created this task. (Oct 26 2017, 4:28 PM)
Restricted Application added a subscriber: Aklapper. (Oct 26 2017, 4:28 PM)
Arlolra triaged this task as Normal priority. (Oct 26 2017, 4:32 PM)
cscott added a subscriber: cscott. (Oct 26 2017, 4:37 PM)

I think generally the pattern should be:

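// Create the container from the shared document, so its nodes never need adopting.
var df = env.ownerDocument.createDocumentFragment();
var tempBody = env.ownerDocument.createElement('body');
df.appendChild(tempBody);
// Parse the string in the context of a <body> element.
tempBody.innerHTML = "some string to parse";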

as opposed to:

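var domino = require('domino');
// Creates a separate document; its nodes must later be adopted by the main document.
var newDoc = domino.createDOMImplementation().createHTMLDocument();
newDoc.documentElement.innerHTML = "some string to parse";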

The first pattern uses Document.createDocumentFragment to hold a DOM tree that belongs to the main document but isn't linked into the document tree itself, so the result never has to be adopted.
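If this pattern ends up repeated in several places, it could be wrapped in a small helper along these lines. This is only a sketch: the name `parseIntoFragment` is made up for illustration, and `doc` stands in for `env.ownerDocument`.

```javascript
// Hypothetical helper, not an existing Parsoid or domino API.
// `doc` is assumed to be the shared document (e.g. env.ownerDocument).
function parseIntoFragment(doc, html) {
  // The fragment is owned by `doc` but is not part of its tree.
  var df = doc.createDocumentFragment();
  // A temporary <body> gives innerHTML the right parsing context.
  var tempBody = doc.createElement('body');
  df.appendChild(tempBody);
  tempBody.innerHTML = html;
  return df;
}
```

A caller would then read the parsed nodes from the temporary body, e.g. `parseIntoFragment(env.ownerDocument, "some string to parse").firstChild.childNodes`, and could move them into the main tree without an adoption step.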

bearND added a subscriber: bearND. (Oct 26 2017, 5:17 PM)