Pathological test case for Parsoid wt->html
Open, High, Public

Description

On this page with about 37K links ( https://ce.wikipedia.org/wiki/Декъашхо%3ATakhirgeran_Umar%2FНах_беха_меттигаш ), with the resource limit constraints switched off, curl of the page on scandium times out with Parsoid/PHP, whereas Parsoid/JS finishes in 24 seconds.

A flamegraph generated locally with parse.php shows WTUtils::reinsertFosterableContent and ProcessTreeBuilderFixups::deleteShadowMeta as the functions that take the most time.

This looks like some O(n^2) behavior.

Event Timeline

ssastry triaged this task as Medium priority. Oct 23 2019, 4:06 PM
ssastry created this task.

Looks like both of these functions call $node->parentNode->replaceChild(...), and in this case the parent node has 37K children, so this is what seems to be causing the blowup.
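A rough sketch of how this could go quadratic, assuming each replaceChild call costs time proportional to the number of siblings it has to walk. That cost model is an assumption for illustration, not a measurement of libxml, and ScanParent and total_steps are hypothetical names, not Parsoid code.

```python
class ScanParent:
    """Toy parent whose replace_child scans the child list (assumed cost model)."""

    def __init__(self, n):
        self.children = [f"link{i}" for i in range(n)]
        self.steps = 0  # children inspected across all replace_child calls

    def replace_child(self, new, old):
        # Linear scan to locate `old`; the cost grows with the child's position.
        for i, child in enumerate(self.children):
            self.steps += 1
            if child == old:
                self.children[i] = new
                return
        raise ValueError("old is not a child of this parent")


def total_steps(n):
    """Replace every child once, as a per-node DOM pass might, and count the work."""
    parent = ScanParent(n)
    for old in list(parent.children):
        parent.replace_child(old + "'", old)
    return parent.steps


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The count grows as n * (n + 1) / 2, i.e. quadratically in n.
        print(n, total_steps(n))
```

Under this model, one replacement per child on a parent with 37,000 children would inspect 37,000 * 37,001 / 2 children in total, which is about 6.8e8 steps. If replaceChild were constant time, the same pass would be linear.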

ssastry raised the priority of this task from Medium to High. Oct 25 2019, 2:14 PM

Does this mean that, unlike Domino, libxml rebuilds the children data-structure for replaceChild?