Scribunto modules seem to assume presence of global document state via a global parser object
Closed, Declined, Public

Description

As T269979#6690071 notes, Scribunto modules that use parser functionality like unstrip* implicitly assume that there is a shared global parser object in which state can be stored that can be reliably accessed across invocations. Without that, unstrip* and other such methods will fail because the state they expect to query won't be accessible.
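The dependency can be illustrated with a minimal sketch. This is a toy model, not MediaWiki or Scribunto code: `StripState`, `ToyParser`, and the simplified marker format are hypothetical names invented here, assuming only that stripped content is replaced by an opaque marker whose meaning lives in per-parser state.

```python
import itertools


class StripState:
    """Toy stand-in for a parser's strip state: maps opaque markers to content."""

    def __init__(self):
        self._items = {}
        self._ids = itertools.count()

    def add(self, content):
        # Simplified, hypothetical marker format (not the real MediaWiki one).
        marker = f"\x7fUNIQ-{next(self._ids)}-QINU\x7f"
        self._items[marker] = content
        return marker

    def unstrip(self, text):
        # Only markers registered in *this* state object can be resolved.
        for marker, content in self._items.items():
            text = text.replace(marker, content)
        return text


class ToyParser:
    """Hypothetical parser that owns exactly one strip state."""

    def __init__(self):
        self.strip_state = StripState()


# Case 1: the marker is created and resolved through the same parser object.
shared = ToyParser()
marker = shared.strip_state.add("<nowiki>raw</nowiki>")
same_parser_result = shared.strip_state.unstrip(f"a {marker} b")

# Case 2: a fresh parser object has no record of the marker, so it survives.
fresh = ToyParser()
fresh_parser_result = fresh.strip_state.unstrip(f"a {marker} b")
```

In the first case the marker is replaced by the original content; in the second it is left in the text untouched, which is the failure mode described above when each invocation sees a different parser object.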

Parsoid has been gradually moving toward decoupled and independent parsing of extension and template content, and Scribunto breaks this model.

This was already broken in the Parsoid/JS model, where Parsoid issued MediaWiki API requests for expanding / preprocessing templates (every such API call would instantiate a fresh copy of the Parser object).

While Parsoid/PHP maintains a single copy of its own parser object in DataAccess.php, this is still broken because that copy is probably different from whatever Scribunto has.

So, we need to figure out how to resolve this in the short term. And, longer term, we should try to figure out how to handle this in a way that doesn't break the independent / decoupled parsing strategies Parsoid wants to follow.

While it seems reasonable to assume that a document is parsed via a shared global parser object, what happens if Parsoid uses incremental or other cache-based parsing strategies, where some constructs might never get parsed and the state corresponding to them is not maintained?

So, at the very least, we might need to add tracking code in Parsoid / preprocessor so we fall back to sequential parsing modes where such support is needed.
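One possible shape for such tracking, as a rough sketch only: record which fragments touch parser-global state, and pick the parsing mode from that record. Everything here (`StateAccessTracker`, `choose_mode`, the mode names, and the choice to fall back for the whole document rather than per fragment) is a hypothetical illustration, not an existing Parsoid or preprocessor API.

```python
class StateAccessTracker:
    """Hypothetical tracker: remembers which fragments used shared parser state."""

    def __init__(self):
        self._touched = {}  # fragment id -> set of feature names

    def record(self, fragment_id, feature):
        # Would be called whenever a fragment uses e.g. an unstrip-style method.
        self._touched.setdefault(fragment_id, set()).add(feature)

    def needs_shared_state(self):
        return bool(self._touched)

    def offending_fragments(self):
        return sorted(self._touched)


def choose_mode(tracker):
    """Assumption: one state-dependent fragment forces the whole document sequential."""
    return "sequential" if tracker.needs_shared_state() else "independent"


# A document whose fragments never touch shared state can be parsed independently.
clean = StateAccessTracker()
clean_mode = choose_mode(clean)

# A document where one module call used unstrip falls back to sequential parsing.
dirty = StateAccessTracker()
dirty.record("Module:Example#1", "unstrip")
dirty_mode = choose_mode(dirty)
```

A finer-grained variant could fall back only for the offending fragments, but that depends on whether the state they read can be reconstructed in isolation, which is the open question in this task.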