Add special handling for annotation tags in template arguments
Open, Medium, Public

Description

This is a spinoff from T295406, based on rt-testing failures on metawiki:Wikimedia Community User Group Malaysia and metawiki:Wikimedia Wikimeet India 2021, which exposed a larger issue: how do we handle annotations found in arguments to templates?

Let us take translate as an example and look at {{some-template|arg1=<translate>foo</translate>|arg2=<translate>bar</translate>}}.

The Translate extension currently strips these annotations from the arguments, so when this wikitext is transformed to HTML, what is actually transformed is {{some-template|arg1=foo|arg2=bar}}. Parsoid, however, does not do this. It passes the whole string through unchanged (and did so even before we started working on T261181), which is incorrect. An example on mediawiki.org demonstrates this: compare the legacy parser's output for that page with Parsoid's output for the same page.
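
To make the stripping step concrete, here is a minimal sketch in Python. It is not the Translate extension's or Parsoid's actual code (both are PHP); the function name `strip_annotations` and the `ANNOTATION_TAGS` tuple are hypothetical, and the regex ignores complications such as `<nowiki>` or nested markup.

```python
import re

# Hypothetical list of annotation tag names; "translate" is the only one
# named in this task.
ANNOTATION_TAGS = ("translate",)


def strip_annotations(wikitext, tags=ANNOTATION_TAGS):
    """Remove opening and closing annotation tags, keeping the text between them."""
    names = "|".join(re.escape(t) for t in tags)
    # Matches <translate>, </translate>, and opening tags carrying attributes.
    pattern = re.compile(r"</?(?:%s)\b[^>]*>" % names, re.IGNORECASE)
    return pattern.sub("", wikitext)


source = "{{some-template|arg1=<translate>foo</translate>|arg2=<translate>bar</translate>}}"
print(strip_annotations(source))
# → {{some-template|arg1=foo|arg2=bar}}
```

The printed result is the wikitext that, per the description above, the legacy parser effectively expands.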

The current behaviour of the Translate extension makes sense. Annotations exist simply to demarcate regions of wikitext and shouldn't otherwise affect processing, so they should be stripped from every context where they might modify results. In particular, both template arguments and extension arguments should have annotations stripped from them. We then need a mechanism for recording the stripped annotation ranges so that the extension implementing the annotation can extract them for its own processing. Stuffing them into the data-mw object in some form is one solution. A spec needs to be proposed and implemented.
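
As a starting point for discussion only, the sketch below shows one way the stripped ranges could be recorded alongside each argument. Everything here is an assumption: `strip_and_record`, `to_data_mw`, and the `annotations` key are hypothetical, the output is only loosely modeled on the shape of data-mw for templates, and no spec for this exists yet. Offsets are character positions in the stripped argument text.

```python
import re

# Only "translate" is handled, since it is the example used in this task.
TAG_RE = re.compile(r"<(/?)(translate)\b[^>]*>", re.IGNORECASE)


def strip_and_record(arg_value):
    """Return (stripped_text, ranges); range offsets index into stripped_text."""
    out = []         # pieces of the argument with tags removed
    out_len = 0      # length of the stripped text so far
    last = 0         # end of the previous tag in the original string
    open_stack = []  # (tag name, start offset in stripped text)
    ranges = []
    for m in TAG_RE.finditer(arg_value):
        piece = arg_value[last:m.start()]
        out.append(piece)
        out_len += len(piece)
        last = m.end()
        closing, name = m.group(1) == "/", m.group(2).lower()
        if not closing:
            open_stack.append((name, out_len))
        elif open_stack and open_stack[-1][0] == name:
            _, start = open_stack.pop()
            ranges.append({"name": name, "start": start, "end": out_len})
        # Unmatched tags are stripped but produce no range.
    out.append(arg_value[last:])
    return "".join(out), ranges


def to_data_mw(template, args):
    """Build a data-mw-like dict; the "annotations" key is invented for this sketch."""
    params = {}
    for key, value in args.items():
        text, ranges = strip_and_record(value)
        params[key] = {"wt": text}
        if ranges:
            params[key]["annotations"] = ranges
    return {"parts": [{"template": {"target": {"wt": template}, "params": params}}]}


print(strip_and_record("pre <translate>foo</translate> post"))
# → ('pre foo post', [{'name': 'translate', 'start': 4, 'end': 7}])
```

With ranges stored this way, the template would be expanded from the stripped text, while the extension that owns the annotation could recover each region by slicing the stripped text with `start` and `end`. Whether offsets, DOM markers, or some other representation is the right choice is exactly what the proposed spec would need to settle.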