Page MenuHomePhabricator

Look at selser failures in rt-testing caused by the Annotations feature
Closed, ResolvedPublic

Description

T295330: WrapAnnotations breaks DSR correctness in some cases was one instance of a bug that caused selser failures.

Here are some additional pages with selser failures that should be investigated.

metawiki:Wikimedia Community User Group Malaysia
metawiki:Wikimedia Wikimeet India 2021
metawiki:WMF_Resolutions/Recognition_of_Amical_Wikimedia
metawiki:WMF_Resolutions/2008-09_Budget

They may either be DSR issues OR selser issues.

Related Objects

StatusSubtypeAssignedTask
OpenReleaseNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenFeatureNone
OpenNone
OpenNone
OpenNone
OpenNone
OpenNone
ResolvedEsanders
OpenFeatureNone
Resolvedihurbain
Resolvedihurbain
Resolvedssastry

Event Timeline

ssastry triaged this task as Medium priority.Nov 9 2021, 9:00 PM

On the metawiki:Wikimedia Wikimeet India 2021 page, the following session clarifies the problem:

[subbu@earth:~/work/wmf/parsoid] cat /tmp/wt
{{Infobox recurring event|people='''<translate><!--T:19--> Advisor</translate>:''' [[User:Lahariyaniyathi|<translate><!--T:20--> Tanveer Hasan</translate>]]}}
[subbu@earth:~/work/wmf/parsoid] php bin/parse.php --dump tplsrc --domain meta.wikimedia.org < /tmp/wt
[dump/tplsrc] ================================================================================
[dump/tplsrc] TEMPLATE: Template:Infobox_recurring_event ; TRANSCLUSION: "{{Infobox recurring event|people='''<translate><!--T:19--> Advisor</translate>:''' [[User:Lahariyaniyathi|<translate><!--T:20--> Tanveer Hasan</translate>]]}}"
[dump/tplsrc] --------------------------------------------------------------------------------
[dump/tplsrc] <templatestyles src="Module:Infobox/styles.css"></templatestyles><table class="infobox" style="width:22em"><tr><th colspan="2" style="text-align:center;font-size:125%;font-weight:bold">MyLanguage/Main Page</th></tr><tr><th scope="row">People</th><td>'''</td></tr><translate> Advisor</translate>:''' [[User:Lahariyaniyathi|<tr><td colspan=2>
<translate> Tanveer Hasan</translate>]]</td></tr></table>
[dump/tplsrc] --------------------------------------------------------------------------------
<style data-mw-deduplicate="TemplateStyles:r21671784" typeof="mw:Extension/templatestyles mw:Transclusion" about="#mwt1" data-parsoid='{"pi":[[{"k":"people","named":true}]],"dsr":[0,158,null,null]}' data-mw='{"parts":[{"template":{"target":{"wt":"Infobox recurring event","href":"./Template:Infobox_recurring_event"},"params":{"people":{"wt":"&apos;&apos;&apos;&lt;translate>&lt;!--T:19--> Advisor&lt;/translate>:&apos;&apos;&apos; [[User:Lahariyaniyathi|&lt;translate>&lt;!--T:20--> Tanveer Hasan&lt;/translate>]]"}},"i":0}}]}'>.mw-parser-output .infobox{float:right;clear:right;margin-bottom:0.5em;margin-left:1em;padding:0.2em;border:1px solid #A2A9B1;background:#F8F9FA;color:black}.mw-parser-output .infobox td,.mw-parser-output .infobox th{vertical-align:top}.mw-parser-output .infobox caption{margin-left:inherit;font-size:larger}.mw-parser-output .infobox.bordered{border-collapse:collapse}.mw-parser-output .infobox.bordered td,.mw-parser-output .infobox.bordered th{border:1px solid #A2A9B1}.mw-parser-output .infobox.bordered .borderless td,.mw-parser-output .infobox.bordered .borderless th{border:0}</style><p about="#mwt1"> Advisor:<a rel="mw:WikiLink" href="./User:Lahariyaniyathi" title="User:Lahariyaniyathi">
<meta typeof="mw:Annotation/translate" data-mw='{"rangeId":"mwa0","extendedRange":false,"wtOffsets":[341,352]}'/> Tanveer Hasan<meta typeof="mw:Annotation/translate/End" data-mw='{"wtOffsets":[366,378]}'/></a></p><table class="infobox" style="width:22em" about="#mwt1" data-parsoid='{"stx":"html"}'><tbody><tr><th colspan="2" style="text-align:center;font-size:125%;font-weight:bold">MyLanguage/Main Page</th></tr><tr><th scope="row">People</th><td><b></b></td></tr><meta typeof="mw:Annotation/translate" data-mw='{"rangeId":"mwa1"}'/><meta typeof="mw:Annotation/translate/End" data-mw='{"rangeId":"mwa1"}'/> </tbody></table>

There is text including translate tags in a fosterable position and this is not properly detected and hoisted out by the WrapAnnotations code. There was discussion about this and code to handle this as well, so probably some bug or edge case issue in that algo. Anyway, leaving behind the meta tags in a fosterable position then is a cause of dirty diffs.

Okay, I looked at this briefly and this is something we missed but annotations in template arguments cannot be passed through as is since (a) passing them through changes the behavior of templates since the arguments are different from what is expected from annotations (b) it also breaks the expectations of the wrap-annotations pass where it expects to see valid TSR values and they aren't valid when tokenized in template source. They should be stripped before passing them on to the core preprocessor. This stripping could be done in the tokenizer OR in the template-handler. Probably better to do it in the tokenizer. Separately, we would have to capture the translate ranges for template arguments separately and add it to data-mw probably in a new key.

I might implement the first bit to unblock this.

As to why this breaks, in this particular example, passing the <translate> tag through to some arguments leads to the output have content in fosterable positions whereas stripping them doesn't. Additionally, TSR values are also missing for the ranges from templates. So, these annotation ranges aren't recognized by DOMRangeBuilder and they don't get fostered-content protection and everything breaks downstream.

I was curious why <includeonly> and such tags aren't impacted by fostering and turns out that is because we let those metas get fostered by the tree builder (unlike annotation metas). But, maybe we should let annotation metas get fostered out as well since we can detect when the metas came from the table (because they show up in the foster box) - that could also simplify the foster-protection logic in the annotation wrapping code.

Change 737825 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Strip annotations seen in template arguments for now

https://gerrit.wikimedia.org/r/737825

Change 737934 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Don't break template continuiting when moving annotation range metas

https://gerrit.wikimedia.org/r/737934

Change 738029 had a related patch set uploaded (by Subramanya Sastry; author: Subramanya Sastry):

[mediawiki/services/parsoid@master] Hack: Drop annotation tokens in template context

https://gerrit.wikimedia.org/r/738029

Change 737934 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Don't break template continuity when moving annotation range metas

https://gerrit.wikimedia.org/r/737934

Change 738029 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Hack: Drop annotation tokens in template context

https://gerrit.wikimedia.org/r/738029

I've opened followup tasks for the unresolved pieces.

Change 742570 had a related patch set uploaded (by Sbailey; author: Sbailey):

[mediawiki/vendor@master] Bump Parsoid to 0.15.0-a11

https://gerrit.wikimedia.org/r/742570

Change 742570 merged by jenkins-bot:

[mediawiki/vendor@master] Bump Parsoid to 0.15.0-a11

https://gerrit.wikimedia.org/r/742570

Change 737825 abandoned by Subramanya Sastry:

[mediawiki/services/parsoid@master] Strip annotations from template arguments

Reason:

We can revisit this at a later time. Abandoning for now.

https://gerrit.wikimedia.org/r/737825