Error: Cannot serialize transclusion without data-mw.parts or data-parsoid.src.
Closed, ResolvedPublic

Description

Log:

Error: Cannot serialize transclusion without data-mw.parts or data-parsoid.src.
    at Object.handle (/srv/deployment/parsoid/deploy/src/lib/html2wt/DOMHandlers.js:1345:13)
    at /srv/deployment/parsoid/deploy/node_modules/prfun/lib/index.js:532:26
    at tryCatch2 (/srv/deployment/parsoid/deploy/node_modules/babybird/lib/promise.js:48:12)
    at PrFunPromise.Promise (/srv/deployment/parsoid/deploy/node_modules/babybird/lib/promise.js:458:15)
    at new PrFunPromise (/srv/deployment/parsoid/deploy/node_modules/prfun/lib/index.js:57:21)
    at /srv/deployment/parsoid/deploy/node_modules/prfun/lib/index.js:530:18
    at tryCatch1 (/srv/deployment/parsoid/deploy/node_modules/babybird/lib/promise.js:40:12)
    at promiseReactionJob (/srv/deployment/parsoid/deploy/node_modules/babybird/lib/promise.js:269:19)
    at PromiseReactionJobTask.call (/srv/deployment/parsoid/deploy/node_modules/babybird/lib/promise.js:284:3)
    at flush (/srv/deployment/parsoid/deploy/node_modules/babybird/node_modules/asap/raw.js:50:29)
    at process._tickCallback (node.js:415:13)

Also at: https://logstash.wikimedia.org/#/dashboard/temp/AVPSWxKHO3D718AOb5xH

We need to check whether Content Translation is causing this, as a user reported the issue at https://www.mediawiki.org/w/index.php?title=Topic:T0vj02va30kdelc2&topic_showPostId=t19ztek29e9f0aw4#flow-post-t19ztek29e9f0aw4

Restricted Application added a subscriber: Aklapper. Apr 1 2016, 3:17 PM
KartikMistry updated the task description. Apr 1 2016, 3:20 PM

I am fairly sure this is a CX issue.

There are also warnings which I think are related to this:
Parsoid id found on element without a matching data-parsoid entry: ID=mwBjY; ELT=<abbr class="abbr" id="mwBjY" title="page(s)">p.</abbr>

Parsoid id found on element without a matching data-parsoid entry: ID=mwAp4; ELT=<time class="nowrap date-lien" datetime="1828-02-05" id="mwAp4" contenteditable="false"><a class="cx-link cx-target-link" data-linkid="1492" href="5 de febrero" rel="mw:WikiLink" title="5 de febrero">5</a></time>

Arlolra moved this task from Backlog to Non-Parsoid Tasks on the Parsoid board. Apr 12 2016, 10:59 PM
Amire80 moved this task from Upcoming to CX9 on the ContentTranslation board. Apr 20 2016, 1:22 PM
Amire80 triaged this task as High priority. Apr 20 2016, 1:25 PM

106 instances in the last 7 days, all from eswiki.

Nikerabbit moved this task from Backlog to In Progress on the Language-Q4-2016-Sprint 4 board.

As per my analysis, the root cause is as follows. While adapting templates, the data-mw attribute of the HTML elements needs to be updated. Before MT, we remove data-mw to reduce the amount of data we send to the MT engines. After MT, these HTML nodes still have typeof="mw:Transclusion" but no data-mw attribute. The template adaptation tool adds a new data-mw after adapting the template. If template adaptation fails for any reason, the typeof attribute remains while data-mw does not exist, resulting in the above error from Parsoid.

We are fixing one reason for this template adaptation failure at https://gerrit.wikimedia.org/r/292528
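
The failure mode described above can be illustrated with a minimal sketch. It models nodes as plain objects rather than real DOM elements, and the helper names (stripDataMw, isOrphanTransclusion) are hypothetical, not part of ContentTranslation or Parsoid. It also ignores data-parsoid.src, which the Parsoid error message mentions as an alternative to data-mw.parts.

```javascript
// Hypothetical model: a node is a plain object with an `attrs` map.
// None of these helper names come from ContentTranslation or Parsoid.

// Mimic the pre-MT step that drops data-mw to shrink the payload.
function stripDataMw(node) {
  const { 'data-mw': _removed, ...rest } = node.attrs;
  return { attrs: rest };
}

// True when a node claims to be a transclusion but carries no data-mw.
function isOrphanTransclusion(node) {
  const types = (node.attrs.typeof || '').split(/\s+/);
  return types.includes('mw:Transclusion') && !('data-mw' in node.attrs);
}

const source = [
  { attrs: { id: 'mwGg', typeof: 'mw:Transclusion', 'data-mw': '{"parts":[]}' } },
  { attrs: { id: 'mwGw' } },
];

// After MT, adaptation is assumed to have failed, so data-mw is never restored.
const afterMt = source.map(stripDataMw);
const orphans = afterMt.filter(isOrphanTransclusion).map((n) => n.attrs.id);
console.log(orphans);
```

A check like this could in principle run just before publishing to flag nodes that Parsoid would refuse to serialize, though whether that is the right place for it is a design question for the CX code.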

I found another cause for this:

When the article and translation are very big, restoring a draft takes a while to finish all template adaptations. The interface tells you that the translation has been restored, and you can press the publish button. But the restore is not really done: template adaptation is still waiting for API responses. At this point, the data-mw parts will not be ready on the elements, which can result in the above error. When I got this error, I waited a few more minutes and published again, and it worked.
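
One way to picture a guard against this race is a small tracker that only allows publishing once every template adaptation started during restore has finished. This is a sketch under assumptions: AdaptationTracker and its methods are hypothetical names, not existing CX code, and the real restore flow may be structured quite differently.

```javascript
// Hypothetical sketch: tracks template adaptations that are still waiting
// for an API response. Not part of the actual ContentTranslation code.
class AdaptationTracker {
  constructor() {
    this.pending = new Set();
  }

  // Call when an adaptation request for a template is sent.
  start(templateId) {
    this.pending.add(templateId);
  }

  // Call from the API callback, whether the adaptation succeeded or failed.
  finish(templateId) {
    this.pending.delete(templateId);
  }

  // The publish button would stay disabled while this returns false.
  canPublish() {
    return this.pending.size === 0;
  }
}

const tracker = new AdaptationTracker();
tracker.start('mwt12');
tracker.start('mwt13');
const beforeResponses = tracker.canPublish();

tracker.finish('mwt12');
tracker.finish('mwt13');
const afterResponses = tracker.canPublish();

console.log(beforeResponses, afterResponses);
```

Note that this only addresses the timing problem. An adaptation that finishes with a failure would still need the typeof attribute handled separately.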

santhosh added a comment. Edited Jun 28 2016, 5:09 AM

I found a case while analysing Parsoid error logs:
English to French translation of "Norbury, Derbyshire"

<p data-cx-weight="79" data-cx-mt-provider="Yandex" data-cx-state="mt" data-source="mwGQ" data-seqid="166" id="cxmwGQ">
<span class="cx-segment" data-segmentid="167">Le cimetière contient les tombes de Thias et Lisbeth Bède.
  <link about="#mwt12" href="./Category:All_articles_with_unsourced_statements" id="mwGg" rel="mw:PageProp/Category" typeof="mw:Transclusion">
  <link about="#mwt12" href="./Category:Articles_with_unsourced_statements_from_May_2008" rel="mw:PageProp/Category">
  <sup about="#mwt12" class="noprint Inline-Template Template-Fact" id="mwGw" style="white-space:nowrap;">   
    <span typeof="mw:Entity">[</span><i><a class="cx-link" data-linkid="168" href="//en.wikipedia.org/wiki/Wikipedia:Citation_needed" rel="mw:WikiLink" title="Wikipedia:Citation needed">   <span title="This claim needs references to reliable sources. (May 2008)">citation nécessaire</span></a></i><span typeof="mw:Entity">]</span>
  </sup>
</span>
</p>

In this draft snippet, the first link tag has typeof="mw:Transclusion" without a data-mw attribute.
This is a hidden category transcluded by the "citation needed" template on English Wikipedia.

When I translated the same article, I could not reproduce the missing data-mw; data-mw was present every time.

But, anyway, to address this issue permanently, I am thinking of the following fix:

The template tool (ext.cx.tools.template), which is responsible for template adaptation, must make sure that the typeof attribute is set only after data-mw is set on the template. In master, we currently keep the typeof attribute and then do template adaptation. Template adaptation may or may not set data-mw, but typeof remains. That should not happen.
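
The ordering proposed here can be sketched roughly as follows. This is not the code from ext.cx.tools.template: adaptTemplate and the adapt callback are hypothetical, and nodes are modelled as plain objects so the example runs without a DOM.

```javascript
// Hypothetical sketch of the proposed ordering: typeof is written only
// after data-mw has been set, and removed if adaptation fails.
function adaptTemplate(node, adapt) {
  let dataMw = null;
  try {
    // `adapt` stands in for whatever computes the adapted template data.
    dataMw = adapt(node);
  } catch (err) {
    dataMw = null;
  }

  if (dataMw) {
    node.attrs['data-mw'] = JSON.stringify(dataMw);
    node.attrs.typeof = 'mw:Transclusion';
  } else {
    // Never leave typeof behind without a matching data-mw.
    delete node.attrs.typeof;
    delete node.attrs['data-mw'];
  }
  return node;
}

const ok = adaptTemplate(
  { attrs: { id: 'mwGg', typeof: 'mw:Transclusion' } },
  () => ({ parts: [{ template: { target: { wt: 'Citation needed' }, params: {} } }] })
);

const failed = adaptTemplate(
  { attrs: { id: 'mwGw', typeof: 'mw:Transclusion' } },
  () => { throw new Error('API request failed'); }
);

console.log(ok.attrs, failed.attrs);
```

With this ordering, a failed adaptation leaves a plain element with neither attribute, instead of a typeof="mw:Transclusion" node that has no data-mw.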

santhosh moved this task from In Progress to Backlog on the Language-Q1-2016-17 Sprint 1 board.
santhosh moved this task from Backlog to In Progress on the Language-Q1-2016-17 Sprint 1 board.

The template tool (ext.cx.tools.template), which is responsible for template adaptation, must make sure that the typeof attribute is set only after data-mw is set on the template.

This is done in https://gerrit.wikimedia.org/r/#/c/295898

Amire80 closed this task as Resolved. Sep 7 2016, 12:34 PM
Amire80 moved this task from In Progress to Done on the Language-Q1-2016-17 Sprint 3 board.