
CX2: "citation needed" template adapted with unnecessary HTML markup and VE attributes
Open, MediumPublic

Description

When transferring the "citation needed" template, the published result has several problems:

  • Translated template is added but with untranslated date (see T196990)
  • HTML syntax is kept in addition to the template
  • VE attributes are kept in the result

For example, when translating Snow Leopard award from English to French (original translation of "Prix Léopard des Neiges") the "citation needed" template results in the following content (the markup after the template call is the unexpected part):

{{Référence nécessaire|date=October 2015}}<sup class="noprint Inline-Template Template-Fact" data-ve-ignore="true" style="white-space:nowrap;">&#x5B; ''[[Aide:Référence nécessaire|<span title="This claim needs references to reliable sources. (October 2015)">citation nécessaire</span>]]'' &#x5D;</sup>

Rendering in the following way:

Screenshot 2019-01-09 at 13.06.01.png (49×707 px, 20 KB)


Support for translating dates automatically (note that in the example "October" is just copied over) is covered in T196990: CX2: Suport the adaptation of dates in templates

Event Timeline

Pginer-WMF renamed this task from CX2: Handles poorly "citation needed" to CX2: "citation needed" template adapted with unnecessary HTML markup and VE attributes. Jan 9 2019, 12:08 PM
Pginer-WMF triaged this task as Medium priority.
Pginer-WMF updated the task description.

Thanks for the detailed report and specific example, @NicoV. I updated the description to focus the ticket on the issues of the unnecessary HTML tags and VE parameters, leaving the date translation aside since it is covered by a separate ticket already (T196990).

Machine translation of that section from Google is

<section
    class="ve-ce-branchNode ve-ce-activeNode ve-ce-sectionNode ve-ce-cxLintableNode ve-ce-cxSectionNode ve-ce-activeNode-active mw-cx-lintIssue-warning"
    contenteditable="true" spellcheck="true" id="cxTargetSection4" rel="cx:Section"
    style="margin-top: 0px; height: 116px;">
    <p id="mwIA" class="ve-ce-branchNode ve-ce-contentBranchNode ve-ce-paragraphNode">
        <span class="cx-segment ve-ce-annotation ve-ce-cxSentenceSegmentAnnotation" data-segmentid="24">En ordre de
            difficulté, le pic Pobeda est de loin le plus difficile et le plus dangereux, suivi de Khan Tengri,, du pic
            Ismail Samani, du pic Korzhenevskaya et du pic Lénine (Ibn Sina).
            <sup class="need_ref_tag ve-ce-leafNode ve-ce-mwTransclusionNode ve-ce-focusableNode ve-ce-focusableNode-focused"
                style="padding-left:2px;" about="#mwt1" typeof="mw:Transclusion"
                data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;Référence nécessaire&quot;,&quot;href&quot;:&quot;./Modèle:Référence_nécessaire&quot;},&quot;params&quot;:{&quot;date&quot;:{&quot;wt&quot;:&quot;October 2015&quot;}},&quot;i&quot;:0}}]}"
                id="mwAg" contenteditable="false">
                <a rel="mw:WikiLink" href="https://wiki.thottingal.in/wiki/Aide:R%C3%A9f%C3%A9rence_n%C3%A9cessaire"
                    title="Aide:Référence nécessaire">[réf.<span typeof="mw:Entity">&nbsp;</span>nécessaire]</a>
            </sup>
            <link rel="mw:PageProp/Category"
                href="https://wiki.thottingal.in/wiki/Cat%C3%A9gorie:Article_%C3%A0_r%C3%A9f%C3%A9rence_n%C3%A9cessaire"
                about="#mwt1"
                class="ve-ce-leafNode ve-ce-mwTransclusionNode ve-ce-focusableNode ve-ce-focusableNode-focused"
                contenteditable="false">
            <span typeof="mw:Nowiki" about="#mwt1"
                class="ve-ce-leafNode ve-ce-mwTransclusionNode ve-ce-focusableNode ve-ce-focusableNode-focused"
                contenteditable="false"></span>
        </span>
    </p>
</section>
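
For readers unfamiliar with the data-mw attribute in the markup above: its value is HTML-escaped JSON describing the template call. A minimal sketch of decoding it, assuming the attribute value is copied verbatim from the sup element; template_summary is a hypothetical helper name, not part of CX or VE:

```python
import html
import json

# The data-mw attribute value from the <sup> above, still HTML-escaped
# exactly as it appears inside the attribute.
raw = (
    "{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:"
    "{&quot;wt&quot;:&quot;Référence nécessaire&quot;,"
    "&quot;href&quot;:&quot;./Modèle:Référence_nécessaire&quot;},"
    "&quot;params&quot;:{&quot;date&quot;:{&quot;wt&quot;:&quot;October 2015&quot;}},"
    "&quot;i&quot;:0}}]}"
)


def template_summary(data_mw_attr):
    """Return (template name, {param: wikitext}) for the first template part."""
    data = json.loads(html.unescape(data_mw_attr))
    template = data["parts"][0]["template"]
    params = {name: value["wt"] for name, value in template["params"].items()}
    return template["target"]["wt"], params


name, params = template_summary(raw)
print(name, params)
```

The decoded value shows that the template target was adapted to the French template while the date parameter is still the untranslated "October 2015".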

When VE converts this to HTML for publishing or saving to corpora, we get the following content. This is the output of ve.dm.converter.getDomFromNode( sectionModel ):

<section rel="cx:Section" id="cxTargetSection4" data-mw-cx-source="Google">
    <p id="mwIA">
        <span data-segmentid="24" class="cx-segment">En ordre de difficulté, le pic Pobeda est de loin le plus difficile
            et le plus dangereux, suivi de Khan Tengri,, du pic Ismail Samani, du pic Korzhenevskaya et du pic Lénine
            (Ibn Sina).
            <span about="#mwt12"
                data-cx="[{&quot;adapted&quot;:true,&quot;partial&quot;:false,&quot;targetExists&quot;:true}]"
                data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;Référence nécessaire&quot;,&quot;href&quot;:&quot;./Modèle:Référence nécessaire&quot;},&quot;params&quot;:{&quot;date&quot;:{&quot;wt&quot;:&quot;October 2015&quot;}},&quot;i&quot;:0}}]}"
                data-ve-no-generated-contents="true" href="./Category:All_articles_with_unsourced_statements" id="mwIQ"
                rel="mw:PageProp/Category" typeof="mw:Transclusion">&nbsp;</span>
        </span>
    </p>

    <span class="cx-segment" data-segmentid="24"><sup about="#mwt12" class="noprint Inline-Template Template-Fact"
            data-ve-ignore="true" id="mwIg" style="white-space:nowrap;">
            <span typeof="mw:Entity">[</span>
            <i><a class="cx-link"
                    data-cx="{&quot;adapted&quot;:true,&quot;targetTitle&quot;:{&quot;title&quot;:&quot;Aide:Référence nécessaire&quot;,&quot;pagelanguage&quot;:&quot;fr&quot;},&quot;sourceTitle&quot;:{&quot;title&quot;:&quot;Wikipedia:Citation needed&quot;,&quot;pagelanguage&quot;:&quot;en&quot;,&quot;description&quot;:&quot;when a citation is needed to reference the specific statement&quot;}}"
                    data-linkid="25" href="Aide:Référence nécessaire" rel="mw:WikiLink"
                    title="Aide:Référence nécessaire">
                    <span title="This claim needs references to reliable sources. (October 2015)">citation
                        nécessaire</span>
                </a>
            </i>
            <span typeof="mw:Entity">]</span>
        </sup>
    </span>
</section>

Obviously the output from ve.dm.converter.getDomFromNode is problematic since it duplicates the <span data-segmentid="24"> element and places the sup tag inside the second copy, outside the paragraph.
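
One way to spot this kind of output mechanically is to check whether elements sharing an about value end up under different parents. The following is only a diagnostic sketch: the markup is a heavily simplified, hand-written stand-in for the converter output above, and split_about_groups is a hypothetical helper, not a VE or CX function.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Simplified stand-in for the converter output: same nesting, most
# attributes and all text content dropped.
SIMPLIFIED = """
<section>
  <p>
    <span class="cx-segment" data-segmentid="24">
      <span about="#mwt12" typeof="mw:Transclusion"/>
    </span>
  </p>
  <span class="cx-segment" data-segmentid="24">
    <sup about="#mwt12"/>
  </span>
</section>
"""


def split_about_groups(root):
    """Map each about id to True if its elements sit under different parents."""
    parents = defaultdict(set)
    for parent in root.iter():
        for child in parent:
            about = child.get("about")
            if about is not None:
                # id() is stable here because the tree keeps every element alive.
                parents[about].add(id(parent))
    return {about: len(seen) > 1 for about, seen in parents.items()}


root = ET.fromstring(SIMPLIFIED)
print(split_about_groups(root))
```

For the simplified markup this reports "#mwt12" as split, which matches the duplicated segment span seen in the real output.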

@Esanders, I am aware that VE fetches the template rendering from the target wiki and that is the reason for the "Inline-Template Template-Fact" elements. But it seems the DOM tree has been affected by it, causing the sentence segment span duplication. Is this a glitch in VE or should CX do something to prevent this kind of scenario?

Template renderings can consist of multiple sibling nodes which Parsoid groups together using the about attribute, for example:

<span typeof="mw:Transclusion" about="#mwt5">Foo</span><span about="#mwt5">bar</span>

is treated as one template item by VE. It appears that this is what has happened here (there are two elements with about="#mwt12") but the siblings have been separated, and about-grouping only works with immediate siblings. I've not seen this bug in plain VE, so I assume this is something that CX is doing.
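
To illustrate the immediate-sibling rule, here is a small sketch that groups consecutive children by their about value. It is not the Parsoid or VE implementation; about_groups is a hypothetical helper, it only looks at element children (text nodes between siblings are ignored), and the second input is an invented case where another element sits between the two spans.

```python
import xml.etree.ElementTree as ET
from itertools import groupby


def about_groups(parent):
    """Group consecutive children of `parent` that share the same about value."""
    groups = []
    for about, run in groupby(parent, key=lambda el: el.get("about")):
        run = list(run)
        if about is None:
            groups.extend([el] for el in run)  # nodes without about stay alone
        else:
            groups.append(run)
    return groups


adjacent = ET.fromstring(
    '<p><span typeof="mw:Transclusion" about="#mwt5">Foo</span>'
    '<span about="#mwt5">bar</span></p>'
)
separated = ET.fromstring(
    '<p><span typeof="mw:Transclusion" about="#mwt5">Foo</span>'
    '<b>x</b><span about="#mwt5">bar</span></p>'
)

print([len(g) for g in about_groups(adjacent)])   # one group of two nodes
print([len(g) for g in about_groups(separated)])  # three single-node groups
```

Once the two spans are no longer adjacent, they stop forming a single group, which is the situation described for about="#mwt12".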

Will reassign when I return to this ticket.

Any ideas why/where the separation of the siblings is happening? The siblings are not separated in the processed output of the section. It looks as if something pushes the sup part outside of the p and then rewraps it inside span.cx-segment.