
Cite tags are repeated in the MT output
Closed, Resolved · Public

Description

Repeated references in machine translation output

Input:


<cite about="#mwt9" class="citation web" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;cite web &quot;,&quot;href&quot;:&quot;./Template:Cite_web&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;http://www.filmreference.com/Actors-and-Actresses-Str-Us/Thurman-Uma.html&quot;},&quot;title&quot;:{&quot;wt&quot;:&quot;Thurman, Uma&quot;},&quot;publisher&quot;:{&quot;wt&quot;:&quot;FilmReference.com&quot;},&quot;accessdate&quot;:{&quot;wt&quot;:&quot;April 22, 2014&quot;}},&quot;i&quot;:0}}]}"
    id="mwBWE" typeof="mw:Transclusion" data-ve-no-generated-contents="true">
    <a class="external text" href="http://www.filmreference.com/Actors-and-Actresses-Str-Us/Thurman-Uma.html" id="mwBWI" rel="mw:ExtLink">"Thurman, Uma"</a>. FilmReference.com
    <span class="reference-accessdate" id="mwBWM">. Retrieved
        <span class="nowrap" id="mwBWQ">April 22,</span> 2014</span>.</cite>
<span about="#mwt9" class="Z3988" id="mwBWU" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&amp;rft.genre=unknown&amp;rft.btitle=Thurman%2C+Uma&amp;rft.pub=FilmReference.com&amp;rft_id=http%3A%2F%2Fwww.filmreference.com%2FActors-and-Actresses-Str-Us%2FThurman-Uma.html&amp;rfr_id=info%3Asid%2Fen.wikipedia.org%3AUma+Thurman"
    data-ve-ignore="true">
    <span id="mwBWY" style="display:none;">
        <span id="mwBWc" typeof="mw:Entity">&nbsp;</span>
    </span>
</span>

Output:


"
<cite typeof="mw:Transclusion" id="mwBWE" data-ve-no-generated-contents="true" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;cite web &quot;,&quot;href&quot;:&quot;./Template:Cite_web&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;http://www.filmreference.com/Actors-and-Actresses-Str-Us/Thurman-Uma.html&quot;},&quot;title&quot;:{&quot;wt&quot;:&quot;Thurman, Uma&quot;},&quot;publisher&quot;:{&quot;wt&quot;:&quot;FilmReference.com&quot;},&quot;accessdate&quot;:{&quot;wt&quot;:&quot;April 22, 2014&quot;}},&quot;i&quot;:0}}]}"
    class="citation web" about="#mwt9">
    <a rel="mw:ExtLink" id="mwBWI" href="http://www.filmreference.com/Actors-and-Actresses-Str-Us/Thurman-Uma.html" class="external text">Thurman</a>
</cite>, Uma". FilmReference.com.
<cite typeof="mw:Transclusion" id="mwBWE" data-ve-no-generated-contents="true" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;cite web &quot;,&quot;href&quot;:&quot;./Template:Cite_web&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;http://www.filmreference.com/Actors-and-Actresses-Str-Us/Thurman-Uma.html&quot;},&quot;title&quot;:{&quot;wt&quot;:&quot;Thurman, Uma&quot;},&quot;publisher&quot;:{&quot;wt&quot;:&quot;FilmReference.com&quot;},&quot;accessdate&quot;:{&quot;wt&quot;:&quot;April 22, 2014&quot;}},&quot;i&quot;:0}}]}"
    class="citation web" about="#mwt9">
    <span id="mwBWM" class="reference-accessdate">
        <span id="mwBWQ" class="nowrap">April</span>
    </span>
</cite> recuperada 22,
<cite typeof="mw:Transclusion" id="mwBWE" data-ve-no-generated-contents="true" data-mw="{&quot;parts&quot;:[{&quot;template&quot;:{&quot;target&quot;:{&quot;wt&quot;:&quot;cite web &quot;,&quot;href&quot;:&quot;./Template:Cite_web&quot;},&quot;params&quot;:{&quot;url&quot;:{&quot;wt&quot;:&quot;http://www.filmreference.com/Actors-and-Actresses-Str-Us/Thurman-Uma.html&quot;},&quot;title&quot;:{&quot;wt&quot;:&quot;Thurman, Uma&quot;},&quot;publisher&quot;:{&quot;wt&quot;:&quot;FilmReference.com&quot;},&quot;accessdate&quot;:{&quot;wt&quot;:&quot;April 22, 2014&quot;}},&quot;i&quot;:0}}]}"
    class="citation web" about="#mwt9">
    <span id="mwBWM" class="reference-accessdate">2014.</span>
</cite>&nbsp;

from en to ca, using Apertium
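The repetition can be confirmed mechanically: in well-formed HTML each `id` should occur once, but the output above carries `id="mwBWE"` on three separate `<cite>` elements. Below is a minimal sketch of such a check, assuming a plain regex scan is good enough for a quick diagnosis. The helper names `countIds` and `findDuplicateIds` are hypothetical and not part of cxserver, and the sample string is an abbreviated copy of the output with `data-mw` and other attributes omitted.

```javascript
// Count how often each id attribute value occurs in an HTML string.
// Regex-based on purpose: enough for a quick check, not a real HTML parser.
function countIds(html) {
  const counts = new Map();
  // Require leading whitespace so attributes such as data-id="..." are skipped.
  const idAttr = /\sid="([^"]*)"/g;
  let match;
  while ((match = idAttr.exec(html)) !== null) {
    counts.set(match[1], (counts.get(match[1]) || 0) + 1);
  }
  return counts;
}

// Return the ids that occur more than once, in order of first appearance.
function findDuplicateIds(html) {
  return [...countIds(html)]
    .filter(([, count]) => count > 1)
    .map(([id, count]) => ({ id, count }));
}

// Abbreviated version of the MT output (attributes not relevant to the check omitted).
const mtOutput = [
  '<cite typeof="mw:Transclusion" id="mwBWE" about="#mwt9"><a id="mwBWI">Thurman</a></cite>, Uma". FilmReference.com.',
  '<cite typeof="mw:Transclusion" id="mwBWE" about="#mwt9"><span id="mwBWM"><span id="mwBWQ">April</span></span></cite> recuperada 22,',
  '<cite typeof="mw:Transclusion" id="mwBWE" about="#mwt9"><span id="mwBWM">2014.</span></cite>'
].join("\n");

console.log(findDuplicateIds(mtOutput));
```

On the abbreviated sample this reports `mwBWE` three times and `mwBWM` twice, which matches the split `<cite>` and `reference-accessdate` elements in the output.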

Event Timeline

santhosh created this task. Jul 30 2018, 7:37 AM
Restricted Application added a subscriber: Aklapper. Jul 30 2018, 7:37 AM

Change 449167 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/services/cxserver@master] WIP: Support adapting Cite web,journal,book.. templates in references

https://gerrit.wikimedia.org/r/449167

Pginer-WMF triaged this task as Normal priority. Jul 31 2018, 7:04 AM

Change 449167 merged by jenkins-bot:
[mediawiki/services/cxserver@master] Support adapting Cite web, journal, book.. templates in references

https://gerrit.wikimedia.org/r/449167

Petar.petkovic removed a project: Patch-For-Review.
Petar.petkovic removed a subscriber: gerritbot.

Mentioned in SAL (#wikimedia-operations) [2018-08-08T06:17:16Z] <kartik@deploy1001> Finished deploy [cxserver/deploy@6a0cab1]: Update cxserver to 951fdba (T199308, T199512, T199320, T200665, T200453, T106437) (duration: 03m 32s)

Attempting to translate a references section consistently produces 500 (Internal Server Error), e.g.

POST http://cxserver.wmflabs.org/v2/translate/en/es/ 500 (Internal Server Error)

{status: 500, type: "internal_error", title: "PayloadTooLargeError",…}
detail: "request entity too large"
method: "POST"
status: 500
title: "PayloadTooLargeError"
type: "internal_error"
uri: "/v2/translate/en/ca/Apertium"
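The `PayloadTooLargeError` title with the detail "request entity too large" looks like what an Express-style body parser raises when a request body exceeds its configured size limit, although this task does not state the actual limit. A minimal client-side sketch for checking the size of a section before posting it follows. `ASSUMED_LIMIT_BYTES`, `payloadBytes` and `exceedsLimit` are hypothetical names, and the limit value is a placeholder for illustration only.

```javascript
// Hypothetical limit for illustration; the real server-side limit is not given in this task.
const ASSUMED_LIMIT_BYTES = 50 * 1024;

// Size of the body in UTF-8 bytes. This can be larger than the string length,
// because non-ASCII characters take more than one byte.
function payloadBytes(html) {
  return Buffer.byteLength(html, "utf8");
}

// True when the body would be larger than the given limit.
function exceedsLimit(html, limitBytes = ASSUMED_LIMIT_BYTES) {
  return payloadBytes(html) > limitBytes;
}

// A short paragraph versus a long references-style list item.
const smallSection = "<p>Uma Thurman</p>";
const largeSection = "<li>" + "x".repeat(60 * 1024) + "</li>";

console.log(payloadBytes(smallSection), exceedsLimit(smallSection));
console.log(payloadBytes(largeSection), exceedsLimit(largeSection));
```

References sections tend to be large because every citation carries its full `data-mw` attribute, so a check like this may help tell whether a failing request is simply oversized.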
Petar.petkovic added a subscriber: Petar.petkovic.

> Attempting to translate a references section consistently produces 500 (Internal Server Error), e.g.

Already reported at T202283.

Etonkovidova added a comment. Edited Aug 25 2018, 12:20 AM

The error is different from T202283.
It appears if you click to translate References without translating any of the other content, e.g. click to translate References in 'Uma Thurman', en->es.

TitleError: title-invalid-characters
    at _checkLegalTitleCharacters (/cxserver/node_modules/mediawiki-title/lib/index.js:253:15)
    at Function.Title.newFromText (/cxserver/node_modules/mediawiki-title/lib/index.js:402:5)
    at TitlePairRequest.<anonymous> (/cxserver/lib/mw/ApiRequest.js:264:19)
    at next (native)
    at resume (/cxserver/lib/util.js:277:21)
    at resumeNext (/cxserver/lib/util.js:287:26)
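The trace shows `Title.newFromText` rejecting a string during a title-pair request. As a rough illustration of what a `title-invalid-characters` check guards against, here is a simplified stand-in. It is not the `mediawiki-title` implementation, the character set is an assumption covering only a few characters MediaWiki commonly rejects in titles, and `hasIllegalTitleChars` and `filterValidTitles` are hypothetical helpers.

```javascript
// Simplified stand-in for a title legality check; NOT the mediawiki-title rule set.
// Assumed set: < > [ ] | { } (the real check covers more cases).
const ILLEGAL_TITLE_CHARS = /[<>\[\]|{}]/;

// True when the candidate title contains one of the assumed illegal characters.
function hasIllegalTitleChars(title) {
  return ILLEGAL_TITLE_CHARS.test(title);
}

// Keep only candidates that pass the simplified check, so that one bad
// string does not abort a whole batch of title lookups.
function filterValidTitles(titles) {
  return titles.filter((title) => !hasIllegalTitleChars(title));
}

const candidates = ["Uma Thurman", "Template:Cite_web", "<cite>Thurman</cite>", "Foo|Bar"];
console.log(filterValidTitles(candidates));
```

Whether the failing request actually passed markup or template syntax as a title is not established in this task; the sketch only shows the kind of input such a check would reject.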

Since the fix concerns the References section, I'm waiting until References translation itself is fixed.

> The error is different from T202283

TitleError: title-invalid-characters is reported at T189438 and PayloadTooLargeError is reported at T202283.

Etonkovidova closed this task as Resolved. Sep 3 2018, 6:26 PM
Petar.petkovic removed a subscriber: Stashbot.