Machine translation fails for paragraphs with reference with images
Closed, Resolved · Public · BUG REPORT

Description

List of steps to reproduce (step by step, including full links if applicable):

Try to translate https://en.wikipedia.org/wiki/James_Webb_Space_Telescope to another language, for example Igbo (ig), using Google Translate. Click on the first paragraph.

What happens?:

Google MT fails; the network debugger shows this stack trace:

Error: Invalid sourceHtml
    at Google.buildSourceDoc (/srv/service/lib/mt/MTClient.js:457:10)
    at Google.translateReducedHtml (/srv/service/lib/mt/MTClient.js:58:8)
    at Google.translate (/srv/service/lib/mt/Google.js:30:15)
    at MWImage.adapt (/srv/service/lib/translationunits/MWImage.js:78:64)
    at /srv/service/lib/lineardoc/TextBlock.js:458:32
    at Array.forEach (<anonymous>)
    at /srv/service/lib/lineardoc/TextBlock.js:446:9
    at Array.forEach (<anonymous>)
    at TextBlock.adapt (/srv/service/lib/lineardoc/TextBlock.js:443:19)
    at Doc.adapt (/srv/service/lib/lineardoc/Doc.js:774:23)
    at Adapter.adapt (/srv/service/lib/Adapter.js:25:27)
    at MWReference.adapt (/srv/service/lib/translationunits/MWReference.js:80:47)
    at Doc.adapt (/srv/service/lib/lineardoc/Doc.js:735:37)
    at /srv/service/lib/lineardoc/TextBlock.js:466:41
    at Array.forEach (<anonymous>)
    at TextBlock.adapt (/srv/service/lib/lineardoc/TextBlock.js:443:19)

What should have happened instead?:

Machine translation should succeed

Event Timeline

A pattern observed for this error: the paragraph has a reference that is generated by a template, and that reference contains an image.

[Attachment: image.png]
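
The stack trace shows `MWImage.adapt` calling the MT client's `translate`, which then throws `Invalid sourceHtml`. One plausible reading, not confirmed in this task, is that an image generated by a template has no caption, so the adapter passes empty input to MT. The sketch below illustrates that failure mode and a guard against it. `FakeMTClient`, `adaptImageUnguarded`, `adaptImageGuarded`, the object shapes, and the file names are all hypothetical stand-ins, not cxserver code or the patch in change 751676.

```javascript
// Hypothetical stand-in for an MT client; NOT the cxserver implementation.
class FakeMTClient {
  // Assumed guard: empty or non-string input is rejected, which would
  // produce the "Invalid sourceHtml" error seen in the stack trace.
  translate(sourceHtml) {
    if (typeof sourceHtml !== 'string' || sourceHtml.trim() === '') {
      throw new Error('Invalid sourceHtml');
    }
    return `[translated] ${sourceHtml}`;
  }
}

// Unguarded adaptation: always sends the caption to MT,
// even when the image has no caption at all.
function adaptImageUnguarded(image, mt) {
  return { ...image, caption: mt.translate(image.caption) };
}

// Guarded adaptation: skips MT when there is nothing to translate
// and returns a copy of the image unchanged.
function adaptImageGuarded(image, mt) {
  if (typeof image.caption !== 'string' || image.caption.trim() === '') {
    return { ...image };
  }
  return { ...image, caption: mt.translate(image.caption) };
}

const mt = new FakeMTClient();

// Assumption: a template-generated icon inside a reference has no caption.
const templateImage = { src: 'Example-icon.svg', caption: undefined };
// A regular figure with a caption, for comparison.
const figure = { src: 'Example-figure.jpg', caption: 'The telescope' };

try {
  adaptImageUnguarded(templateImage, mt);
} catch (err) {
  console.log('unguarded:', err.message);
}
console.log('guarded:', adaptImageGuarded(templateImage, mt));
console.log('figure:', adaptImageGuarded(figure, mt));
```

Under these assumptions, the unguarded path fails for the captionless image and aborts adaptation of the whole paragraph, while the guarded path leaves that image untouched and still translates regular captions.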

Change 751676 had a related patch set uploaded (by Santhosh; author: Santhosh):

[mediawiki/services/cxserver@master] Support images generated by templates

https://gerrit.wikimedia.org/r/751676

Setting high priority since https://en.wikipedia.org/wiki/Template:Source-attribution and https://en.wikipedia.org/wiki/Template:PD-notice are used very frequently in references on English Wikipedia. If a paragraph has such references, MT fails for all MT engines.

Change 751676 merged by jenkins-bot:

[mediawiki/services/cxserver@master] Support images generated by templates

https://gerrit.wikimedia.org/r/751676

Change 758862 had a related patch set uploaded (by KartikMistry; author: KartikMistry):

[operations/deployment-charts@master] Update cxserver to 2022-02-01-141918-production

https://gerrit.wikimedia.org/r/758862

Change 758862 merged by jenkins-bot:

[operations/deployment-charts@master] Update cxserver to 2022-02-01-141918-production

https://gerrit.wikimedia.org/r/758862

Mentioned in SAL (#wikimedia-operations) [2022-02-01T15:13:34Z] <kart_> Deployed Flores MT for cxserver + Updated cxserver to 2022-01-13-174407-production (T298584, T292412, T292415, T298679, T298752) + Updated cxserver to 2022-02-01-141918-production (T298592)

There is no error in the console, but the references are not properly adapted. They are shown as empty references (rendered in grey) even though they are not empty. This may cause publishing to skip them. The expected result is for these references to be rendered in blue, like regular references.

[Attachments: two screenshots of Special:ContentTranslation on ig.wikipedia.org (from=en, to=ig, page=James Webb Space Telescope)]

This may be a regression of T225716, or a case not covered by it.

I checked the empty-reference rendering described above; it is an unrelated issue: T301952: For references with multiple templates, the calculation of overall adaptation status is wrong