
CX2 shows </img> instead of math formulas
Closed, Resolved · Public

Description

Since approximately this week, ContentTranslation v2 sometimes inserts </img> in the translation instead of the mathematical formulas, as shown in the following image:


(from https://pt.wikipedia.org/wiki/Special:ContentTranslation?title=Special:ContentTranslation&campaign=contributionsmenu&to=pt&page=Tensor+product&from=en&targettitle=Produto+tensorial&version=2)

Event Timeline

He7d3r created this task. Jan 20 2019, 9:56 AM
Restricted Application added a subscriber: Aklapper. Jan 20 2019, 9:56 AM

Using en:Grüneisen_parameter to translate into Serbian, in production:

  • With Google Translate as the default option, translate a math formula. "</img>" is displayed, as in the description.
  • Change to "Copy original content": the section becomes empty.
  • Make "Copy original content" the default option.
  • Translate another math formula. Result: the formula looks fine.

The behavior differs depending on the MT option chosen, and even on the order in which the options are changed. This could be useful for debugging.

I've just managed to reproduce it.

I translated https://en.wikipedia.org/wiki/User:Amire80/math to https://ru.wikipedia.org/wiki/User:Amire80/math . In the CX2 interface, I saw </img>, and the output came out totally garbled.

Google Translate was enabled.

Amire80 added a comment. Edited Jun 17 2019, 4:21 PM

A few more details:

The source math was:

<math>\ddot{a}+\bar{a}</math>

The output was:

<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow class="MJX-TeXAtom-ORD"><mstyle displaystyle="true" scriptlevel="0"><mrow class="MJX-TeXAtom-ORD"><mrow class="MJX-TeXAtom-ORD"><mover><mi> </mi><mo> <math>\ddot{a}+\bar{a}</math> </mo></mover></mrow></mrow><mo> <math>\ddot{a}+\bar{a}</math> </mo><mrow class="MJX-TeXAtom-ORD"><mrow class="MJX-TeXAtom-ORD"><mover><mi> </mi><mo stretchy="false"> <math>\ddot{a}+\bar{a}</math> </mo></mover></mrow></mrow></mstyle></mrow> </math><math>\ddot{a}+\bar{a}</math>  <math>\ddot{a}+\bar{a}</math> </img> <span></span>

You can see that <math>\ddot{a}+\bar{a}</math> appears five times in the output סֿ_Ô
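The count above can be checked mechanically. The following is only a small sketch: `countOccurrences` is a hypothetical helper written for this comment, and `garbledOutput` is the output string quoted above, rebuilt from parts so that the source formula is visible at each position.

```javascript
// Hypothetical helper: counts non-overlapping occurrences of `needle`.
function countOccurrences(haystack, needle) {
  if (needle === '') {
    return 0;
  }
  return haystack.split(needle).length - 1;
}

// The source formula from the comment above.
const sourceMath = '<math>\\ddot{a}+\\bar{a}</math>';

// The garbled output from the comment above, reassembled piece by piece.
const garbledOutput = [
  '<math xmlns="http://www.w3.org/1998/Math/MathML"><mrow class="MJX-TeXAtom-ORD">',
  '<mstyle displaystyle="true" scriptlevel="0"><mrow class="MJX-TeXAtom-ORD">',
  '<mrow class="MJX-TeXAtom-ORD"><mover><mi> </mi><mo> ', sourceMath,
  ' </mo></mover></mrow></mrow><mo> ', sourceMath,
  ' </mo><mrow class="MJX-TeXAtom-ORD"><mrow class="MJX-TeXAtom-ORD">',
  '<mover><mi> </mi><mo stretchy="false"> ', sourceMath,
  ' </mo></mover></mrow></mrow></mstyle></mrow> </math>', sourceMath,
  '  ', sourceMath, ' </img> <span></span>'
].join('');

console.log(countOccurrences(garbledOutput, sourceMath)); // 5
```

Three of the copies sit inside `<mo>` elements of the MathML and two follow the closing `</math>`, which suggests the original markup was re-inserted at several positions rather than once.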

Pginer-WMF triaged this task as Normal priority. Jun 18 2019, 7:07 AM
Pginer-WMF added a subscriber: Pginer-WMF.

Just a reminder that some comparative testing was done by @Barbvd in T137803. The pdf with the testing results (which includes this issue with </img>) is below:

Change 525264 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/services/cxserver@master] WIP: Do not pass Math content to MT engines

https://gerrit.wikimedia.org/r/525264

Change 525264 merged by jenkins-bot:
[mediawiki/services/cxserver@master] Do not pass Math content to MT engines

https://gerrit.wikimedia.org/r/525264
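For readers unfamiliar with the general idea behind a change titled "Do not pass Math content to MT engines", here is a minimal sketch of one common way to do it: swap each formula for an opaque placeholder before translation and restore it afterwards. This is not the cxserver implementation. The function names, the `[[MATH_n]]` token format, the regex-based matching, and the `fakeTranslate` stand-in are all assumptions made for illustration; a real service would more likely work on a parsed DOM.

```javascript
// Hypothetical helper (not cxserver code): replace every <math>...</math>
// element with a placeholder token and remember the originals.
// Limitation: the non-greedy regex does not handle nested <math> elements.
function maskMath(html) {
  const formulas = [];
  const masked = html.replace(/<math\b[^>]*>[\s\S]*?<\/math>/g, (match) => {
    formulas.push(match);
    return `[[MATH_${formulas.length - 1}]]`;
  });
  return { masked, formulas };
}

// Hypothetical helper: put the stored formulas back in place of the tokens.
// Unknown tokens are left untouched.
function unmaskMath(translated, formulas) {
  return translated.replace(/\[\[MATH_(\d+)\]\]/g, (token, index) => {
    const formula = formulas[Number(index)];
    return formula === undefined ? token : formula;
  });
}

// Stand-in for an MT engine: it upper-cases whatever it receives, which
// would mangle TeX such as \ddot{a} if the formula were passed through.
function fakeTranslate(text) {
  return text.toUpperCase();
}

const source = 'The sum <math>\\ddot{a}+\\bar{a}</math> is small.';
const { masked, formulas } = maskMath(source);
const restored = unmaskMath(fakeTranslate(masked), formulas);

console.log(masked);   // The sum [[MATH_0]] is small.
console.log(restored); // THE SUM <math>\ddot{a}+\bar{a}</math> IS SMALL.
```

The sketch relies on the placeholder surviving translation unchanged, which a real MT engine does not guarantee; that is one reason the token format here should be read as illustrative only.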

Jpita added a subscriber: Jpita. Jul 29 2019, 5:57 PM

not sure this is related to this ticket, but I never saw a math formula being translated like this.
it is like this in production as well.
waiting for @santhosh feedback to see if it is related or not.

Jpita added a comment. Jul 29 2019, 6:11 PM

update: this might not be related since the example on my last screenshot is a template and not a math formula

Jpita added a comment. Jul 29 2019, 8:55 PM

@Pginer-WMF is this a different issue?
Is there a ticket already for this?

Change 526311 had a related patch set uploaded (by KartikMistry; owner: KartikMistry):
[operations/deployment-charts@master] Update cxserver to 2019-07-29-154005-production

https://gerrit.wikimedia.org/r/526311

> not sure this is related to this ticket, but I never saw a math formula being translated like this.
> it is like this in production as well.
> waiting for @santhosh feedback to see if it is related or not.

Please include the language pair, title, and MT engine along with screenshots.
In this case I assume an es->ca translation, title: Cálculo tensorial, MT service: Apertium. Correct me if I am wrong.

> update: this might not be related since the example on my last screenshot is a template and not a math formula

You are right. This is not a math formula but https://es.wikipedia.org/wiki/Plantilla:Ecuaci%C3%B3n, which is incorrectly connected to https://ca.wikipedia.org/wiki/Plantilla:Equaci%C3%B3/%C3%BAs, the documentation page for a template on Catalan Wikipedia. I tried a quick fix in Wikidata (https://www.wikidata.org/wiki/Q14339252), but it seems a bit complex, since the Catalan template has its own Wikidata item, https://www.wikidata.org/wiki/Q25740790. It would be better for a Catalan community member to look into this. @Pginer-WMF please note.

> @Pginer-WMF is this a different issue?
> Is there a ticket already for this?

This one is definitely a bug. Happening with Apertium. Checking.

Change 526311 merged by KartikMistry:
[operations/deployment-charts@master] Update cxserver to 2019-07-29-154005-production

https://gerrit.wikimedia.org/r/526311

Jpita added a comment. Jul 30 2019, 3:28 PM

> This one is definitely a bug. Happening with Apertium. Checking.

@santhosh are you going to be fixing this issue on this task or a different one?

>> This one is definitely a bug. Happening with Apertium. Checking.

> @santhosh are you going to be fixing this issue on this task or a different one?

This ticket is sufficient.

Change 526644 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/services/cxserver@master] Support inline maths for plain text MT services

https://gerrit.wikimedia.org/r/526644

Change 527081 had a related patch set uploaded (by Santhosh; owner: Santhosh):
[mediawiki/services/cxserver@master] Do not send math content to HTML MT services

https://gerrit.wikimedia.org/r/527081

Jpita added a comment. Aug 1 2019, 12:42 PM

QA NOTES (dev can ignore):
I found a weird behaviour in prod with https://es.wikipedia.org/w/index.php?title=Especial:Traducci%C3%B3n_de_contenidos&campaign=contributions-page&page=Gr%C3%BCneisen+parameter&from=en&to=es&targettitle=Gr%C3%BCneisen+parameter


MT stops translating after a math formula.
Check this case once this task is in cx2-testing.

Change 526644 merged by jenkins-bot:
[mediawiki/services/cxserver@master] Support inline maths for plain text MT services

https://gerrit.wikimedia.org/r/526644

Change 527081 merged by jenkins-bot:
[mediawiki/services/cxserver@master] Do not send math content to HTML MT services

https://gerrit.wikimedia.org/r/527081

Jpita added a comment. Aug 8 2019, 10:12 PM

MT stops translating "randomly".
Steps:

  1. start translating the article from the description
  2. using Google as the MT engine, start translating sections one by one from the top of the article
  3. as seen in the screenshot, the third section does not get translated. This happens again if you continue translating.

An error appears in the console, although I'm not sure it is related:


mw.cx.TranslationTracker.prototype.isExcludedFromValidation = function (sectionModel) {
    var excludedTypes = ['cxBlockImage', 'mwBlockImage', 'cxTransclusionBlock', 'mwTransclusionBlock', 'mwReferencesList', 'mwMath', 'mwTable', 'list', 'mwHeading'],
        childType = sectionModel.getChildNodeName();
}

Jpita added a comment. Aug 9 2019, 12:47 PM

Created a new ticket for this issue: T230195

Jpita closed this task as Resolved. Aug 12 2019, 5:09 PM