MathML tags are missing xmlns attribute
Closed, Resolved · Public

Description

Examples: enwiki:Light, enwiki:MathML. The JS output carries the xmlns attribute on <math>; the PHP output does not:

----- JS:[34267, 34347] -----
<math xmlns="http://www.w3.org/1998/Math/MathML" alttext="{\displaystyle x+5}">

+++++ PHP:[34181, 34218] +++++
<math alttext="{\displaystyle x+5}">

Event Timeline

ssastry triaged this task as Medium priority. Oct 11 2019, 6:35 PM
ssastry created this task.
Restricted Application added a subscriber: Aklapper. Oct 11 2019, 6:35 PM

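RemexHTML seems to be dropping the xmlns attribute. In the output below (from a log statement I added to ExtensionHandler.php), the HTML string returned by the extension still has xmlns on <math>, but the body of the parsed document does not: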

[subbu@earth:~/work/wmf/parsoid] echo '<math>1</math>' | php bin/parse.php --body_only --dump extoutput
HTML STRING: <span class="mwe-math-element"><span class="mwe-math-mathml-inline mwe-math-mathml-a11y" style="display: none;"><math xmlns="http://www.w3.org/1998/Math/MathML"  alttext="{\displaystyle 1}">
  <semantics>
    <mrow class="MJX-TeXAtom-ORD">
      <mstyle displaystyle="true" scriptlevel="0">
        <mn>1</mn>
      </mstyle>
    </mrow>
    <annotation encoding="application/x-tex">{\displaystyle 1}</annotation>
  </semantics>
</math></span><img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/92d98b82a3778f043108d4e20960a9193df57cbf" class="mwe-math-fallback-image-inline" aria-hidden="true" style="vertical-align: -0.338ex; width:1.162ex; height:2.176ex;" alt="1"/></span>
[dump/extoutput] ================================================================================
[dump/extoutput] EXTENSION INPUT: <math>1</math>
[dump/extoutput] ================================================================================
[dump/extoutput] EXTENSION OUTPUT (outerHTML of body of parsed html doc): 
[dump/extoutput] <body><span class="mwe-math-element"><span class="mwe-math-mathml-inline mwe-math-mathml-a11y" style="display: none;"><math alttext="{\displaystyle 1}">   <semantics>     <mrow class="MJX-TeXAtom-ORD">       <mstyle displaystyle="true" scriptlevel="0">         <mn>1</mn>       </mstyle>     </mrow>     <annotation encoding="application/x-tex">{\displaystyle 1}</annotation>   </semantics> </math></span><img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/92d98b82a3778f043108d4e20960a9193df57cbf" class="mwe-math-fallback-image-inline" aria-hidden="true" style="vertical-align: -0.338ex; width:1.162ex; height:2.176ex;" alt="1"/></span></body>
[dump/extoutput] --------------------------------------------------------------------------------
ssastry assigned this task to cscott. Oct 15 2019, 10:39 PM
ssastry raised the priority of this task from Medium to High. Nov 5 2019, 4:53 PM
cscott added a comment. Nov 5 2019, 7:26 PM

I bet the culprit is the workaround for T217708 (Remex commit 33de7ba9746fce0aaaeb9314a7a78460f2a28122), although that was *supposed* to affect only HTML elements, not xmlns="http://www.w3.org/1998/Math/MathML".

cscott added a comment. Nov 5 2019, 7:54 PM

Hm. Apparently the PHP DOM just ignores setAttribute and setAttributeNS when the name is xmlns. This is regardless of whether the element is created with createElement or createElementNS. I haven't figured out a workaround yet.

Grumble grumble PHP DOM grumble grumble.

Change 548913 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/libs/RemexHtml@master] WIP: Hack Remex to work around PHP DOM "special case" of xmlns attributes

https://gerrit.wikimedia.org/r/548913

Change 548937 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Workaround for missing xmlns attributes on DOMElement

https://gerrit.wikimedia.org/r/548937
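
For readers who want a concrete picture of what a serialization-time workaround could look like, here is a minimal sketch. It is not the code from the change above, which is not reproduced in this task. Every name in it (serializeStartTag, escapeAttr, the plain-object element shape) is hypothetical. The assumption is that the DOM refuses to store xmlns as an ordinary attribute but still reports the correct namespaceURI, so the serializer can synthesize the attribute when an element enters a non-HTML namespace. The synthesized attribute is emitted first so the result lines up with the JS output in the description.

```javascript
// Hypothetical sketch, not Parsoid or RemexHtml code.
const HTML_NS = 'http://www.w3.org/1999/xhtml';
const MATHML_NS = 'http://www.w3.org/1998/Math/MathML';

// Escape a value for use inside a double-quoted attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;');
}

// Serialize a start tag from a plain description of an element:
//   { localName, namespaceURI, attrs: [[name, value], ...] }
// parentNamespaceURI is the namespace of the enclosing element.
// When the element enters a non-HTML namespace and has no xmlns
// attribute of its own, a synthesized xmlns is emitted first.
function serializeStartTag(el, parentNamespaceURI = HTML_NS) {
  const attrs = el.attrs.slice();
  const hasXmlns = attrs.some(([name]) => name === 'xmlns');
  const entersForeignNs =
    Boolean(el.namespaceURI) &&
    el.namespaceURI !== HTML_NS &&
    el.namespaceURI !== parentNamespaceURI;
  if (!hasXmlns && entersForeignNs) {
    attrs.unshift(['xmlns', el.namespaceURI]);
  }
  const rendered = attrs
    .map(([name, value]) => ` ${name}="${escapeAttr(value)}"`)
    .join('');
  return `<${el.localName}${rendered}>`;
}

// Usage: the <math> element from the description, with xmlns missing.
const math = {
  localName: 'math',
  namespaceURI: MATHML_NS,
  attrs: [['alttext', '{\\displaystyle x+5}']],
};
console.log(serializeStartTag(math));
// prints: <math xmlns="http://www.w3.org/1998/Math/MathML" alttext="{\displaystyle x+5}">
```

Descendants such as <mn> share their parent's namespace, so passing MATHML_NS as parentNamespaceURI leaves them without a redundant xmlns, matching the extension output logged earlier in this task.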

Change 548913 abandoned by C. Scott Ananian:
WIP: Hack Remex to work around PHP DOM "special case" of xmlns attributes

Reason:
Abandoned in favor of I7ef44ad9c1996749e9cf4306ac4d89cc5b8cc6e0

https://gerrit.wikimedia.org/r/548913

Change 548937 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Workaround for missing xmlns attributes on DOMElement

https://gerrit.wikimedia.org/r/548937

Change 549132 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Remove T235295 normalization from html diffing script

https://gerrit.wikimedia.org/r/549132

ssastry closed this task as Resolved. Nov 6 2019, 9:02 PM

Change 549132 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Remove T235295 normalization from html diffing script

https://gerrit.wikimedia.org/r/549132

Change 549619 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Put the 'fake' xmlns attribute first, in an attempt to better match JS

https://gerrit.wikimedia.org/r/549619

Change 549655 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Reorder attributes in Parsoid/JS to put xmlns first

https://gerrit.wikimedia.org/r/549655

Change 549619 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Put the 'fake' xmlns attribute first, in an attempt to better match JS

https://gerrit.wikimedia.org/r/549619