
Too much spacing in <math>\operatorname{arg\,max}</math> with MathJax
Open, High | Public | BUG REPORT

Description

Enable "Client side MathJax rendering" in Special:Preferences → Appearance → Math. Then preview a page with the contents <math>\operatorname{arg\,max}</math>. This should render similarly to <math>\operatorname{argmax}</math>, but with an extra space between "arg" and "max". Unfortunately, unwanted space is inserted between every letter, so it looks like "a r g  m a x" instead of "arg max".

This is because in the generated MathML every letter is put inside its own <mi> element:

<math xmlns="http://www.w3.org/1998/Math/MathML" class="mwe-math-element">
  <mrow data-mjx-texclass="ORD">
    <mstyle displaystyle="true" scriptlevel="0">
      <mrow data-mjx-texclass="ORD">
        <mi data-mjx-texclass="OP" mathvariant="normal">a</mi>
        <mi data-mjx-texclass="OP" mathvariant="normal">r</mi>
        <mi data-mjx-texclass="OP" mathvariant="normal">g</mi>
        <mspace width="0.167em"></mspace>
        <mi data-mjx-texclass="OP" mathvariant="normal">m</mi>
        <mi data-mjx-texclass="OP" mathvariant="normal">a</mi>
        <mi data-mjx-texclass="OP" mathvariant="normal">x</mi>
      </mrow>
    </mstyle>
  </mrow>
</math>

This does not happen with <math>\operatorname{argmax}</math>:

<math xmlns="http://www.w3.org/1998/Math/MathML" class="mwe-math-element">
  <mrow data-mjx-texclass="ORD">
    <mstyle displaystyle="true" scriptlevel="0">
      <mi mathvariant="normal">argmax</mi>
    </mstyle>
  </mrow>
</math>

I should stress that both look good when rendered with Firefox or Chromium. So this could actually be a bug in MathJax.

Rendering

Screenshot 2025-09-18 at 19.39.43.png (430×398 px, 20 KB)

Event Timeline

Physikerwelt subscribed.

I think it would be better anyhow if the output were:

<math xmlns="http://www.w3.org/1998/Math/MathML" class="mwe-math-element">
  <mrow data-mjx-texclass="ORD">
    <mstyle displaystyle="true" scriptlevel="0">
      <mrow data-mjx-texclass="ORD">
        <mi data-mjx-texclass="OP" mathvariant="normal">arg</mi>
        <mspace width="0.167em"></mspace>
        <mi data-mjx-texclass="OP" mathvariant="normal">max</mi>
      </mrow>
    </mstyle>
  </mrow>
</math>

I'm not sure how easy it would be to implement this. For this particular case it should be doable, but in general it might be difficult, as one needs to differentiate between operator names and invisible multiplication.

It would be easy to merge every sequence of alphabetical Literals inside \operatorname (and also inside \textit, \textrm, etc.) by modifying the squashLiterals method of TexArray. However, it would not work for macros that return text (like \gamma): for that, we would need to squash Literals after macro expansion, so either we would need a new compilation phase, or we would have to squash the resulting MathML (which would be difficult). But I think this is not a problem, because Greek letters and other non-standard letters inside \operatorname should be very rare, no?
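To illustrate the idea and its limitation with macros, here is a minimal Python sketch. It is not the extension's actual squashLiterals code (which is PHP); the function name squash_literals and the representation of tokens as plain strings are assumptions made purely for illustration.

```python
from itertools import groupby


def squash_literals(tokens):
    """Merge consecutive single-letter tokens into one token.

    Hypothetical model: each token is a string, either a single
    character literal ("a") or an unexpanded command ("\\,", "\\gamma").
    """
    out = []
    # Group consecutive tokens by whether they are a single alphabetical character.
    for is_letter, run in groupby(tokens, key=lambda t: len(t) == 1 and t.isalpha()):
        run = list(run)
        if is_letter:
            out.append("".join(run))  # a run of letters becomes one identifier
        else:
            out.extend(run)  # commands and other tokens are kept as-is
    return out


# \operatorname{arg\,max}: the thin space splits the letters into two runs.
squashed = squash_literals(["a", "r", "g", "\\,", "m", "a", "x"])

# An unexpanded macro is not a letter, so it interrupts the run.
with_macro = squash_literals(["a", "\\gamma", "b"])
```

In this model `squashed` is `["arg", "\\,", "max"]`, while `with_macro` stays `["a", "\\gamma", "b"]`, which is the limitation described above: merging before macro expansion cannot absorb text produced by a macro.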

I do not see any other place where we would want to merge ab into a single identifier, so I don't think there would be any possible confusion with invisible multiplication.

I wonder if we also want to merge numeric characters. I think we should: if they appear inside an \operatorname, they should be treated as part of a name, but I don't know the MathML spec well enough. There is also the case of '*.-/:. In TeX, the \operatorname command changes their mathcode (it puts a space after dots and colons, - represents a hyphen instead of a minus sign, ' represents a typographic apostrophe instead of a prime, and there is no spacing around *), so maybe they should be treated as part of the identifier instead of as operators so that they render "as text", but I don't really know the best way to do this.

Also, I think that the whole content of an \operatorname command should be put inside an <mrow data-mjx-texclass="OP">, no? This is what TeX does. Then we would get:

<math xmlns="http://www.w3.org/1998/Math/MathML" class="mwe-math-element">
  <mrow data-mjx-texclass="ORD">
    <mstyle displaystyle="true" scriptlevel="0">
      <mrow data-mjx-texclass="ORD">
        <mrow data-mjx-texclass="OP">
          <mi data-mjx-texclass="OP" mathvariant="normal">arg</mi>
          <mspace width="0.167em"></mspace>
          <mi data-mjx-texclass="OP" mathvariant="normal">max</mi>
        </mrow>
      </mrow>
    </mstyle>
  </mrow>
</math>
Physikerwelt raised the priority of this task from Medium to High. (Sep 18 2025, 5:40 PM)
Physikerwelt updated the task description.

Wow, I never looked at the actual rendering. This looks terrible.

@JeanCASPAR I don't think that operator names should use Greek letters. Are you planning to work on this? There is now an alternative approach to solve this: replace squashLiterals with a postprocessing step that works on the newly introduced MathML tree.

@Physikerwelt I'm working on it, sorry for the delay.

mathoid generates:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" alttext="\operatorname {arg\,max} ">
  <semantics>
    <mrow class="MJX-TeXAtom-OP MJX-fixedlimits">
      <mi mathvariant="normal">a</mi>
      <mi mathvariant="normal">r</mi>
      <mi mathvariant="normal">g</mi>
      <mspace width="thinmathspace"></mspace>
      <mi mathvariant="normal">m</mi>
      <mi mathvariant="normal">a</mi>
      <mi mathvariant="normal">x</mi>
    </mrow>
    <annotation encoding="application/x-tex">\operatorname {arg\,max}</annotation>
  </semantics>
</math>

However, the SVG looks good. Maybe it would be sufficient to remove data-mjx-texclass="OP"?
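One way to try this out locally is to strip the attribute from a copy of the generated MathML and compare the rendering. The following Python sketch does that with the standard library; the helper name strip_op_texclass is an assumption, and the sample is a shortened, namespace-free fragment of the markup from the description.

```python
import xml.etree.ElementTree as ET

# Shortened fragment of the generated markup (namespace omitted for brevity).
SAMPLE_OP = """<mrow data-mjx-texclass="ORD">
  <mi data-mjx-texclass="OP" mathvariant="normal">a</mi>
  <mi data-mjx-texclass="OP" mathvariant="normal">r</mi>
  <mi data-mjx-texclass="OP" mathvariant="normal">g</mi>
</mrow>"""


def strip_op_texclass(root):
    """Remove data-mjx-texclass="OP" from every element; return how many were removed."""
    removed = 0
    for el in root.iter():
        if el.attrib.get("data-mjx-texclass") == "OP":
            del el.attrib["data-mjx-texclass"]
            removed += 1
    return removed


op_root = ET.fromstring(SAMPLE_OP)
removed_count = strip_op_texclass(op_root)
stripped_markup = ET.tostring(op_root, encoding="unicode")
```

The resulting `stripped_markup` can be pasted into a test page to check whether the letter spacing disappears. Whether this alone is sufficient is exactly the open question here.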

Change #1214455 had a related patch set uploaded (by JeanCASPAR; author: JeanCASPAR):

[mediawiki/extensions/Math@master] Group words in operatorname & put the content of \text* commands in a <mtext>

https://gerrit.wikimedia.org/r/1214455

I merged all sequences of <mi> which share the same style attributes.
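As a rough illustration of this kind of merge, here is a Python sketch using only the standard library. It is not the code from the patch (the extension is written in PHP); the function name merge_adjacent_mi is hypothetical, and comparing the full attribute dictionaries is an assumption standing in for "the same style attributes".

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
MI = f"{{{MATHML_NS}}}mi"

SAMPLE = f"""<math xmlns="{MATHML_NS}">
  <mrow data-mjx-texclass="ORD">
    <mi data-mjx-texclass="OP" mathvariant="normal">a</mi>
    <mi data-mjx-texclass="OP" mathvariant="normal">r</mi>
    <mi data-mjx-texclass="OP" mathvariant="normal">g</mi>
    <mspace width="0.167em"></mspace>
    <mi data-mjx-texclass="OP" mathvariant="normal">m</mi>
    <mi data-mjx-texclass="OP" mathvariant="normal">a</mi>
    <mi data-mjx-texclass="OP" mathvariant="normal">x</mi>
  </mrow>
</math>"""


def merge_adjacent_mi(parent):
    """Merge runs of sibling <mi> elements that carry identical attributes."""
    merged = []
    for child in list(parent):
        merge_adjacent_mi(child)  # handle nested rows first
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.tag == MI
            and child.tag == MI
            and prev.attrib == child.attrib
        ):
            # Same attributes: append the text to the previous <mi>.
            prev.text = (prev.text or "") + (child.text or "")
        else:
            merged.append(child)
    parent[:] = merged  # replace the children with the merged list


root = ET.fromstring(SAMPLE)
merge_adjacent_mi(root)
texts = [mi.text for mi in root.iter(MI)]
```

After the call, `texts` is `["arg", "max"]`: the <mspace> sits between the two runs, so they stay separate identifiers, matching the output suggested earlier in this task.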

@JeanCASPAR thank you. You propose a substantial change... much more than just removing data-mjx-texclass="OP"

I like the idea of moving the hbox functionality into the box class. However, this is an atomic change on its own, so I think it deserves its own commit. I would therefore suggest splitting this into two commits:

  1. Refactoring of hbox into box class (no functionality is changed, just code is being restructured)
  2. Replace squashLiterals with your improved implementation

Before you submit to Gerrit, you can check your code locally by running composer test, composer fix, and composer phan. You can also run the unit tests.
In addition, you can create a change similar to https://gerrit.wikimedia.org/r/c/integration/config/+/725954 to be added to the allowlist, so that Jenkins runs the tests for the code you upload.