
Performance review of MathJax client-side mode
Open, Needs Triage, Public

Description

[…] we now aim to enable the MathJax client-side enhancement by default on top of the native MathML <math> mode, as implemented in MediaWiki PHP. […]
MathJax has gotten significantly smaller and faster in recent releases, making widescale deployment on Wikipedia likely feasible.
There has been a major upgrade since last time [that MathJax was reviewed], and the standard [to review for] is now for all users rather than opt-in.

Change merged: Update MathJax to version 4.0.0
https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Math/+/1201204

Event Timeline

Test setup:

  • latest master of MediaWiki and Math extension.
  • default configuration.
  • logged-in
  • change math preference from default "native" (MathML) to MathJax (MathML+MathJax).

Measuring decompressed payload size via mw.inspect() on the console.

This measures the decompressed size of the source code that the browser has to parse and compile. It serves as a proxy for compilation time and runtime complexity, which matter more nowadays than transfer size (per the Frontend performance guidelines).

  • MathJax v3: 1,136 KiB (1163404 bytes). This is before T409110, using git checkout 8e55d641b5558ccc6ec6b8d3e706f847c7f37c29^
  • MathJax v4: 940 KiB (963330 bytes). This is on latest master.

That's 17% smaller. Significant progress, but still much larger than I think we'd want to load eagerly on a logged-out pageview to render a math formula. I was expecting a payload closer to 100 KiB (30 KiB after gzip).
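As a quick sanity check of the arithmetic, here is a minimal sketch that recomputes the KiB figures and the relative reduction from the byte counts quoted above. The helper names are made up for illustration, and the 100 KiB target is the expectation stated above, not a measured value.

```python
# Byte counts as reported by mw.inspect() in the comment above.
V3_BYTES = 1_163_404  # MathJax v3
V4_BYTES = 963_330    # MathJax v4

# Expected payload mentioned in the comment, in KiB (an expectation, not a measurement).
TARGET_KIB = 100


def to_kib(n_bytes: int) -> float:
    """Convert bytes to KiB (1 KiB = 1024 bytes)."""
    return n_bytes / 1024


def percent_smaller(old: int, new: int) -> float:
    """Relative size reduction from old to new, in percent."""
    return (old - new) / old * 100


print(f"v3: {to_kib(V3_BYTES):.1f} KiB")
print(f"v4: {to_kib(V4_BYTES):.1f} KiB")
print(f"reduction: {percent_smaller(V3_BYTES, V4_BYTES):.1f}%")
print(f"v4 relative to target: {to_kib(V4_BYTES) / TARGET_KIB:.1f}x")
```

So even after the v4 upgrade, the bundle is still roughly nine times the size I was hoping for.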

@Physikerwelt Is this the mode I should review, or is there a smaller build focused on our needs?

I notice that with the above setup, MathJax offers an extensive UI on right-click with many non-default features. I see that some of that is lazy-loaded (e.g. the language files). Is most of the bundle really needed for the default rendering?

mwinspect-mathjax-v3.png (769×1 px, 79 KB)

mwinspect-mathjax-v4.png (1×1 px, 258 KB)

@Krinkle, there is nothing else. I recall that previous versions of MathJax used to measure around 10 MB when I was using it, as they downloaded large fonts. This is the major improvement I was referring to. I suggest asking upstream what size is expected, to understand whether this comes from our ResourceLoader or is a problem with MathJax. What you can do right now is pass debug=true, and then you will see which files are actually used.

Screenshot 2026-01-08 at 13.33.48.png (724×1 px, 90 KB)

Here are some ideas:

Screenshot 2026-01-08 at 13.38.14.png (642×4 px, 354 KB)

I am a bit surprised that a CDN is used. I don't recall seeing that before, and I am a bit concerned that this raises a privacy issue.

@Physikerwelt Once T414994 is done, we'll be lazyloading as much as we can. I believe that includes deferring until the math formula is on-screen, is that right?

I was chatting with @IBerker-WMF last week, and he suggested using complexity as a threshold. I wonder if that would work, server-side?

This way, if <math> is used inline for one or two symbols or a very short/simple expression within the lead paragraph, the default would be mathml. But if the expression is more complex, the default would be mathml+mathjax, on the assumption that the simple cases would render identically or without objectionable differences.

In either case, the page as a whole would presumably be consistent, because if at least one math formula on the page queues the JS payload, it will enhance all <math> elements equally. It would also mean that on most pages it perhaps wouldn't be loaded until/unless needed.

Does this seem feasible? What would it take to plot a distribution of some measure of complexity (e.g. based on the most complex formula on a given page)? If it turns out that a majority are below a low threshold, that might be viable.
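To make the idea concrete, here is a rough sketch of what such a check could look like. Everything in it is an assumption for illustration: the complexity measure (number of MathML elements under the root), the threshold value, and the function names are hypothetical and not part of the Math extension.

```python
import xml.etree.ElementTree as ET

# Hypothetical threshold; no concrete value has been proposed in this task.
COMPLEXITY_THRESHOLD = 5


def complexity(mathml: str) -> int:
    """Crude complexity proxy: number of elements below the <math> root."""
    root = ET.fromstring(mathml)
    return sum(1 for _ in root.iter()) - 1  # iter() includes the root itself


def page_needs_mathjax(formulas: list[str], threshold: int = COMPLEXITY_THRESHOLD) -> bool:
    """Decide once per page, based on its most complex formula,
    so that all <math> elements on the page are treated the same way."""
    return max((complexity(f) for f in formulas), default=0) > threshold


simple = "<math><mi>x</mi></math>"
fraction = (
    "<math><mfrac>"
    "<mrow><mi>a</mi><mo>+</mo><mi>b</mi></mrow>"
    "<mrow><mn>2</mn><mi>c</mi></mrow>"
    "</mfrac></math>"
)

print(complexity(simple), complexity(fraction))
print(page_needs_mathjax([simple]))
print(page_needs_mathjax([simple, fraction]))
```

Running `complexity` over the stored MathML of every formula and bucketing the per-page maximum would give the distribution asked about above. Whether element count is a good stand-in for "renders acceptably in native MathML" is an open question.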

> @Physikerwelt Once T414994 is done, we'll be lazyloading as much as we can. I believe that includes deferring until the math formula is on-screen, is that right?

I was not able to get it to work with our quite extensive config, so it's hard to tell what would happen if https://docs.mathjax.org/en/v4.0/output/lazy.html worked. I believe there is some pre-fetching, so that a formula is loaded about 200px before it becomes visible.
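For illustration, the pre-fetch behaviour described above amounts to testing each formula against a viewport that has been enlarged by a margin on the top and bottom. The sketch below models only that geometry. The function name is hypothetical, the 200px default is the unverified figure from this comment, and this is not MathJax's actual implementation.

```python
def within_prefetch_zone(
    top: float,
    bottom: float,
    viewport_height: float,
    margin: float = 200.0,  # assumed from the comment above, not verified
) -> bool:
    """Return True if an element spanning [top, bottom] (px, relative to the
    top of the viewport) intersects the viewport extended by `margin` px
    above and below."""
    return bottom >= -margin and top <= viewport_height + margin


# With an 800px-high viewport:
print(within_prefetch_zone(900, 950, 800))    # 100px below the fold
print(within_prefetch_zone(1200, 1250, 800))  # 400px below the fold
print(within_prefetch_zone(-300, -250, 800))  # scrolled 250px past the top
```

Under this model, a formula just below the fold would already be typeset before the reader scrolls to it, while formulas further down the page would not trigger any work.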

> I was chatting with @IBerker-WMF last week, and he suggested using complexity as a threshold. I wonder if that would work, server-side?
>
> This way, if <math> is used inline for one or two symbols or a very short/simple expression within the lead paragraph, the default would be mathml. But if the expression is more complex, the default would be mathml+mathjax, on the assumption that the simple cases would render identically or without objectionable differences.

I don't think it would be easy, nor would it save a lot. Even in the current setup, the initial load is quite heavy. I measured a 300 KB initial load in lazy-load mode, compared to 600 KB for the entire rest of MediaWiki (not only JS).

> In either case, the page as a whole would presumably be consistent, because if at least one math formula on the page queues the JS payload, it will enhance all <math> elements equally. It would also mean that on most pages it perhaps wouldn't be loaded until/unless needed.
>
> Does this seem feasible? What would it take to plot a distribution of some measure of complexity (e.g. based on the most complex formula on a given page)? If it turns out that a majority are below a low threshold, that might be viable.

I am a bit skeptical. In that case I would rather go with MathML and invest the time in some JS polyfills to improve spacing etc. We can copy from https://github.com/w3c/mathml-polyfills to fill in some gaps. @Jdlrobson pointed us to the notion of a skip function, which would make those polyfills a standard MediaWiki thing rather than a solution that is special to the Math extension and that nobody can maintain. In general, it might be more effective to have a Zoom session to discuss.