Create a bot that replaces :<math with <math display=block
Open, Needs Triage, Public

Description

There seems to be consensus that it would be a good idea to create a bot that replaces

:<math

with

<math display=block

Event Timeline

I think this is a good idea. See also the recent changes to the en.wikipedia Manual of Style for mathematics formatting, https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Mathematics#Using_LaTeX_markup, which recommend display=block over :-indentation.

Note, however, that this should only be done when the ONLY content of the indented line is a single math block. There will be some instances where a displayed formula drops out of math to add some additional text on the same line, and those will probably need a human editor to figure out what to do with them.

Maybe a maintenance template would be good for those so it is easy to find them.

I was about to write that there's a case where additional text on the same line can be handled automatically, namely when it's just punctuation. Putting a comma or a period outside the </math> tag is a fairly common typographical error which feels like it should be trivially fixable. But there are cases where that's wrong, like when the <math> tag ends with \end{align}. And there are cases which require editing the contents of the <math> tag, like when someone ended the formula with explicit whitespace like \, and \! to avoid HTML output. So really I think that this bot should:

  1. Find lines starting with :, followed by any amount of whitespace, followed by <math>. (Yes, there are articles with whitespace there; I've never understood why.)
  2. If there is a single <math> ... </math> block on that line, and if the line contains only whitespace after the </math>, convert the initial :<math> to <math display="block">
  3. Otherwise, tag the line with a maintenance template.
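
The three steps above can be sketched in Python roughly as follows. This is only an illustration of the proposed rule, not an existing bot: the function name `convert_line`, the template name in `MAINTENANCE_TAG`, and the choice to append the tag at the end of the line are all assumptions made for the example, since the thread does not specify them.

```python
import re

# Step 1: a single leading colon, optional whitespace, then a bare <math> tag.
CANDIDATE = re.compile(r"^:\s*<math>")
# Step 2: a <math>...</math> block followed only by whitespace up to end of line.
SIMPLE = re.compile(r"^:\s*<math>(?P<body>.*?)</math>\s*$")

# Hypothetical template name; the discussion does not name one.
MAINTENANCE_TAG = "{{math display needs review}}"


def convert_line(line: str) -> str:
    """Apply the proposed rule to one line of wikitext."""
    if not CANDIDATE.match(line):
        return line  # not a candidate: leave untouched
    m = SIMPLE.match(line)
    # The count checks reject lines holding more than one math block.
    if m and line.count("<math") == 1 and line.count("</math>") == 1:
        return f'<math display="block">{m.group("body")}</math>'
    # Step 3: anything else is left for a human, with a tag appended.
    return line + " " + MAINTENANCE_TAG
```

For example, `convert_line(":<math>F=mv</math>")` gives `<math display="block">F=mv</math>`, while a line with a trailing comma after `</math>` is kept and tagged. Lines with more than one leading colon do not match step 1 as written, so this sketch leaves them unchanged.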

Sounds good to me. Will this be applied only to enwiki or to all wikis? For example, there are quite a few uses of math on Wikiversity.

I think it would make sense to do it everywhere, subject to local consensus.

I'm now running my script to look for awkward cases.

In https://en.wikipedia.org/wiki/Aristotle we have

:: <math>F=mv</math>,

with a double indent and a trailing comma.

In https://en.wikipedia.org/wiki/Arithmetic_mean

:<math>A=\frac{1}{n}\sum_{i=1}^n a_i=\frac{a_1+a_2+\cdots+a_n}{n}</math><ref>{{Cite web|last=Weisstein|first=Eric W.|title=Arithmetic Mean|url=https://mathworld.wolfram.com/ArithmeticMean.html|access-date=2020-08-21|website=mathworld.wolfram.com|language=en}}</ref>

we have a complex ref on the same line after the formula.

https://en.wikipedia.org/wiki/Talk:M%C3%B6bius_function has math formulas at various levels of indentation, some quite deeply nested:

:::<math>\sum_{k=1}^n \mu(k)=M(n).</math>

This is not limited to talk pages. https://en.wikipedia.org/wiki/Haar_measure has

* Left translate:
::<math> g S = \{g\cdot s\,:\,s \in S\}.</math>
* Right translate:
::<math> S g = \{s\cdot g\,:\,s \in S\}.</math>
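
The kinds of awkward cases listed above could be sorted into rough categories with something like the following. This is a hypothetical sketch, not the script mentioned earlier in the thread: the function name `awkward_reasons` and the category labels are invented for illustration, and it looks at one line at a time, so it cannot tell that an indented formula follows a `*` list item as in the Haar measure example.

```python
import re

# One or more leading colons, optional whitespace, the first <math> block,
# and whatever follows it on the same line.
INDENTED_MATH = re.compile(r"^(?P<indent>:+)\s*<math>.*?</math>(?P<tail>.*)$")


def awkward_reasons(line: str) -> list[str]:
    """Return labels describing why a line is not a simple :<math> line."""
    m = INDENTED_MATH.match(line)
    if m is None:
        return []  # not an indented math line at all
    reasons = []
    if len(m.group("indent")) > 1:
        reasons.append("multiple indent")
    tail = m.group("tail").strip()
    if tail.startswith("<ref"):
        reasons.append("trailing ref")
    elif tail and all(ch in ".,;:" for ch in tail):
        reasons.append("trailing punctuation")
    elif tail:
        reasons.append("trailing content")
    return reasons
```

On the Aristotle line this reports both a multiple indent and trailing punctuation; on the Arithmetic mean line it reports a trailing ref; on the Möbius function and Haar measure lines it reports only the multiple indent.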