User Details
- User Since
- Nov 1 2015, 12:12 PM (552 w, 5 d)
- Availability
- Available
- LDAP User
- Debenben
- MediaWiki User
- Debenben [ Global Accounts ]
Mar 29 2025
Nov 28 2020
I do not think it is a good idea to let well-known syntax like \[ influence the rendering outside of the math tags, for the same reason as the other suggestion (searching for \begin{align}, \begin{equation} etc.). It will never have the same functionality as in LaTeX (switching between text and math mode), so it would make the current behavior even more confusing to editors.
Feb 19 2020
Regarding the breaking changes, we have:
(1) version histories
(2) current pages where math would not render
(3) current pages where math would render differently
Feb 18 2020
Spacing in LaTeX is fine, spacing in Mathjax also. We actually do not need to fix spacing, we just have to ensure that the source code that we pass to Mathjax is still correct. The spacing is just one effect those brackets can have, in principle they can change everything, for example "\sin\limits_a^b" is correct, "{\sin}\limits_a^b" does not compile.
The list of functions where the spacing goes wrong with additional braces is everything that is defined like a unary or binary operator, because those get a spacing of \medmuskip. If you place them in braces they become a subformula with the spacing rules of an ordinary math expression. By default you have block layout in LaTeX where the size of all spaces between letters and words depends on how much needs to fit into a particular line.
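A minimal LaTeX sketch of the two effects described above. It is not taken from any wiki page, and the choice of + as the operator is only an illustration:

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Binary operator: TeX puts \medmuskip on both sides of the +.
$a + b$

% Braced operator: {+} is an ordinary subformula, so the extra space is gone.
$a {+} b$

% \limits directly after the operator compiles:
$\sin\limits_a^b$

% Wrapping the operator in braces, i.e. {\sin}\limits_a^b, stops with
% "Limit controls must follow a math operator", so it is left commented out.
\end{document}
```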
Feb 17 2020
I guess most people know, but just to clarify what I mean with "{\sqrt[{b}]{c}}" not being a real error: In LaTeX it depends on the definition of \sqrt and where it is placed. With the normal definition it removes extra spaces around \sqrt which in this case is not a problem because there would be no extra spacing anyway. In other situations, e.g. for relations like "a\xrightarrow\alpha b" or "a\stackrel\alpha= b" you still get wrong spacing in Wikipedia due to those extra braces.
For the more complex example "\sqrt{[} ..." is a real error. In the simple example "{\sqrt[{b}]{c}}" is only hard to read, not a real error. Still, if the example is more complex, keeping spaces and newlines is even more important. For comparison: In C++ you can also remove newlines, scramble white-spaces, add unnecessary scopes and rename variables such that the output is still correct, but not human-readable anymore.
Feb 15 2020
I lost you. Are you saying it should fail or it should not?
Yes, it should fail, and if it renders, it should render differently.
@Physikerwelt \Omicron being upright is fine, see https://tex.stackexchange.com/questions/119248.
Feb 10 2020
@Physikerwelt Thank you very much for creating the pull request, it is definitely a very good first step in the right direction. I guess nobody really cares about the name of the category, we could simply use "Pages that use a deprecated format of the math tags" in analogy to the chem tracking category. I am really happy with the pull request and surprised to see how much work this was but at the same time also a bit disappointed because I was hoping for more.
Feb 8 2020
Replacing \exist with \exists sounds reasonable, it should be no problem. My database dump search found 1302 pages in all projects with \exist. Without the texvc changes there might be other things creating problems which we are not aware of and could be fixed with a bot, so I would like to wait for the error categories such that we do not need to edit pages twice.
Dec 8 2019
@Physikerwelt Thank you for creating this ticket, I think this change is long overdue. What are your thoughts regarding the implementation, in particular:
- does the new rendering mode still need this restbase setup? I am just asking because getting rid of this would greatly reduce the complexity and dependencies, would make it much easier for people to contribute to the development of the math extension and install it on their own servers.
- as far as I know the legacy rendering mode for mhchem package is no longer supported in mathjax 3.0. In my opinion this should not block the transition to the new rendering mode because this change is also long overdue, we just have to keep in mind that it will break some mhchem formulas. I can take care of this and make sure that they will get fixed.
- the new rendering mode should conserve the original LaTeX input, i.e. not have something like texvcjs (T188879). The necessary syntax modifications are done (T197925), except for those that we could not extract from the database dumps. If we want to fix those pages we need the new rendering mode assigning error categories in real time.
- we should take the new rendering mode as an opportunity to discuss getting rid of "default" rendering mode and making the "default" without parameters "inline" (textstyle).
Jul 20 2019
@Physikerwelt what do you mean with "exist as alias for exist"?
Jun 26 2019
It is easy to perform some sanitization or conversion on the original string if you need it.
May 4 2019
@GregorAlexandru We are actually planning to get rid of the texvcjs so that in the future we can just refer to the MathJax documentation.
Dec 27 2018
I put some thoughts into the problem of replacing math generated by templates like in https://de.wikiversity.org/wiki/Kommutative_Ringtheorie/Algebra-Homomorphismus_%C3%BCber_Ring/Definition where <math>K</math> is produced with a source code
Dec 21 2018
@SalixAlba We have to change our algorithms to look for math tags. dewikiversity is using {{#tag: math}} via https://de.wikiversity.org/wiki/Vorlage:Math on more than 18 000 pages.
Dec 11 2018
Addendum: Another reason for a block formula is of course if the formula requires too much vertical space to fit into a normally spaced line
@Izno I do not know what the best way of implementing it in the extension is, but from an author point of view both are treated equally as part of a sentence and there is no difference between inline and block formula except for the size of the equation.
Dec 7 2018
Or with T209148 in mind, why not disable lazy loading for math "images" completely? After all they are not images, but part of the text.
If a symbol people need is missing or looks ugly, they will try to rewrite equations to make it work without the symbol or use some workarounds, e.g. in this case probably something like \int_C and then specify what C means. Those line integrals are common in complex analysis (residue theorem) and hence for example also in electrodynamics, I would estimate the number of articles where they could be used to more than 100 on enwiki.
I think people do not want to use \oiint or \oiiint because the integral symbol looks bad, they will prefer workarounds if they look better in their setup. It is probably not the biggest problem, but I would only consider the integrals in T182127 done, if they actually produce a rendering that is not ugly. I will close this task because I think it does not help.
Nov 29 2018
Nov 16 2018
@SalixAlba I pushed a commit that should fix it
Nov 14 2018
@SalixAlba thanks for catching that. This is something we overlooked:
Nov 11 2018
We really need an option that covers all use-cases (especially because you have to login to get to the options) but currently we don't and these options are very helpful for debugging e.g. T194768 so they should stay for now.
I think this was a problem of MathML and is fixed in newer firefox versions.
@ovasileva I just created a pdf of https://de.wikipedia.org/wiki/Satz_des_Pythagoras and half of it is not shown at all and the other half still looks horrible:
The characters probably still look bad, but there should not be any errors anymore since png is also using MathJax now.
The rendering of Malayalam is still not very good, but there is no error anymore, since the png is created from the svg images now.
Should be resolved since png is also using MathJax now.
Nov 9 2018
Nov 7 2018
@Physikerwelt For example in the mobile view:
Nov 5 2018
The custom "Wikipedia syntax" suggested here would be a terrible idea. We are having enough trouble with non-standard LaTeX syntax already. At times, especially with mhchem or optional arguments (which use square brackets) it seems random, how an equation renders and if not what kind of error message you get. Even as an experienced mediawiki-user you can spend hours trying to find out where those strange error messages come from and how to fix them.
Nov 2 2018
What is currently being done is delivering the same static svg images to everyone. What I propose is to do the final rendering client-side like in normal MathJax, so that e.g.
@Pkra About inlining SVG: Maybe you are right and in principle SVG could become a better solution than HTML. I don't like the idea of taking the current system and just removing the image tags because this is not enough to solve those problems. The most probable case is that we don't have enough manpower to actually do the things that "might be doable" and "require a bit of work" and the result is that we are stuck with an over-customized, substandard, unmaintained system forever.
Nov 1 2018
@SalixAlba Thank you for taking care of the botflag on enwiki, feel free to take responsibility and request a botflag on any other project you like.
Oct 31 2018
Oct 30 2018
@Theklan: We are currently discussing some changes to the math extension here: T195861, you are welcome to discuss this issue. Currently the plan is to
- get a correct rendering like in normal MathJax / LaTeX for all equations, especially mhchem
- render non-ASCII-characters like Cyrillic letters, äöü etc. properly in all browsers such that \text can be used with all languages that need special characters
As you can see this is a tremendous amount of work, progress is quite slow and everything relies on volunteers. I don't think changing the syntax for existing formulas with commas is feasible and maintaining the current system without additional localization features already overstretches our resources. I think it would be a better idea to just keep using the well-known {,} for decimal comma and make sure that help pages mention this LaTeX hack.
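For readers who do not know the {,} hack mentioned above, a short LaTeX illustration (the number is arbitrary):

```latex
% In math mode a bare comma is punctuation and is followed by a thin space.
$3,14$    % rendered with a gap after the comma

% Braces turn the comma into an ordinary symbol, so no space is added.
$3{,}14$  % rendered as a decimal number
```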
Aug 19 2018
@SalixAlba Thank you for finding the problem with the unmatched math tags and fixing it, also @Framawiki thanks for your pull request.
Aug 7 2018
I think one has to differentiate: Of course there are large formulas which cannot be broken and need some scrolling mechanism. Most formulas however have a structure like <math>A=B+C+D</math> which you can break almost everywhere if necessary. Wikipedia authors currently have a choice between
- Adding permanent linebreaks or splitting the expression up in smaller chunks which means making it less readable by wasting space on large screens and/or introducing unnecessary artificial names for parts of the sum
- Not adding any linebreaks, making the expression unreadable on small screens or introducing unnecessary scrolling or zooming.
- Adding linebreaks like
First sentence
:<math>A</math><math>\;=B</math><math>\;+C</math>{{nowrap|<math>\;+D</math>.}}
Next sentence.
to force a correct linebreaking behaviour and an equal-looking punctuation mark.
Aug 4 2018
Yes, definitely. This would also solve part of T194768. With JS and Mathjax you can get this functionality easily, however in Wikipedia it seems to be a huge problem because most people get to see the math as images rather than text. We are trying to come up with a solution and would need some input from developers, see T195861.
Jul 25 2018
The goal of the project would be to verify that every mathematical formula uses proper LaTeX syntax. Because LaTeX is based on macros, this can be quite complicated and the only way to be 100% sure is to use LaTeX and render it. For example MathJax treats \overline and similar primitives like normal macros, thus it is sometimes more tolerant than other rendering engines.
It turns out there is no problem with setting the botflag. I was expecting a fat B to show up in the version history, but this is only shown in recent changes.
Jul 11 2018
The bot has bot-rights on dewiki, but saving with botflag=True doesn't set the botflag. Any idea?
Jul 1 2018
Thanks for the fix with the negative lookahead.
Jun 30 2018
Thanks for finding the problem with the login. The problem with the missing \and replacement could be that [^\\] doesn't match at the beginning of the string when there is no character to match.
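The behaviour described above can be reproduced with Python's re module. This is only a sketch: the patterns and the \and to \land replacement are illustrative assumptions and are not copied from mathwikibot.py.

```python
import re

# Hypothetical replacement: turn the texvc macro \and into \land.
# Variant 1 consumes one preceding non-backslash character, so it cannot
# match when \and is the very first thing in the string.
CONSUMING = re.compile(r"([^\\])\\and(?![a-zA-Z])")

# Variant 2 uses a negative lookbehind, which consumes nothing and therefore
# also succeeds at position 0.
LOOKBEHIND = re.compile(r"(?<!\\)\\and(?![a-zA-Z])")


def replace_consuming(tex: str) -> str:
    """Replace \\and, putting back the character consumed in front of it."""
    return CONSUMING.sub(r"\1\\land", tex)


def replace_lookbehind(tex: str) -> str:
    """Replace \\and wherever it is not preceded by a backslash."""
    return LOOKBEHIND.sub(r"\\land", tex)
```

With these definitions, replace_consuming leaves a formula that starts with \and untouched, while replace_lookbehind rewrites it. Both skip longer macro names such as \andalso because of the negative lookahead.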
@SalixAlba It seems the problem with the login came with the try catch block, maybe it goes away when we remove it. It seems like the bot is trying to login when it is already logged in.
Jun 27 2018
Jun 24 2018
@SalixAlba I created a version of mathwikibot.py that should ensure that the same part of the file is being replaced. It is not extensively tested, I created a "finditer" branch so that we can keep using the master branch in case it doesn't work.
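One way to make sure the replacement lands on the same part of the file is to work with the match offsets from re.finditer and splice the text back together, instead of searching for the old formula a second time. The sketch below is not the code in the "finditer" branch; the tag pattern and the \or to \lor rule are assumptions for illustration.

```python
import re

# Assumed tag pattern: <math> or <math ...attributes>, body, closing tag.
MATH_TAG = re.compile(r"(<math\b[^>]*>)(.*?)(</math>)", re.DOTALL)

# Hypothetical single rule: the texvc macro \or becomes \lor.
OR_MACRO = re.compile(r"(?<!\\)\\or(?![a-zA-Z])")


def fix_formula(tex: str) -> str:
    """Apply the illustrative replacement to one formula body."""
    return OR_MACRO.sub(r"\\lor", tex)


def rewrite_math(wikitext: str) -> str:
    """Rewrite only the bodies of math tags, using their exact offsets."""
    pieces = []
    last = 0
    for match in MATH_TAG.finditer(wikitext):
        # Copy everything up to the start of this formula body unchanged.
        pieces.append(wikitext[last:match.start(2)])
        # Replace exactly the span that was matched, nothing else.
        pieces.append(fix_formula(match.group(2)))
        last = match.end(2)
    pieces.append(wikitext[last:])
    return "".join(pieces)
```

Because the text outside the matched spans is copied verbatim, an identical string that occurs outside a math tag cannot be changed by accident.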
@TheDJ Thanks for the requirement list. Am I right that those are user-convenience features we "should" meet and not "hard" requirements breaking the MediaWiki software? I have the impression that lazy and asynchronous loading might conflict with the requirement to not reflow the page.
Jun 22 2018
@Pkra DavidEppstein already said the most important issues:
- Properly render things like <math>\text{это хорошо выглядит?}</math>, <math>\text{für alle}</math> to have the same quality and appearance of the text outside
- The ability to copy-paste parts of an equation
Some other things I would add to the list:
- Linebreaking for small screens, snippet-previews etc. (some people put linebreaks like <math>a</math>{{nowrap|<math>\;=b</math>.}} )
- A referencing system with automatic numbers aligned left or right
Except for the copy-paste problem I see them as blocking issues which we need to have solved before we can address the "create a viable distinction of inline and block formulae" problem, because otherwise you need the capability of composing one block-formula with several math tags and templates and the :<math> markup which in turn produces invalid html, some more rendering issues and also problems for inexperienced editors that try to edit formulas with the VE.
@all: Concerning the output:
@SalixAlba Thanks for identifying and fixing potential problems. I have no idea why it doesn't replace \bold there. I do not know an easy solution that would guarantee that the last replace matches the same part of the file, but so far it did not cause any problems.
@Physikerwelt For the simple substitutions we are doing at the moment, the source code of the images should be identical, so we could do a comparison to make sure we don't do any replacement texvcjs doesn't do:
Jun 20 2018
@Physikerwelt In general I would agree, after all the whole reason for the replacement is that texvcjs is not handling LaTeX properly. We will not be able to get rid of all errors, no matter if we use a proper LaTeX parser on the broken texvc syntax or the texvc parser outputting broken LaTeX syntax.
Jun 17 2018
Thank you, I can push to it now, but only from my home computer and I also can't edit the access rights and add @Physikerwelt and @SalixAlba
@Physikerwelt: Yes, we can, however we should remove the file with the password and the file with the bot password from the public repo. There is already a git repository attached to the project, however I did not manage to push anything to it, it seems like I was able to create it, but only repository-admins can change and give permissions to push to it and I am not in that group.
@SalixAlba @Physikerwelt I added some more safeguards and I think we are ready to request bot flags. I would suggest to start on en.wikipedia with the pages https://en.wikipedia.org/wiki/User:Texvc2LaTeXBot/enwiki (inputlist_enwiki.txt)
Jun 16 2018
Jun 14 2018
@SalixAlba @Physikerwelt I created a project "texbot" on toolforge with an account "Texvc2LaTeXBot" and a pywikibot script "mathwikibot.py" that can handle the replacements and added you as maintainers (I did not find a tool account for the other members of our commission, otherwise I am happy to add them as well). Now we still need a bot flag and some documentation, and we need to agree on the list of pages to be processed. For the start I would suggest to only run it on pages containing unescaped $ % and \or and just on one project (en.wikipedia?) but do all replacements in the list since we have to edit the article anyway.
Jun 13 2018
@DavidEppstein I don't use the templates, so just edit the task description.
@SalixAlba You are right, I counted the number of records which somehow doesn't match the number of equations. Summing up the original output
awk 'BEGIN{num=0; }; /total\smath/{num += $3;}; END{print num;};' output/*stats.txt only gives 6672223 <math> and <math chem> and 15218 <chem> or <ce> tags.
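For anyone who prefers Python, the awk one-liner can be approximated as follows. The layout of the stats files is not shown here, so the assumption that a matching line looks like "total math 123", with the count as the third whitespace-separated field, is a guess based on the awk pattern.

```python
import re


def sum_total_math(lines):
    """Add up the third field of every line containing 'total math'."""
    total = 0
    for line in lines:
        # Same test as the awk pattern /total\smath/.
        if not re.search(r"total\smath", line):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            total += int(fields[2])
        except ValueError:
            # Unlike awk, which coerces non-numeric fields, skip them here.
            pass
    return total
```

To run it over the files, the lines of each output/*stats.txt file could be passed to sum_total_math and the results added up.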
@SalixAlba That is nice, but are you sure about the numbers? My last run counted 3,769,060,639 math and chem equations across all projects except wikidata.
Jun 11 2018
@Tacsipacsi At the moment we are actually discussing changes to the syntax in T195861, it would be nice if you would join our commission. From my perspective the most important part would be that national characters like ü in \text{für alle} which can be entered, also get an acceptable rendering in all browsers.
Jun 10 2018
@Physikerwelt That is nice, I did not know about the package. I guess that means we only have to replace $ % and \or ?
@all I have written down the "Quick and dirty syntax update strategy" I would suggest: https://www.mediawiki.org/wiki/Extension:Math/Roadmap. I deleted most other things that were on the page before, please don't hesitate to re-add things or change it.
Jun 9 2018
@SalixAlba I think that is a good idea. But we should not assume that in every project there are people with knowledge of LaTeX monitoring the error categories and happy to do this work. While it would be feasible to do all those replacements manually, it is such a simple task that we should do it with a global bot.
Jun 8 2018
Lists with improved wikilinks for the problematic commands:
- https://www.mediawiki.org/wiki/User:Debenben/and 9537
- https://www.mediawiki.org/wiki/User:Debenben/C 3779
- https://www.mediawiki.org/wiki/User:Debenben/dollar 2280
- https://www.mediawiki.org/wiki/User:Debenben/H 603
- https://www.mediawiki.org/wiki/User:Debenben/or 4197
- https://www.mediawiki.org/wiki/User:Debenben/pagecolor 701
- https://www.mediawiki.org/wiki/User:Debenben/part 2893
- https://www.mediawiki.org/wiki/User:Debenben/percent 4245
depending on the regex, it only takes around 5 to 30 seconds to create them, so feel free to request them for any pattern you want. For \ce I found 17144 occurrences, but I found a bug in my extraction script cutting off <chem> patterns containing >, so I am now running the script to extract from the dumps again.
Jun 7 2018
@DavidEppstein Yes, that is quite easy now. I thought of listing them like https://www.mediawiki.org/wiki/User:Debenben/pagecolor. What do you think? I did not run it on the other problematic commands yet because I wanted to make the wikilinks point to the correct pages. I might be able to do that tomorrow, then we also have complete lists for all problematic commands.
I have put together a header for LaTeX that can handle most of the texvc macros: https://www.mediawiki.org/wiki/Extension:Math/TeX-header I would suggest to get rid of everything in the "problematic texvc definitions" paragraph.
@Physikerwelt that is very strange. For the same input I get those error messages on MediaWiki, see https://www.mediawiki.org/wiki/User:Debenben/test
Jun 6 2018
@mhchem The impression I get when running awk '/\\ce/{print $0;}' *_math.txt on my files is that something like 90% will render correctly, 9% we can repair before switching and 1% we have to clean up afterwards. However, I could only tell for sure if we had a working regression test like the one proposed above.
