Many character sets don't work in texvc
OpenPublic

Description

The page (as of today, 22:18 CET) contains a math formula that produces the above error.
The editor said he changed the formula several times and got different errors,
most of them probably syntax errors. This error message, however, complains not
about syntax problems but about installation problems, which is very weird.

jeronim_ checked and said:

22:20:34 <@jeronim_> i dunno what that math error is about
22:20:45 <@jeronim_> it's not just yongle, it's other machines
22:21:35 < grin> jeronim_: editor said he changed a word in the math
22:21:49 < grin> jeronim_: and the error come up. maybe page history shows it,
I try to check
22:22:03 <@jeronim_> sorry i can't really help beyond just seeing if the right
software is installed
22:22:11 <@jeronim_> you'd have to ask someone else


Version: unspecified
Severity: normal
URL: https://meta.wikimedia.org/wiki/Help:Displaying_a_formula?oldid=3698791#Rendering
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=36496
https://bugzilla.wikimedia.org/show_bug.cgi?id=48032

bzimport added projects: Math, I18n.Via ConduitNov 21 2014, 7:02 PM
bzimport set Reference to bz798.
grin created this task.Via LegacyOct 28 2004, 8:29 PM
grin added a comment.Via ConduitOct 29 2004, 12:37 PM

buggy math moved to talk page until fixed.

grin added a comment.Via ConduitOct 29 2004, 1:09 PM

Bug:
*<math> \mbox{pá} - </math> - bad
*<math> \mbox{pa} - </math> - good

Seems to be something messed up about accented/utf8 chars and minus sign.

bzimport added a comment.Via ConduitMar 26 2005, 10:37 AM

foenyx wrote:

*** Bug 1759 has been marked as a duplicate of this bug. ***

brion added a comment.Via ConduitApr 1 2005, 8:38 PM
*** Bug 1799 has been marked as a duplicate of this bug. ***
brion added a comment.Via ConduitOct 20 2005, 2:32 AM
*** Bug 3752 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitOct 20 2005, 8:02 PM

bggoldbg wrote:

IMHO bug 3752 is not a defect but missing functionality in TeX. For example, I do
not know whether Knuth's fonts have Cyrillic letters at all. Thus I
consider this a request for enhancement.

brion added a comment.Via ConduitDec 6 2005, 11:12 PM
*** Bug 4199 has been marked as a duplicate of this bug. ***
brion added a comment.Via ConduitJan 8 2006, 10:06 PM
*** Bug 4533 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitJan 8 2006, 10:18 PM

jan.kraljic wrote:

Is there any work going on to solve this bug?

brion added a comment.Via ConduitJan 8 2006, 11:19 PM

There's not really anyone who's familiar with the TeX stuff who's been
active in the last couple years.

bzimport added a comment.Via ConduitJan 18 2006, 9:12 PM

angus wrote:

The reason for this problem could simply be that TeX is trying to load the
package ucs.sty and dies when it does not find it. If that's the case, you
should either run

  # apt-get install latex-ucs

or apply the attached patch. (But installing the package is better because it
covers a large unicode range.)

Index: texutil.ml

RCS file: /cvsroot/wikipedia/phase3/math/texutil.ml,v
retrieving revision 1.12
diff -u -r1.12 texutil.ml
--- texutil.ml 12 Jan 2006 20:38:31 -0000 1.12
+++ texutil.ml 18 Jan 2006 21:05:37 -0000
@@ -44,7 +44,7 @@
 let tex_mod_reset () = (modules_ams := false; modules_nonascii := false;
 modules_encoding := UTF8; modules_color := false)

 let get_encoding = function
-    UTF8 -> "\\usepackage{ucs}\n\\usepackage[utf8]{inputenc}\n"
+    UTF8 -> "\\usepackage[utf8]{inputenc}\n"
   | LATIN1 -> "\\usepackage[latin1]{inputenc}\n"
   | LATIN2 -> "\\usepackage[latin2]{inputenc}\n"
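For readers who do not read OCaml, the effect of the patch can be sketched in Python. This is only an illustration of the idea, not the texvc code: the function name `get_encoding` mirrors the OCaml one, but the `have_ucs` flag is a hypothetical parameter added here to show both variants side by side.

```python
# Illustrative Python mirror of the preamble selection in texutil.ml.
# The `have_ucs` flag is an assumption for this sketch; texvc has no such switch.

def get_encoding(encoding: str, have_ucs: bool = True) -> str:
    """Return the LaTeX preamble lines that select the input encoding."""
    if encoding == "UTF8":
        if have_ucs:
            # Unpatched behaviour: LaTeX aborts if ucs.sty is not installed.
            return "\\usepackage{ucs}\n\\usepackage[utf8]{inputenc}\n"
        # Patched behaviour: rely on plain inputenc only.
        return "\\usepackage[utf8]{inputenc}\n"
    if encoding in ("LATIN1", "LATIN2"):
        return "\\usepackage[%s]{inputenc}\n" % encoding.lower()
    raise ValueError("unknown encoding: %r" % encoding)


if __name__ == "__main__":
    print(get_encoding("UTF8", have_ucs=False), end="")
```

Dropping ucs avoids the hard failure when the package is missing, at the cost of the smaller character range mentioned above.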
bzimport added a comment.Via ConduitJan 22 2006, 12:10 PM

max wrote:

The problem with the original LaTeX is that you have to switch font encodings
manually. E.g. for an English/Russian/Polish/Greek text three encodings should
be used: latin (T1), cyrillic (T2A) and greek (LGR). Something like this:

\documentclass{article}
\usepackage[utf8x]{inputenc}
\usepackage[T2A,LGR,T1]{fontenc} % The last encoding is default
\newcommand\cyr[1]{\bgroup\fontencoding{T2A}\selectfont #1\egroup}
\newcommand\grk[1]{\bgroup\fontencoding{LGR}\selectfont #1\egroup}
\pagestyle{empty}
\begin{document}
$$ a=b\quad\mbox{if/\cyr{если}/jeśli/\grk{εἰ}}\quad c=d $$
\end{document}

It works, but it is quite ugly. And I have no idea how to deal with
right-to-left scripts and CJK.
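The manual switching could in principle be automated by a preprocessing step. The following Python sketch shows one way this might look, assuming the \cyr and \grk macros from the example are defined in the preamble. The function names `script_macro` and `wrap_scripts` are made up for this illustration, and nothing like this exists in texvc.

```python
import unicodedata
from itertools import groupby

# Maps a Unicode character-name prefix to the wrapper macro from the example.
SCRIPT_MACROS = {"CYRILLIC": "cyr", "GREEK": "grk"}


def script_macro(ch):
    """Return the wrapper macro for a character, or None for the default encoding."""
    name = unicodedata.name(ch, "")
    for prefix, macro in SCRIPT_MACROS.items():
        if name.startswith(prefix):
            return macro
    return None


def wrap_scripts(text):
    """Wrap each run of Cyrillic or Greek characters in its font-encoding macro."""
    parts = []
    for macro, run in groupby(text, key=script_macro):
        chunk = "".join(run)
        parts.append(f"\\{macro}{{{chunk}}}" if macro else chunk)
    return "".join(parts)


if __name__ == "__main__":
    print(wrap_scripts("if/если/jeśli"))
```

The sketch classifies characters by their Unicode names, so combining marks, spaces, and punctuation inside a run fall back to the default encoding and split the run. It also does nothing for right-to-left scripts or CJK.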

bzimport added a comment.Via ConduitFeb 10 2006, 6:21 PM

branko.kokanovic wrote:

adds additional custom preamble to TeX code through texvc arguments

There's a new variable that can be set to anything one wants
appended to the TeX preamble. Example:
$wgTeXPreambleAdditional="\usepackage[T2A]{fontenc}\nAnother line in preamble";

Attached: patch.diff

bzimport added a comment.Via ConduitMar 4 2006, 9:55 PM

valentin_st wrote:

Another example:
<math> C = BW \times \log_2 \left( 1+\frac{P_с}{P_ш} \right) </math>
http://bg.wikipedia.org/wiki/Беседа:Пропускателна_способност

brion added a comment.Via ConduitJul 9 2006, 4:44 AM
*** Bug 6596 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitSep 11 2006, 2:05 PM

h-j.luecking wrote:

I get this error when setting $wgUseTeX = true; in localsettings:

There was a syntax error in the database query. The last database query
was:

(SQL query hidden)

from within function "MathRenderer::_recall". MySQL returned error "1267:
Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and
(utf8_general_ci,COERCIBLE) for operation '=' (localhost)".

Mormegil added a comment.Via ConduitSep 11 2006, 6:56 PM

(In reply to comment #16)

This is not connected with this bug. The error is probably caused by wrong table definition (and the fact that you
use UTF-8 character set): The math_inputhash column in the math table should have explicit binary collation.

bzimport added a comment.Via ConduitDec 1 2006, 3:00 AM

jutiphan wrote:

We are having this problem in Thai Wikipedia. Thai characters do not work properly with math tags, and we need some help. Thanks to anyone who can shed light on this.

brion added a comment.Via ConduitDec 18 2006, 11:15 PM
*** Bug 8305 has been marked as a duplicate of this bug. ***
brion added a comment.Via ConduitDec 19 2006, 6:28 PM
*** Bug 8316 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitMay 15 2008, 11:22 PM

don-cles wrote:

I added these two lines to page http://eo.wikipedia.org/wiki/Kemia_ekvilibro

:<math>\mbox{rapido de antauxena reakcio} = k_+ {A}^\alpha{B}^\beta \,\!</math>
:<math>\mbox{rapido de inversa reakcio} = k_{-} {S}^\sigma{T}^\tau \,\!</math>

The second line works okay; the first fails, apparently because of the ux combination which it is supposed to convert to ŭ.
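To illustrate the suspected cause: if the wiki applies the Esperanto x-system to the formula source before texvc sees it, the first formula gains a non-ASCII character while the second stays pure ASCII. The Python sketch below is a simplified model of that conversion, written for this comment. The names `x_to_unicode` and `has_non_ascii` are hypothetical, and the real MediaWiki rules (including escapes such as a doubled x) are not reproduced here.

```python
import re

# Simplified x-system table: base letter followed by x becomes the accented letter.
X_SYSTEM = {
    "c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ",
    "C": "Ĉ", "G": "Ĝ", "H": "Ĥ", "J": "Ĵ", "S": "Ŝ", "U": "Ŭ",
}
X_PAIR = re.compile(r"([cghjsuCGHJSU])[xX]")


def x_to_unicode(text):
    """Replace x-system digraphs (cx, gx, hx, jx, sx, ux) with accented letters."""
    return X_PAIR.sub(lambda m: X_SYSTEM[m.group(1)], text)


def has_non_ascii(text):
    """Return True if the text contains any character outside ASCII."""
    return any(ord(ch) > 127 for ch in text)


if __name__ == "__main__":
    for label in ("rapido de antauxena reakcio", "rapido de inversa reakcio"):
        converted = x_to_unicode(label)
        print(converted, has_non_ascii(converted))
```

Under this model only the first label ends up containing ŭ, which matches the observation that only the first formula fails.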

bzimport added a comment.Via ConduitJul 24 2009, 10:57 AM

happy.melon.wiki wrote:

(In reply to comment #21)
> I added these two lines to page http://eo.wikipedia.org/wiki/Kemia_ekvilibro
>
> :<math>\mbox{rapido de antauxena reakcio} = k_+ {A}^\alpha{B}^\beta \,\!</math>
> :<math>\mbox{rapido de inversa reakcio} = k_{-} {S}^\sigma{T}^\tau \,\!</math>
>
> The second line works okay; the first fails, apparently because of the ux
> combination which it is supposed to convert to ŭ.

The page now seems to render the ŭ character correctly. The test cases in comment 2 also display correctly. Assuming FIXED.

grin added a comment.Via ConduitJul 24 2009, 6:37 PM

Fix confirmed.

bzimport added a comment.Via ConduitJul 24 2009, 8:43 PM

ragibhasan wrote:

Are you sure that this has been fixed? I just tried the following formula in :bn:, and it still shows a parse error:

:<math>\mbox{কখগ} = k_+ {A}^\alpha{B}^\beta \,\!</math>

The error message shows: পার্স করতে ব্যর্থ (PNG রূপান্তর ব্যর্থ; latex, dvips, gs, এবং convert ঠিকমত ইন্সটল হয়েছে কি না পরীক্ষা করুন): \mbox{কখগ} = k_+ {A}^\alpha{B}^\beta \,\!

The translation in English is: Failed to parse (Failed to convert to PNG; please check if latex, dvips, gs, and convert are installed correctly)

I also noticed that we cannot use Bengali numerals (in unicode UTF-8) inside latex formulas. That gives us the failure to parse error in bn.wikipedia.

grin added a comment.Via ConduitJul 25 2009, 6:31 AM

Well at least latin script unicode works (latin extended block), but see:

http://en.wikipedia.org/wiki/User:Grin/mathtest

Indeed apart from latin script it still fails.

bzimport added a comment.Via ConduitJul 25 2009, 5:50 PM

fibonacci.prower wrote:

Not even for Latin script. <math>í</math> gets me the following error:
Failed to parse (lexing error): í

It seems that it will only work if the non-ASCII text is inside an mbox.
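A quick way to test a formula against this observation is to strip the \mbox{...} groups and look at what is left. The Python sketch below does that under a simplifying assumption: \mbox arguments contain no nested braces. The function name `non_ascii_outside_mbox` is made up for this example and is not part of texvc or MediaWiki.

```python
import re

# Matches \mbox{...} with no nested braces inside the argument (an assumption
# of this sketch; nested groups would need a real parser).
MBOX = re.compile(r"\\mbox\s*\{[^{}]*\}")


def non_ascii_outside_mbox(formula):
    """Return the non-ASCII characters that appear outside any \\mbox{...} group."""
    stripped = MBOX.sub("", formula)
    return [ch for ch in stripped if ord(ch) > 127]


if __name__ == "__main__":
    for formula in ("í", r"\mbox{pá} -", r"\mbox{pa} + é"):
        print(repr(formula), non_ascii_outside_mbox(formula))
```

An empty result only means the formula passes this particular check. As the earlier comments show, text inside an mbox can still fail for scripts that the LaTeX setup does not cover.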

bzimport added a comment.Via ConduitDec 23 2011, 6:07 PM

sumanah wrote:

Branko, thank you for your patch. I am sorry it's been unreviewed for so long; I am 99% certain that it's been somewhat obsoleted since you wrote it. Is this bug still reproducible? If so, would you be interested in revisiting it?

He7d3r added a comment.Via ConduitDec 23 2011, 8:45 PM

(In reply to comment #27)

> Is this bug still reproducible?

Per [[meta:Help:Displaying_a_formula#Rendering]], \mbox{ð} and \mbox{þ} will give an error:

  • Failed to parse (PNG conversion failed; check for correct installation of latex and dvipng (or dvips + gs + convert)): \mbox {ð}
  • Failed to parse (PNG conversion failed; check for correct installation of latex and dvipng (or dvips + gs + convert)): \mbox {þ}

These error messages are still displayed on that metawiki page.

Matanya added a comment.Via ConduitJul 26 2012, 5:50 PM

With MathJax this can be set to resolved. The MathJax site has a compatibility list here: http://www.mathjax.org/resources/browser-compatibility/

so basically it is supported on all browsers and platforms.

He7d3r added a comment.Via ConduitJul 26 2012, 11:14 PM

The PNG conversion is still failing on WMF wikis (just checked on the documentation page mentioned on comment 28).

Besides, until MathJax is enabled by default (bug 36496), it can not be considered a fix to this bug which still happens on Wikipedia.

TheDJ added a comment.Via ConduitSep 7 2012, 5:01 PM

OK, so the difference between mbox and text seems to have disappeared at some point. The problem is now that not all character sets are supported.

Unfortunately LaTeX doesn't support full Unicode. Perhaps we should consider switching to XeTeX?
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=xetex
http://en.wikipedia.org/wiki/XeTeX

scfc added a comment.Via ConduitMay 20 2013, 2:45 AM

(In reply to comment #31)

> OK, so the difference between mbox and text seems to have disappeared at some
> point. The problem is now that not all character sets are supported.
>
> Unfortunately LaTeX doesn't support full Unicode. Perhaps we should consider
> switching to XeTeX?
> http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=xetex
> http://en.wikipedia.org/wiki/XeTeX

It doesn't make sense to address this before resolving bug 34038. It'd probably be relatively easy to convert some Unicode input into something LaTeX renders correctly, but then the initial incentive to use LaTeX as a format -- i.e. to freely transfer text between wiki and LaTeX documents -- gets lost completely.

I think the question should be asked the other way round, freed from any past efforts: If a British/French/Bengali/Arabic wiki author wants to enter a formula, which formats a) ease that work and b) are well established? If most authors will use a formula editor, the format choice can be guided mainly by technical considerations. If we expect most formulas to be entered manually by people unfamiliar with TeX, TeX would be an odd choice, as its behaviour can be as surprising as MediaWiki's wiki parser, and a format that would /define/ a formula instead of being /commands/ to a typesetter would clearly be preferable.

Even a change to use XeTeX should IMHO be reflected by the use of a new tag ("<math-xe>" or something similar), so that we don't cause more headaches than necessary.

brion added a comment.Via ConduitSep 30 2013, 10:24 AM
*** Bug 54778 has been marked as a duplicate of this bug. ***
bzimport added a comment.Via ConduitSep 30 2013, 10:41 AM

sodabottle wrote:

Hi Brion,

As a temporary fix for Tamil Wikipedia (Bug 54778), can MathJax be enabled as the default on Ta Wiki alone? I read bug 36496 and it says MathJax wasn't made the default on wiki projects because of slow loading times on low-end computers. I tested some math-heavy pages on some low-end machines (1 GB RAM, Win XP) and the time seems acceptable. Currently we face the choice between "fast page load with render as PNG but with errors" and "default MathJax".

If I can obtain community consensus, would it be possible to make MathJax the default for Ta Wiki?

bzimport added a comment.Via ConduitSep 30 2013, 4:25 PM

physik wrote:

I think we should wait until Math 2.0 is deployed. It applies the same filtering to the commands sent to MathJax as to those sent to LaTeX, which prevents the grammars from diverging.
Furthermore, Frederic Wang made major improvements to the MathJax loader that depend on MathJax 2.3. I would strongly recommend waiting until these changes are merged as well.

yuvipanda added a comment.Via ConduitSep 30 2013, 4:28 PM

@SodaBottle: Can you open up another bug + start the community discussion for it as well? Thanks!

Pkra added a comment.Via ConduitSep 30 2013, 4:39 PM

Just to throw it out there. If the math extension used MathJax on the backend, then it seems a lot of these problems would go away. MathJax's TeX-input is slightly more powerful than texvc, is designed for a web environment and would remove the need for sanitization.

bzimport added a comment.Via ConduitSep 30 2013, 11:37 PM

physik wrote:

@Peter: I think we should not make the same mistake again (using a poorly defined subset of LaTeX extended by some customized macros) by adopting the new MathJax language instead of texvc.
I think changing the language of math input that is shared between all languages should be a common process.

Pkra added a comment.Via ConduitOct 1 2013, 5:18 AM

@Moritz I understand your concerns but would argue that MathJax consists of a well defined subset of TeX.

> I think changing the language of math input that is shared between all
> languages should be a common process.

I don't understand that part of your message :( Was something lost by an accidental edit?

bzimport added a comment.Via ConduitOct 1 2013, 5:54 AM

physik wrote:

> I think changing the language of math input that is shared between all
> languages should be a common process.

I mean natural languages. At the moment all wiki installations use the same texvc input language, including commands such as \sen.

I just googled for texvc discussion and found some parts:
http://meta.wikimedia.org/wiki/Texvc

Maybe this becomes off-topic... However, it supports my argument that there should be a discussion how the restricted set of input commands should look like. Especially it should not be determined by the technical limitations of the x-Rendering-Software.

Pkra added a comment.Via ConduitOct 1 2013, 5:20 PM

(In reply to comment #40)

> I just googled for texvc discussion and found some parts:
> http://meta.wikimedia.org/wiki/Texvc

Thanks. That's very interesting.

> Maybe this becomes off-topic...

Probably.

> However, it supports my argument that there
> should be a discussion how the restricted set of input commands should look
> like.

I agree with that but...

> Especially it should not be determined by the technical limitations of
> the x-Rendering-Software.

I find this too idealistic. In reality, there aren't many solutions for math on the web, and all of them have their own limitations and advantages in a MW setting. The critical question is: in what direction do MW and its community (in particular Wikipedia) want to take mathematical and scientific content? As suggested by WMF, I tried to start a discussion about this on wikitech-l but not much came out of it. So the answer seems to be: nobody cares.

Which is why I fully agree with you (but it makes me depressed).

Peter.

Mattflaschen added a comment.Via ConduitOct 1 2013, 6:22 PM

(In reply to comment #42)

> As suggested by WMF, I tried to start a discussion about this on wikitech-l but
> not much came out of it. So the answer seems to be: nobody cares.

Sorry to hijack this bug, but I'll just post once then hopefully it can move back to RFC and/or Wikitech.

I think some feedback from Wikitech (e.g. Flow and issues with certain languages) was helpful, but I agree the Wikitech thread basically finished.

For smaller stuff, the answer is Just Do It, and hash anything out in code review.

For bigger architectural things (what to store in the database [e.g. MathML, not TeX], or having a single way of validating TeX/ANTLR grammar) where you want an answer before coding, it's probably time for an RFC (https://www.mediawiki.org/wiki/Requests_for_comment). We talked about if/when to do this before, but now is probably a good time. Pick a single issue, unless of course one decision clearly implies others, in which case you should include the related ones.

Above are just examples based on past discussions; the RFC can be whatever you think is appropriate. An example past implemented RFC (though probably a bit simpler) is https://www.mediawiki.org/wiki/Requests_for_comment/Reduce_math_rendering_preferences

He7d3r awarded a token.Via WebNov 24 2014, 12:06 PM
