Add support for non-Arabic number systems
Open, Public

Description

We support display of non-Arabic number systems. We should add support to parser functions so that "{{#expr: {{CURRENTYEAR}} + 10}}" (see example in Bug 31371) will work.

Ideally, wikitext like "{{#expr: ২ + ৩}}" would work as well.

https://bn.wikipedia.org/wiki/User:MarkAHershberger/sandbox


Version: unspecified
Severity: enhancement

bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz34193.
MarkAHershberger created this task. Via Legacy, Feb 3 2012, 5:55 PM
bzimport added a comment. Via Conduit, Feb 3 2012, 6:44 PM

psneog wrote:

*** This bug has been marked as a duplicate of bug 31371 ***

bzimport added a comment. Via Conduit, Feb 3 2012, 6:49 PM

psneog wrote:

Let's live with this bug for some time. I'll work on it eventually.

MarkAHershberger added a comment. Via Conduit, Feb 3 2012, 6:57 PM

This is not a duplicate of Bug 31371 -- it is a request to add functionality to MW that MW does not currently have. Bug 31371 was a request to disable numeric conversion on aswiki.

bzimport added a comment. Via Conduit, Feb 3 2012, 7:22 PM

psneog wrote:

Let's live with this bug for some time. I'll work on it eventually.

*** This bug has been marked as a duplicate of bug 31371 ***

bzimport added a comment. Via Conduit, Feb 4 2012, 2:14 AM

shijualex wrote:

As already mentioned by Mark A. Hershberger in Comment 3, this bug is not a duplicate of bug 31371.

And adding this functionality to MediaWiki is not only for the Assamese or Bengali languages. There are many other languages in this world that use non-Arabic numerals. In India itself, among the Wikipedias, at least Kannada, Gujarati, Oriya, Punjabi, and Marathi use their own native-script numerals. I am sure there are many other languages outside India which require similar support.

Prabhakar, please do not close bugs for the sake of pushing your POV. Mark has created this bug to add a major piece of functionality to MediaWiki, and that functionality is very much required.

jayantanth added a comment. Via Conduit, Feb 4 2012, 4:50 AM

(In reply to comment #0)

We support display of non-Arabic number systems. We should add support to
parser functions for so that "{{#expr: {{CURRENTYEAR}} + 10}}" (see example in
Bug 31371) will work.

Ideally, wikitext like "{{#expr: ২ + ৩}}" would work as well.

https://bn.wikipedia.org/wiki/User:MarkAHershberger/sandbox

For {{#expr: {{CURRENTYEAR}} + 10}}
we use {{#expr: {{#time:xnY}} + 10}} on the Bengali Wikipedia.

But {{#expr: ২ + ৩}} will not work, and neither will any other Unicode numerals used that way.

bzimport added a comment. Via Conduit, Feb 5 2012, 10:17 AM

wikichaipau wrote:

*** Bug 34174 has been marked as a duplicate of this bug. ***

bzimport added a comment. Via Conduit, Feb 5 2012, 12:10 PM

psneog wrote:

Okay, Shiju; I am sorry. However, this was not a POV comment. I speak and read both Assamese and Bengali well, am well versed in both cultures, and feel as if both are my mothers. That is the only reason my post read like a POV comment.

MarkAHershberger added a comment. Via Conduit, Feb 6 2012, 4:46 AM

(In reply to comment #6)

for this {{#expr: {{CURRENTYEAR}} + 10}}
we use {{#expr: {{#time:xnY}} + 10}} in bengali Wikipedia

This is helpful, thanks!

And for this {{#expr: ২ + ৩}} it will not work . even any unicode numeric
number will not work like that.

Since we only deal with base10 systems (AFAICT), this is just a small matter of programming. We have the en->bn mapping for numbers stored in a file:

https://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/messages/MessagesBn.php?revision=110165&view=markup

So changing "২ + ৩" into something that parserfunctions can understand ("2 + 3" in this case) is fairly simple. Here is an example of some code I just came up with:

$revert = array_flip($digitTransformTable);  # $revert now contains the
                                             # flipped numeral translation
                                             # from MessagesBn.php
$bn = "২ + ৩";
$bnTR = strtr($bn, $revert);
echo "$bnTR\n";
echo eval("return $bnTR;"), "\n";

This code prints the following:

2 + 3
5
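
The same digit-flipping idea can be tried outside PHP. What follows is a minimal Python sketch, not the attached patch: the Bengali digit table is written out by hand rather than read from MessagesBn.php, the names BN_TO_ASCII and eval_bn are made up for illustration, and a small whitelist evaluator stands in for eval() so that only plain arithmetic is accepted.

```python
import ast
import operator

# Bengali digits mapped to ASCII digits. This plays the role of the flipped
# $digitTransformTable in the PHP snippet (hand-written, illustrative table).
BN_TO_ASCII = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")

# Only these binary operators are accepted; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate a parsed expression made of numbers and + - * / only."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression")


def eval_bn(expr):
    """Translate Bengali digits to ASCII, then evaluate the arithmetic."""
    ascii_expr = expr.strip().translate(BN_TO_ASCII)
    return ascii_expr, _evaluate(ast.parse(ascii_expr, mode="eval"))


print(eval_bn("২ + ৩"))
```

Calling eval_bn("২ + ৩") returns the translated text "2 + 3" together with the value 5, matching the PHP output above.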

In fact, poking around a bit, most of this functionality is built into MW and just not used in ParserFunctions.

I'll attach a patch that fixes this for ParserFunctions {{#expr}}. After applying it on my local wiki, where $wgLanguageCode = "bn", I can create a page with the following:

{{#expr: ২ + ৩}} <br>
{{#expr: {{CURRENTYEAR}} + 10 + ৩}}<br>
{{CURRENTYEAR}}

And it will display:

৫
২০২৫
২০১২

I *think* this would solve a great deal of the problem.

MarkAHershberger added a comment. Via Conduit, Feb 6 2012, 4:51 AM

Created attachment 9958
Allow #expr to use non-arabic numerals

We're in the middle of a code slush and this needs review, but it is a start.

attachment ParserFunctions.diff ignored as obsolete

MarkAHershberger added a comment. Via Conduit, Feb 6 2012, 5:23 AM

I've put this on http://winkyfrown.com/wiki/ so you can try it out and let me know what you think.

bzimport added a comment. Via Conduit, Feb 6 2012, 11:43 AM

wikichaipau wrote:

(In reply to comment #11)

I've put this on http://winkyfrown.com/wiki/ so you can try it out and let me
know what you think.

That seems to work! But I want to be sure---you are seeking a solution for all languages and not only for Bengali, right?

MarkAHershberger added a comment. Via Conduit, Feb 6 2012, 11:10 PM

(In reply to comment #12)

That seems to work! But I want to be sure---you are seeking a solution for all
languages and not only for Bengali, right?

The code there will work for any language that MW has a numeral mapping for. This includes Arabic, several Indic languages and more.
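
To illustrate what "any language with a numeral mapping" means in practice, here is a rough Python sketch. It is an assumption-laden stand-in: instead of MediaWiki's per-language tables it relies on the Unicode character database, which assigns a decimal value to every positional digit character, and the function name to_ascii_digits is hypothetical.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace every Unicode decimal digit with its ASCII equivalent.

    Characters that are not decimal digits (operators, spaces, letters)
    are passed through unchanged.
    """
    out = []
    for ch in text:
        value = unicodedata.decimal(ch, None)  # None for non-digit characters
        out.append(str(value) if value is not None else ch)
    return "".join(out)


for sample in ("২ + ৩", "१ + १", "۱+۱", "2012 + 10"):
    print(sample, "->", to_ascii_digits(sample))
```

Unlike a per-language table, this accepts digits from any script on any wiki, which may or may not be the desired behaviour; the patch discussed here ties the conversion to the wiki's configured language.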

MarkAHershberger added a comment. Via Conduit, Feb 7 2012, 3:23 PM

Updated patch and testwiki with code for #time

MarkAHershberger added a comment. Via Conduit, Feb 7 2012, 3:24 PM

Created attachment 9964
Allow #expr and #time to use non-latin numerals

update based on test wiki use

Attached: ParserFunctions.diff

MarkAHershberger added a comment. Via Conduit, Feb 7 2012, 3:25 PM

Note: some uses of the test wiki depend on templates that aren't there yet. I'll add those later today.

MarkAHershberger added a comment. Via Conduit, Feb 7 2012, 11:31 PM

Added the templates. All that is needed now is a reverse conversion of month names.

jayantanth added a comment. Via Conduit, Feb 8 2012, 4:42 AM

Yes, non-Arabic numbers work fine, so now Bug 19412 needs to be fixed.

MarkAHershberger added a comment. Via Conduit, Feb 13 2012, 5:19 PM

*** Bug 34335 has been marked as a duplicate of this bug. ***

jayantanth added a comment. Via Conduit, Feb 14 2012, 10:32 AM

At http://winkyfrown.com/wiki/:

{{#time:Y F j|{{{1|{{CURRENTYEAR}}}}}-{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}}}} outputs ২০১২ ফেব্রুয়ারি ১৪ (today, 2012 February 14), which is OK.

But {{#time:F j|{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}} -2 days}} gives "Error: Invalid time", which is not OK.

If the first one works fine, why doesn't the second one?

Other errors are shown on the main page as well.

MarkAHershberger added a comment. Via Conduit, Feb 14 2012, 4:20 PM

(In reply to comment #20)

But {{#time:F j|{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}} -2
days}}=Error:Invalid time (not OK).

If First one is working fine, why next one is not working?

If you look at the docs at https://www.mediawiki.org/wiki/Help:Extension:ParserFunctions#.23time you'll see that this is expected behavior, and why.

MarkAHershberger added a comment. Via Conduit, Feb 14 2012, 9:18 PM

I've updated the code on my testwiki to parse month names .... w000! Now, safesubst

MarkAHershberger added a comment. Via Conduit, Feb 14 2012, 9:32 PM

Created attachment 10008
Month name parsing for core

Attached: month-names.diff

Yamaha5 added a comment. Via Conduit, Mar 1 2012, 12:27 PM

After updating to 1.19, this doesn't have any effect on fa.wiki! In particular, {{#expr: ۱+۱}} shows an error.

MarkAHershberger added a comment. Via Conduit, Mar 1 2012, 8:38 PM

(In reply to comment #26)

after updating to 1.19 it doesn't have any effect on fa.wiki ! specially
{{#expre ۱+۱}} shows error

Yes, sorry. It was too late to make it into 1.19.

Amire80 added a comment. Via Conduit, Mar 24 2012, 8:34 AM

The patch looks good to me, in the sense that it doesn't

Amire80 added a comment. Via Conduit, Mar 24 2012, 9:17 AM

[The previous comment was submitted by mistake in a very funny way.]

The patch looks good to me, in the sense that it doesn't seem to break anything major. I also tested in Devanagari on my local wiki and it worked. A test page in the live Hindi Wikipedia: https://hi.wikipedia.org/wiki/User:Amire80/native_numbers .

However:

  1. (Probably) Small problem: It always runs parseFormattedNumber, even when that is not needed. Maybe it should run it only on wikis that use such numbers.
  2. (Somewhat) Larger problem: This is correct if the requirement is to always present the result in the native numbers and if it is considered OK to mix native and non-native numbers. This may be fine, but is there a specification somewhere or is it just a random decision?

Bawolff added a comment. Via Conduit, Mar 24 2012, 11:06 PM

Note this is a dupe of bug 30318 which is marked wontfixed. I personally think this will break stuff (The expr part. I have no opinion on the date stuff). It will cause expressions to be interpreted totally differently depending on language. Do we really want {{#expr: 10.1+1}} to be either 101 or 11.1 depending on wiki language? It would prevent people from copying templates from different languages.

(Note there is a work around of doing {{#expr: {{FORMATNUM:{{CURRENTYEAR}}|R}} + 10}} for the example use case presented in comment 0)
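
The separator concern can be made concrete with a small Python sketch. This is not MediaWiki's parseFormattedNumber; parse_formatted is a hypothetical helper that assumes each language has exactly one grouping separator and one decimal separator, with English using "," and "." and Dutch using "." and ",".

```python
def parse_formatted(text, group_sep, decimal_sep):
    """Parse a locale-formatted number literal into a float.

    Grouping separators are dropped first, then the locale's decimal
    separator is replaced by the "." that float() expects.
    """
    plain = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(plain)


# The same literal read under two conventions.
print(parse_formatted("10.1", group_sep=",", decimal_sep="."))     # English: 10.1
print(parse_formatted("10.1", group_sep=".", decimal_sep=","))     # Dutch: 101.0
print(parse_formatted("1.234,5", group_sep=".", decimal_sep=","))  # Dutch: 1234.5
```

Under these assumptions a constant such as 10.1 copied from an English-language template silently changes value on a wiki whose language treats "." as a grouping separator, which is the portability problem described above.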

Bawolff added a comment. Via Conduit, Mar 24 2012, 11:09 PM

*** Bug 32807 has been marked as a duplicate of this bug. ***

MarkAHershberger added a comment. Via Conduit, Mar 25 2012, 12:03 AM

(In reply to comment #30)

Do we really want {{#expr: 10.1+1}} to be either 101 or 11.1
depending on wiki language? It would prevent people from copying templates from
different languages.

I don't understand how it would prevent copying templates, unless the templates deal with formatting numbers -- and this change would make formatting numbers clearly dependent on the language of the wiki.

I think this is most clearly useful for wikis that target people in India (http://en.wikipedia.org/wiki/Indian_numbering_system). Since that is an area the WMF is targeting, I think that should be considered.

We target language users by providing wikis in their language so that they're comfortable using them, but, then, when it comes to a basic part of their interaction with the wiki -- numbers and dates -- we require that they adapt themselves to Western conventions.

Sophisticated users are probably fine with the current situation, but from the brief look I've had at discussions on hiwiki and bnwiki, there are a significant number of people there who would like something they feel more comfortable with.

(In reply to comment #29)

  1. (Somewhat) Larger problem: This is correct if the requirement is to always present the result in the native numbers and if it is considered OK to mix native and non-native numbers. This may be fine, but is there a specification somewhere or is it just a random decision?

Agreed, there should be broader discussion about this.

Bawolff added a comment. Via Conduit, Mar 25 2012, 1:36 AM

I don't understand how it would prevent copying templates, unless the templates
deal with formatting numbers -- and this change would make formatting numbers
clearly dependent on the language of the wiki.

Well all decimal numbers are "formatted". Some templates need constants in them.

Example: Put this patch on your test wiki. Set wiki language to nl, import [[template:Precision]] from en.wikipedia, watch it break (It gives wrong answers for non-formatted, and gives errors for formatted).

We target language users by providing wikis in their language so that they're
comfortable using them, but, then, when it comes to a basic part of their
interaction with the wiki -- numbers and dates -- we require that they adapt
themselves to Western conventions.

I wouldn't call using {{#expr}} a basic part of wiki editing. It's very easy to create templates using #expr that read formatted numbers so that the average user doesn't have to deal with it. Yes, it would be nice if it all magically worked, but I'm worried this introduces further problems. (I suppose one could call {{Formatnum:...}} on every constant in a template, and hence this would just shift responsibility for who has to call formatnum.)

Bennylin added a comment. Via Conduit, Mar 25 2012, 7:56 AM

Would the scope of this bug also include [[Chinese numerals]]?

Bawolff added a comment. Via Conduit, Mar 25 2012, 4:59 PM

(In reply to comment #34)

Would the scope of this bug also include [[Chinese numerals]]?

At the moment the Chinese language files are set to use plain old 0123456789-type numerals. I believe this bug is more about supporting whatever the default formatted number output for a language is, which does not include [[Chinese numerals]].

jayantanth added a comment. Via Conduit, Mar 25 2012, 5:58 PM

Ping: Bug 19412 must be fixed along with this bug.

MarkAHershberger added a comment. Via Conduit, Mar 25 2012, 7:06 PM

(In reply to comment #34)

Would the scope of this bug also include [[Chinese numerals]]?

This would be a start to supporting [[Chinese numerals]], but AFAICT,
[[Indian numerals]] are more straight-forward.

That is, I don't think the scope of this bug would cover [[Chinese numerals]], but a bug to support [[Chinese numerals]] would imply that the more straight-forward [[Indian numerals]] are already supported.

I say this because the representation for 12,345,678,902,345 from
[[Chinese numerals]] is not a one-to-one mapping. Instead it is
十二兆三千四百五十六億七千八百九十萬兩千三百四十五 where in Devanagari it would be
१२३४५६७८९०२३४५
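
The difference can be checked with a short Python sketch. It is only an illustration and assumes that Unicode's decimal-digit property is a fair proxy for "one-to-one mapping"; the helper names positional_digits and is_positional are hypothetical.

```python
import unicodedata


def positional_digits(text):
    """Convert a string of positional decimal digits to ASCII digits.

    Raises ValueError if any character is not a decimal digit, which is
    the case for the multiplier characters used in Chinese numerals.
    """
    return "".join(str(unicodedata.decimal(ch)) for ch in text)


def is_positional(text):
    """Return True if every character maps one-to-one onto 0-9."""
    try:
        positional_digits(text)
        return True
    except ValueError:
        return False


print(positional_digits("१२३४५६७८९०२३४५"))
print(is_positional("十二兆三千四百五十六億七千八百九十萬兩千三百四十五"))
```

The Devanagari string converts digit by digit, while the Chinese form is rejected because characters such as 十 act as multipliers rather than digits and would need a real parser instead of a lookup table.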

Siddhartha-Ghai added a comment. Via Conduit, Mar 25 2012, 7:47 PM

(In reply to comment #29)

[The previous comment was submitted by mistake in a very funny way.]

The patch looks good to me, in the sense that it doesn't seem to break anything
major. I also tested in Devanagari on my local wiki and it worked. A test page
in the live Hindi Wikipedia:
https://hi.wikipedia.org/wiki/User:Amire80/native_numbers .

However:

  1. (Probably) Small problem: It always runs parseFormattedNumber, even when that is not needed. Maybe it should run it only on wikis that use such numbers.
  2. (Somewhat) Larger problem: This is correct if the requirement is to always present the result in the native numbers and if it is considered OK to mix native and non-native numbers. This may be fine, but is there a specification somewhere or is it just a random decision?

Note that the hi-wp subpage is set up on the assumption that hi-wp uses Devanagari numerals by default. However, hi-wp has its default numerals set to Arabic numerals, so the test is only half correct. A subpage on hi-wikt for the same test is [[:w:hi:wikt:User:Siddhartha Ghai/native numbers]]. You'll find that using Arabic numerals gives Arabic numerals, while using Devanagari numerals or mixed Arabic-Devanagari numerals gives errors.

Siddhartha-Ghai added a comment. Via Conduit, Mar 25 2012, 7:51 PM

Note: hi-wikt uses Devanagari numerals by default.

MarkAHershberger added a comment. Via Conduit, Mar 25 2012, 8:43 PM

(In reply to comment #39)

Note:hi-wikt uses devanagari numerals as default.

But Devanagari numerals don't work with parser functions, which is (part of) what this bug is about.

That is {{#expr: १ + १}} returns an error, though the output of
{{#expr: 1 + 1}} would be 2.

I've set up a test on
http://hi.wiktionary.org/wiki/User:MarkAHershberger/bug34193

Siddhartha-Ghai added a comment. Via Conduit, Mar 27 2012, 3:44 AM

(In reply to comment #40)

But devanagari numerals don't work with parser functions which (part of) what
this bug is about.

I know. Just wanted to clarify that a test on hi-wp does not provide the correct picture (which can be seen at hi-wikt).

Just to add, it's really important from a user perspective to be able to use parser functions with non-Arabic numerals.

SPQRobin added a comment. Via Conduit, Dec 19 2012, 8:03 PM

This doesn't seem like a tracking bug, so removing bug 2007 as depending on this one (and updating title and removing "tracking" keyword).

MarkAHershberger added a comment. Via Conduit, Aug 28 2013, 1:21 PM

See https://en.wikipedia.org/wiki/User_talk:MarkAHershberger#bn:Template:Convert where [[User:Johnuniq]] indicates that he is doing Lua work on this bug. Maybe making this superfluous?

Certainly, it seems on-wiki control is better (in some sense) than this patch.

Aklapper edited projects, added MW-1.20-release; removed MW-extension-1.20-version. Via Web, Dec 19 2014, 8:19 PM
