Sorted wikitables do not properly handle minus signs
Closed, Resolved (Public)

Assigned To: None
Priority: Normal
Author: bzimport
Subscribers: wikibugs-l
Projects: MediaWiki-Parser
Reference: bz21946
Description

Author: ozob1337

Description:
Sorted wikitables of type currency do not recognize minus signs. In wikibits.js, ts_currencyToSortKey could be changed from

return ts_parseFloat(s.replace(/[^0-9.,]/g,''));

to

return ts_parseFloat(s.replace(/[^-0-9.,]/g,''));

This would only partly fix the problem, because the pattern recognizes the hyphen, -, but not the HTML minus sign, −. Columns of type numeric do not recognize minus signs, either. An example of the latter bug can be viewed at:

http://en.wikipedia.org/w/index.php?title=Wikipedia:Arbitration_Committee_Elections_December_2009&diff=prev&oldid=332579916 (Broken sort using minus signs)
http://en.wikipedia.org/w/index.php?title=Wikipedia:Arbitration_Committee_Elections_December_2009&diff=next&oldid=332579916 (Working sort using hyphens)
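
For reference, the effect of the proposed change to ts_currencyToSortKey can be checked in isolation. The sketch below is not the wikibits.js code: the function names stripCurrencyOriginal and stripCurrencyProposed are hypothetical, and only the two regular expressions are taken from the report. It shows that the new pattern keeps a hyphen but still strips U+2212.

```javascript
// Hypothetical wrappers; only the two regular expressions come from the report.
function stripCurrencyOriginal(s) {
  // Removes everything except digits, dots and commas (the sign is lost).
  return s.replace(/[^0-9.,]/g, '');
}

function stripCurrencyProposed(s) {
  // Same, but a hyphen placed first in the class is kept as a literal.
  return s.replace(/[^-0-9.,]/g, '');
}

const withHyphen = '$-1,234.50';
const withMinus = '$\u22121,234.50'; // U+2212 MINUS SIGN

console.log(stripCurrencyOriginal(withHyphen)); // "1,234.50"  (sign lost)
console.log(stripCurrencyProposed(withHyphen)); // "-1,234.50" (sign kept)
console.log(stripCurrencyProposed(withMinus));  // "1,234.50"  (U+2212 still stripped)
```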

Sorting on minus signs in columns of type numeric could be fixed by going to ts_parseFloat and changing

num = parseFloat(s.replace(/,/g, ""));

to

num = parseFloat(s.replace(/,/g, "").replace(/−/g, "-").replace(/&(?:minus|#x0*2212|#0*8722);/gi, "-"));

which would convert HTML minus signs to hyphens before attempting to parse the number; but this would not handle minus signs in currency values, because they would be removed by ts_currencyToSortKey before ts_parseFloat is called.
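
A minimal sketch of that substitution, applied to the string before parseFloat sees it, might look like the following. The name parseNumberWithMinus is hypothetical and stands in for ts_parseFloat; the fallback to 0 for non-numeric input is an assumption about the existing behaviour. The last call illustrates the ordering problem for currency values: once the currency preprocessor has stripped U+2212, there is nothing left to convert.

```javascript
// Hypothetical stand-in for ts_parseFloat; not the actual wikibits.js code.
function parseNumberWithMinus(s) {
  const normalized = s
    .replace(/,/g, '')                               // drop thousands separators
    .replace(/\u2212/g, '-')                         // literal U+2212 MINUS SIGN
    .replace(/&(?:minus|#x0*2212|#0*8722);/gi, '-'); // entity references for it
  const num = parseFloat(normalized);
  return isNaN(num) ? 0 : num;                       // assumed fallback for non-numerics
}

console.log(parseNumberWithMinus('\u22121,234')); // -1234
console.log(parseNumberWithMinus('&minus;5'));    // -5
console.log(parseNumberWithMinus('&#x2212;5'));   // -5

// Currency case: the sign is removed before the conversion can run.
const strippedFirst = '\u20AC\u22125'.replace(/[^-0-9.,]/g, '');
console.log(parseNumberWithMinus(strippedFirst)); // 5
```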

A more comprehensive solution to this is to substitute characters for entity references in ts_resortTable before the preprocessor is called (or maybe even before the preprocessor is chosen). To fix the bugs with minus signs it would suffice to convert minus sign references as above, but it may be desirable to convert all entity references.
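
One way to picture this ordering is sketched below, handling only the minus sign references. All names here (normalizeMinusSigns, currencyKey, numericKey, sortCells) are hypothetical and do not correspond to the real ts_resortTable internals; the point is only that the normalization runs once, before whichever key function is chosen.

```javascript
// Hypothetical sketch: normalize first, then hand the text to any preprocessor.
function normalizeMinusSigns(text) {
  return text
    .replace(/&(?:minus|#x0*2212|#0*8722);/gi, '-') // entity references
    .replace(/\u2212/g, '-');                       // literal U+2212
}

// Illustrative key functions, loosely modelled on the snippets in the report.
function numericKey(s) {
  const n = parseFloat(s.replace(/,/g, ''));
  return isNaN(n) ? 0 : n;
}

function currencyKey(s) {
  return numericKey(s.replace(/[^-0-9.,]/g, ''));
}

function sortCells(cells, keyFn) {
  return cells
    .map(text => ({ text, key: keyFn(normalizeMinusSigns(text)) }))
    .sort((a, b) => a.key - b.key)
    .map(entry => entry.text);
}

console.log(sortCells(['$3', '$\u22122', '$-10', '&minus;1'], currencyKey));
```

Because the conversion happens before the currency preprocessor strips characters, both column types see a plain hyphen.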


Version: unspecified
Severity: minor

bzimport added a project: MediaWiki-Parser. Via Conduit, Nov 21 2014, 10:47 PM
bzimport added a subscriber: wikibugs-l.
bzimport set Reference to bz21946.
bzimport created this task. Via Legacy, Dec 24 2009, 8:47 PM
bzimport added a comment. Via Conduit, Dec 24 2009, 10:23 PM

conrad.irwin wrote:

patch against 60371

  • allows U+2212 (MINUS SIGN) in place of - in numbers (but not dates).
  • allows a space between the minus sign and the number.
  • allows a minus sign in a currency (before or after the initial currency marker)
  • sorts non numerics in number columns as -Infinity instead of 0 (I assume that all at one end was the intention)

Attached: PATCH.javascript_sort
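
The attachment itself is not reproduced in this thread. Purely as an illustration of the four behaviours listed, and not as the patch's actual code, sort keys along these lines would satisfy them. The names sketchNumberKey and sketchCurrencyKey are hypothetical.

```javascript
// NOT the attached patch: a hypothetical sketch of the listed behaviours.
function sketchNumberKey(text) {
  // Optional sign (hyphen, plus or U+2212), optional space, then the number.
  const m = /^\s*([+\-\u2212]?)\s*(\d[\d,]*(?:\.\d+)?|\.\d+)/.exec(text);
  if (!m) return -Infinity; // non-numerics all sort to one end
  const value = parseFloat(m[2].replace(/,/g, ''));
  return (m[1] === '-' || m[1] === '\u2212') ? -value : value;
}

function sketchCurrencyKey(text) {
  // Find where the number starts; a minus anywhere before it
  // (before or after the currency marker) makes the value negative.
  const start = text.search(/\.?\d/);
  if (start === -1) return -Infinity;
  const negative = /[-\u2212]/.test(text.slice(0, start));
  const value = parseFloat(text.slice(start).replace(/,/g, ''));
  return negative ? -value : value;
}

console.log(sketchNumberKey('\u2212 12'));      // -12
console.log(sketchCurrencyKey('\u2212$5'));     // -5
console.log(sketchCurrencyKey('$\u22125'));     // -5
console.log(sketchNumberKey('n/a'));            // -Infinity
```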

bzimport added a comment. Via Conduit, Dec 24 2009, 11:03 PM

ayg wrote:

Looks good. Committed as r60376, thanks.

bzimport added a comment. Via Conduit, Dec 27 2009, 6:47 AM

M8R-cyc3n3 wrote:

(In reply to comment #1)

Created an attachment (id=6901)
patch against 60371

  • allows U+2212 (MINUS SIGN) in place of - in numbers (but not dates).
  • allows a space between the minus sign and the number.
  • allows a minus sign in a currency (before or after the initial currency marker)
  • sorts non numerics in number columns as -Infinity instead of 0 (I assume that all at one end was the intention)

This patch uses [+-\u2212] which will match anything from U+002B PLUS SIGN to
U+2212 MINUS SIGN. Need to escape the hyphen as it is no longer adjacent to the
brackets. Best practice would be to include the backslash regardless. Escaping
the plus sign to avoid confusion would not hurt either, thus [\+\-\u2212].
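
The difference between the two character classes can be confirmed with a short check. The variable names below are illustrative only; the two patterns are the ones quoted in this comment, anchored so that each test covers exactly one character.

```javascript
// Unescaped hyphen between "+" and "\u2212" defines a range U+002B..U+2212.
const unescaped = /^[+-\u2212]$/;
// Escaped hyphen is a literal, so the class holds exactly three characters.
const escaped = /^[\+\-\u2212]$/;

console.log(unescaped.test('5'));      // true: U+0035 falls inside the range
console.log(unescaped.test('a'));      // true: U+0061 falls inside the range
console.log(escaped.test('5'));        // false
console.log(escaped.test('a'));        // false
console.log(escaped.test('+'));        // true
console.log(escaped.test('-'));        // true
console.log(escaped.test('\u2212'));   // true
```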

Tony1 added a comment. Via Conduit, Dec 27 2009, 9:00 AM

I tried adding U+2212 to a two-digit numeral in a table: doesn't work.

I must say I strongly support Ozob's filing of this bug; I do hope we can find a way to use the proper symbols in tables.

bzimport added a comment. Via Conduit, Dec 27 2009, 3:48 PM

ayg wrote:

(In reply to comment #3)

This patch uses [+-\u2212] which will match anything from U+002B PLUS SIGN to
U+2212 MINUS SIGN. Need to escape the hyphen as it is no longer adjacent to the
brackets. Best practice would be to include the backslash regardless. Escaping
the plus sign to avoid confusion would not hurt either, thus [\+\-\u2212].

Good catch. Regex is fun. Fixed in r60430.

(In reply to comment #4)

I tried adding U+2212 to a two-digit numeral in a table: doesn't work.

The fix has been committed to trunk. It isn't live on Wikipedia yet; that will happen who knows when. Note that there's currently no reliable way of telling what revision Wikipedia is at without poking through SVN logs. It's currently at r57447, I think, and has been since early October.

Tony1 added a comment. Via Conduit, Dec 27 2009, 3:51 PM

What, three thousand "revisions" behind? To a tech-moron like me, it sounds strange. But I believe you. Let's hope they do a thousand in a stroke. Thanks.

bzimport added a comment. Via Conduit, Dec 27 2009, 9:51 PM

happy.melon.wiki wrote:

Nope, that's standard practice, especially with the tech team being so short-staffed at this time. Scaps are usually several thousand revisions at a time.

bzimport added a comment. Via Conduit, Dec 27 2009, 10:02 PM

ayg wrote:

They didn't use to be. A year or two ago we had scaps every week or so. Hopefully we'll return to those halcyon days in the imminent future, but until then we are where we are.

bzimport added a comment. Via Conduit, Dec 28 2009, 4:46 PM

M8R-cyc3n3 wrote:

Note that there's currently no reliable way of telling what revision
Wikipedia is at without poking through SVN logs. It's currently at
r57447, I think, and has been since early October.

[[Special:Version]] says r59858. Is that not reliable?

bzimport added a comment. Via Conduit, Dec 28 2009, 4:48 PM

happy.melon.wiki wrote:

Nope. :-D

Tony1 added a comment. Via Conduit, Dec 28 2009, 4:52 PM

It should be tagged as such, then. I'm all for common WPs knowing just a little of the big picture, the basics, of the techie side.
