
padright: and similar functions fail with non-ASCII arguments
Closed, Resolved (Public)

Description

Author: alon

Description:
When given a non-ASCII filler, padright: and its kin pad with the right number
of elements, but the element itself is incorrect (see
[[:meta:User:Taragui#Padright test]] for an example).

Handling of Unicode seems broken in any case: non-ASCII characters in the first
argument are counted '''according to their byte length''' (i.e., as 2 to 4
characters each) instead of as one character each, as they should be. This
breaks the fix for the unavailability of a <code>strlen</code> function
proposed at [[:meta:Talk:ParserFunctions#strlen & substr]].
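
The byte-versus-character discrepancy described above can be reproduced outside MediaWiki. The following is a minimal Python sketch, not the ParserFunctions code; the helper names `byte_length` and `char_length` and the sample strings are assumptions chosen for illustration.

```python
# Compare the UTF-8 byte length of a string with its code point count.
# The samples are illustrative and are not taken from the bug report.
samples = [
    "abc",           # ASCII only: 1 byte per character
    "\u00e9",        # LATIN SMALL LETTER E WITH ACUTE: 2 bytes
    "\u20ac",        # EURO SIGN: 3 bytes
    "\U0001d11e",    # MUSICAL SYMBOL G CLEF: 4 bytes
]


def byte_length(text: str) -> int:
    """Length as a byte-oriented strlen() would report it for UTF-8 input."""
    return len(text.encode("utf-8"))


def char_length(text: str) -> int:
    """Length in Unicode code points."""
    return len(text)


for s in samples:
    print(repr(s), "bytes:", byte_length(s), "characters:", char_length(s))
```

Every non-ASCII sample is a single character, yet a byte count reports 2, 3, and 4 respectively, which is the range quoted in the description.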


Version: unspecified
Severity: normal

Details

Reference
bz8604

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:33 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz8604.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

This doesn't break that fix, because that fix would probably have counted bytes too. Note that
Unicode characters can also take up less than one visible character position, e.g., combining or
zero-width characters (although admittedly those are rarer).
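
To illustrate the point about combining and zero-width characters, here is a rough Python sketch. The helper `visible_estimate` is hypothetical and only approximates visible width by skipping combining marks and format characters; it is not how MediaWiki measures anything.

```python
import unicodedata


def visible_estimate(text: str) -> int:
    """Rough count of code points that occupy a visible position.

    Hypothetical helper: skips combining marks (categories Mn, Me) and
    format characters (Cf), which covers the cases mentioned above but
    is not a full display-width algorithm.
    """
    count = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        count += 1
    return count


decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
with_zwsp = "a\u200bb"   # ZERO WIDTH SPACE between 'a' and 'b'

for s in (decomposed, with_zwsp):
    print(repr(s), "code points:", len(s), "visible (estimate):", visible_estimate(s))
```

So even a correct code point count can overstate the visible length: the decomposed "e" plus accent is two code points but one visible character.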

Language::pad uses strlen() whereas we should use mb_strlen()
(and code a fallback function if mbstring is not loaded)
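
The suggestion above could look roughly like the following. This is a Python sketch of the idea only, not the change made to Language::pad: `utf8_char_count` stands in for a hand-written fallback when no multibyte-aware length function is available, and `pad_right` is a hypothetical padding routine whose handling of multi-character fillers (repeat, then truncate) is an assumption.

```python
def utf8_char_count(data: bytes) -> int:
    """Count code points in valid UTF-8 without a multibyte library.

    Continuation bytes have the bit pattern 10xxxxxx, so counting every
    byte that is NOT a continuation byte gives the number of code points.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def pad_right(text: str, length: int, filler: str = "0") -> str:
    """Pad `text` on the right up to `length` characters (code points)."""
    if not filler:
        return text
    missing = length - utf8_char_count(text.encode("utf-8"))
    if missing <= 0:
        return text
    repeats = -(-missing // len(filler))  # ceiling division
    # Slice by code points so a multi-byte filler is never cut in half.
    return text + (filler * repeats)[:missing]


print(pad_right("\u00e9", 3, "\u00e4"))  # one accented letter padded to 3 characters
print(pad_right("ab", 5, "xy"))
```

Measuring and truncating in code points rather than bytes addresses both symptoms in the report: the text is not over-counted, and the filler is not split into invalid byte sequences.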

Fixed the pad functions in r37567. Were there any others?