Page MenuHomePhabricator

Floats are badly interpreted in SQL when locale is not English
Closed, DeclinedPublic

Description

When:

  • a float is used in a SQL function (e.g. Database::insert()) AND
  • wgShellLocale is not en_US.UTF-8 or C.UTF-8 but a locale with a comma as decimal separator (e.g. fr_FR.UTF-8)

Then the value is truncated to its integer part.


To reproduce it:
It happens that page_propos has a column with the type FLOAT, hence I use it for this example.
Use a test wiki for this example, it changes the database in a way MediaWiki could misfunction after.

  1. Set $wgShellLocale = 'fr_FR.UTF-8'; in LocalSettings.php
  2. Launch maintenance/eval.php with the commands:
$row = [ 'pp_page' => 1, 'pp_propname' => 'test', 'pp_value' => 'test', 'pp_sortkey' => 0.37 ];
$dbw = wfGetDB( DB_MASTER );
$dbw->insert( 'page_props', $row, __METHOD__ );
  1. In the database see that the value for pp_sortkey is 0.
  2. Delete the added line.
  3. Set $wgShellLocale = 'C.UTF-8'; in LocalSettings.php
  4. Re-do step 2
  5. In the database see that the value for pp_sortkey is 0.37.

I wrongly reported it as a PHP bug #69348, but real_escape_string() really wants a string as input parameter, hence a silent conversion float -> string is done with the current locale. An explicit conversion should be done using a fixed locale (C.UTF-8 I guess) or without locale if possible.

Event Timeline

Searching on the Internet, a lot of people have this issue, either casting float to string (like here) or string to float. The best solution I found (the most generic one) is:

$localeinfo = localeconv();
$floatString = str_replace( $localeinfo['mon_decimal_point'], '.', $floatFloat );

Source: https://secure.php.net/manual/fr/function.floatval.php#92563

According to https://secure.php.net/manual/en/language.types.string.php the decimal point is used and there is no mention to other locale-aware parts (e.g. thousand separator), hence the code above is probably the most generic one.

Another solution is number_format where the separators can be specified, but you have to specify the precision, and I’m not sure the precision can be safely determined. It could be used the maximum precision (but perhaps dependent on the plateform) and then the zeros on the right removed, but I’m not sure it is safe and it’s probably slow.

238482n375 added a project: acl*security.
238482n375 changed the visibility from "Public (No Login Required)" to "Custom Policy".
238482n375 subscribed.
This comment was removed by Krinkle.
Aklapper changed the visibility from "Custom Policy" to "Public (No Login Required)".
Aklapper removed a subscriber: 238482n375.
Krinkle closed this task as Declined.EditedDec 10 2019, 6:58 PM

As of MediaWiki 1.30, it is no longer supported to set wgShellLocale to something other than C.UTF-8 or equivalent with 8 character set and language-agnostic behaviour.

For localisation of the wiki, this setting does not need to be altered. MediaWiki has its own support systems for that. Including for example as used on Wikipedia. E.g. all language editions of Wikipedia as hosted by Wikimedia Foundation use $wgShellLocale = 'C.UTF-8';.