Page MenuHomePhabricator

PHP floor( log10() ) gives wrong results and need a replacement
Closed, ResolvedPublic

Description

Language::formatBitrate( 1000000000000000 ) gives '1,000Tbps' when it should be '1Pbps'.

The end result is 'languages/LanguageTest.php' test suite gives a failure:

  1. LanguageTest::testFormatBitrate with data set #5 (1000000000000000, '1Pbps', '1 petabit per second')

formatBitrate('1000000000000000'): 1 petabit per second
Failed asserting that two strings are equal.

  • Expected

+++ Actual
@@ @@
-'1Pbps'
+'1,000Tbps'

The root cause is using floor() on a float returned by log(). That might be rounded down to the wrong integer :-/

As an example on http://codepad.org/VBkuWR8m

print floor(log10(1000000000000000)) . "\n";
outputs 15, correct
print floor(log(1000000000000000,10)) . "\n";
outputs 14 (WRONG)

print floor(log10(1000)) . "\n";
outputs 3, correct
print floor(log( 1000, 10 )) . "\n";
outputs 2 (WRONG)

On a Mac with PHP:
PHP 5.3.6 with Suhosin-Patch (cli) (built: Sep 8 2011 19:34:00)
Copyright (c) 1997-2011 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2011 Zend Technologies

with Xdebug v2.1.2, Copyright (c) 2002-2011, by Derick Rethans

print floor(log10(1000000000000000)) . "\n";
outputs 14 (WRONG)
print floor(log(1000000000000000,10)) . "\n";
outputs 15, correct

print floor(log10(1000)) . "\n";
outputs 3, correct
print floor(log( 1000, 10 )) . "\n";
outputs 3, correct

r108284 used log( foo, 10) instead of log10() but that broke the continuous integration system which use Ubuntu :D


Version: 1.20.x
Severity: normal

Details

Reference
bz33571

Related Objects

StatusSubtypeAssignedTask
ResolvedNone
InvalidNone

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 12:01 AM
bzimport set Reference to bz33571.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #1)

That's not specific to Mac OS.

It seems to only do it on Mac OS... 5.3.8 on windows locally and 5.3.6 on ubuntu don't do it...

I see it in 5.3.8, 5.2.13, 4.4.9... (x86_64) probably depends of the arithmetic processor.
Is that windows/ubunto Intel or AMD?

Maybe we could always add an appropiate epsilon.

print floor(log(1000000000000000,10)+0.000000000000001);
15
print floor(log(1000000000000000,10)+0.0000000000000001);
14

(In reply to comment #3)

I see it in 5.3.8, 5.2.13, 4.4.9... (x86_64) probably depends of the arithmetic
processor.
Is that windows/ubunto Intel or AMD?

Maybe we could always add an appropiate epsilon.

print floor(log(1000000000000000,10)+0.000000000000001);
15
print floor(log(1000000000000000,10)+0.0000000000000001);
14

Mines Windows 7 x64 Intel

reedy@gallium:~$ uname -a
Linux gallium 2.6.32-35-server #78-Ubuntu SMP Tue Oct 11 16:26:12 UTC 2011 x86_64 GNU/Linux

reedy@gallium:~$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 30
model name : Intel(R) Xeon(R) CPU X3450 @ 2.67GHz

PHP log() and log10() are not reliable:

php -r 'printf("%.32f\n", log(pow(10,15),10) );'
15.00000000000000355271367880050093

php -r 'printf("%.32f\n", log10(pow(10,15)) );'
14.99999999999999822364316059974954

Both should give exactly 15.

php -r 'printf("%.32f\n", log(pow(10,15)-1,10) );'
15.00000000000000177635683940025046

That last one should give 14.99<something>

So I think it is safer to reimplements floor(log10()) by avoiding float type entirely.

test of the C functions

php -r 'printf("%.32f\n", log(pow(10,15)-1,10) );'
14.99999999999999822364316059974954

php -r 'printf("%.32f\n", log10(pow(10,15)) );'
15.00000000000000000000000000000000

php -r 'printf("%.32f\n", log(pow(10,15)-1,10) );'
14.99999999999999822364316059974954

It's more or less normal to have a few differences between machines in floating point behavior http://c-faq.com/fp/strangefp.html

log(x), log10(x) and floor(x) are simple wrappers to the C functions. However, log(x,y) is log(x) / log(10), which is where I am getting the difference.

I think we could instead of floor use round(x, 0, PHP_ROUND_HALF_DOWN);
i.e. round(log(1000000000000000,10), 0, PHP_ROUND_HALF_DOWN) it has a number of checks to get the appropiate value.

Attached:

Isn't it pretty normal for floating point numbers to not exactly match decimal representations?

Sounds like a comparison somewhere is assuming that numbers will be exact decimal matches when they actually aren't, and needs to be more liberal in its comparisons.

Fixed in r108363.

Worked around log10 by simply duplicating/tweaking the very similar code in the function immediately below, which does the same thing with 1024 instead of 1000 unit sizes, uses only simple division, and passes the test cases.

Might want to merge those two into a common function and pass it its common parameters.