Page MenuHomePhabricator

LuaSandbox's Data type round-tripping [tests/datatypes.phpt] fails with PHP 7.2
Closed, ResolvedPublic

Description

(sidenote, can we have a separate luasandbox project?)

Debian unstable recently transitioned to PHP 7.2. All tests are passing, except for datatypes:

datatypes.diff
015+ object with numeric keys: array ( '0' => 'foo', '1' => 'bar', )
015- object with numeric keys: array ( 0 => 'foo', 1 => 'bar', )
datatypes.log
---- EXPECTED OUTPUT
null:                     NULL
int:                      123
long:                     17179869184
double:                   3.125
NAN:                      NAN
INF:                      INF
true:                     true
false:                    false
string:                   'foobar'
empty string:             ''
string containing NULs:   'foo' . "\0" . 'bar'
array:                    array ( 0 => 'foo', 1 => 'bar', )
associative array:        array ( 0 => 'foo', 'bar' => 'baz', )
object:                   array ( 'bar' => 'baz', 'foo' => 1, )
object with numeric keys: array ( 0 => 'foo', 1 => 'bar', )
array with reference:     array ( 0 => 42, )
---- ACTUAL OUTPUT
null:                     NULL
int:                      123
long:                     17179869184
double:                   3.125
NAN:                      NAN
INF:                      INF
true:                     true
false:                    false
string:                   'foobar'
empty string:             ''
string containing NULs:   'foo' . "\0" . 'bar'
array:                    array ( 0 => 'foo', 1 => 'bar', )
associative array:        array ( 0 => 'foo', 'bar' => 'baz', )
object:                   array ( 'bar' => 'baz', 'foo' => 1, )
object with numeric keys: array ( '0' => 'foo', '1' => 'bar', )
array with reference:     array ( 0 => 42, )
---- FAILED

Details

Event Timeline

Legoktm renamed this task from Data type round-tripping [tests/datatypes.phpt] fails with PHP 7.2 to LuaSandbox's Data type round-tripping [tests/datatypes.phpt] fails with PHP 7.2.Jan 31 2018, 3:55 AM

Hmm, I wonder whether they did that on purpose or not.

It turns out the test is already half bogus, thanks to crazy PHP behavior when dealing with objects and numeric keys.

$o = (object)[ 0 => "array 0", 1 => "array 1" ];
$o->{0} = "object 0";
$o->{2} = "object 2";
var_dump( $o, $o->{0}, $o->{"0"}, $o->{1}, $o->{"2"} );
$a = (array)$o;
var_dump( $a, $a[0], $a["0"], $a[1], $a["2"] );

In PHP < 7.2, the results are not sensible:

PHP Notice:  Undefined property: stdClass::$1 in Command line code on line 4
object(stdClass)#1 (4) {
  [0]=>
  string(7) "array 0"
  [1]=>
  string(7) "array 1"
  ["0"]=>
  string(8) "object 0"
  ["2"]=>
  string(8) "object 2"
}
string(8) "object 0"
string(8) "object 0"
NULL
string(8) "object 2"
PHP Notice:  Undefined offset: 2 in Command line code on line 6
array(4) {
  [0]=>
  string(7) "array 0"
  [1]=>
  string(7) "array 1"
  ["0"]=>
  string(8) "object 0"
  ["2"]=>
  string(8) "object 2"
}
string(7) "array 0"
string(7) "array 0"
string(7) "array 1"
NULL

And, to make things even more confusing, a serialize-unserialize fixes up the object but not the array.

The change in behavior in 7.2 makes it work sanely:

object(stdClass)#1 (3) {
  ["0"]=>
  string(8) "object 0"
  ["1"]=>
  string(7) "array 1"
  ["2"]=>
  string(8) "object 2"
}
string(8) "object 0"
string(8) "object 0"
string(7) "array 1"
string(8) "object 2"
array(3) {
  [0]=>
  string(8) "object 0"
  [1]=>
  string(7) "array 1"
  [2]=>
  string(8) "object 2"
}
string(8) "object 0"
string(8) "object 0"
string(7) "array 1"
string(8) "object 2"

And in Lua we only have tables to represent both arrays and objects, but table[1] and table["1"] are both accessible and are distinct.

That leaves us with several problematic cases. I don't think any of these actually occur in Scribunto in any meaningful manner.

  • What should we do when converting an object with integer keys to a table?
  • What should we do when converting an object with string keys that represent nonnegative integers to a table?
  • What should we do if someone gives us one of those screwed-up arrays?
  • When Lua tries to return a table with string keys that represent nonnegative integers, what should the resulting array look like?
    • Current behavior: PHP just doesn't see them.
    • If we decide to convert them to integers so PHP can access them, what then if Lua returns a table with both the integer and the string representation as keys?

I see the following options:

  1. Declare all of these "undefined behavior", and just remove the test in question.
  2. Make all of these raise errors.
  3. Convert int-strings to integers both ways. Collisions PHP→Lua are undefined behavior, Lua→PHP raise an error.
  4. Remove the ability to pass objects (other than LuaSandboxFunction) to Lua entirely, callers have to convert them to arrays themselves. Pick one of the above for the Lua→PHP bullet.

#1 is the easiest. The others will probably need changes in Scribunto's LuaStandalone to match.

#4 would also solve another issue: PHP7 apparently considers classes with defined fields as a different kind of object from the plain objects LuaSandbox works with and so rejects them entirely, while PHP5 and HHVM will happily expose private variables in such objects to Lua.

It turns out Scribunto's LuaStandalone already does #4, so I think I'll go with that.

The only question remaining, then, is what to do about Lua→PHP with int-string keys. The current behavior:

  • luasandbox returns an array with string keys, which PHP can't access properly.
  • LuaStandalone on HHVM returns[1] an array with string keys, which HHVM can't access properly.
  • LuaStandalone on Zend PHP (both 5 and 7) returns[1] an array with integer keys, which it can access properly.

[1]: More specifically, LuaStandalone builds a string that PHP passes to unserialize(), with the int-string encoded as a string. Zend PHP's unserialize() converts it to an int, while HHVM's doesn't.

Change 407455 had a related patch set uploaded (by Anomie; owner: Anomie):
[mediawiki/php/luasandbox@master] Remove PHP→Lua object conversion

https://gerrit.wikimedia.org/r/407455

I filed T186240 for the parts about int-strings, so this bug can concentrate on the object conversion issue.

Change 407455 merged by jenkins-bot:
[mediawiki/php/luasandbox@master] Remove PHP→Lua object conversion

https://gerrit.wikimedia.org/r/407455