Page MenuHomePhabricator

PHP Notice: Trying to get property 'num_rows' of non-object in /home/travis/build/wikimedia/mediawiki/includes/libs/rdbms/database/DatabaseMysqli.php on line 233 on PHP 7.2 travis builds
Closed, ResolvedPublicPRODUCTION ERROR

Description

https://travis-ci.org/wikimedia/mediawiki/jobs/415719055

PHP Notice:  Trying to get property 'num_rows' of non-object in /home/travis/build/wikimedia/mediawiki/includes/libs/rdbms/database/DatabaseMysqli.php on line 233
PHP Stack trace:
PHP   1. {main}() /home/travis/build/wikimedia/mediawiki/tests/phpunit/phpunit.php:0
PHP   2. require() /home/travis/build/wikimedia/mediawiki/tests/phpunit/phpunit.php:174
PHP   3. PHPUnitMaintClass->execute() /home/travis/build/wikimedia/mediawiki/maintenance/doMaintenance.php:94
PHP   4. PHPUnit\TextUI\Command::main() /home/travis/build/wikimedia/mediawiki/tests/phpunit/phpunit.php:135
PHP   5. PHPUnit\TextUI\Command->run() /home/travis/build/wikimedia/mediawiki/vendor/phpunit/phpunit/src/TextUI/Command.php:148
PHP   6. PHPUnit\TextUI\TestRunner->doRun() /home/travis/build/wikimedia/mediawiki/vendor/phpunit/phpunit/src/TextUI/Command.php:195
PHP   7. PHPUnit\Framework\TestSuite->run() /home/travis/build/wikimedia/mediawiki/vendor/phpunit/phpunit/src/TextUI/TestRunner.php:546
PHP   8. PHPUnit\Framework\TestSuite->run() /home/travis/build/wikimedia/mediawiki/vendor/phpunit/phpunit/src/Framework/TestSuite.php:755
PHP   9. PHPUnit\Framework\TestSuite->run() /home/travis/build/wikimedia/mediawiki/vendor/phpunit/phpunit/src/Framework/TestSuite.php:755
PHP  10. MediaWiki\Session\SessionInfoTest->run() /home/travis/build/wikimedia/mediawiki/vendor/phpunit/phpunit/src/Framework/TestSuite.php:755
PHP  11. MediaWiki\Session\SessionInfoTest->setUpSchema() /home/travis/build/wikimedia/mediawiki/tests/phpunit/MediaWikiTestCase.php:466
PHP  12. MediaWiki\Session\SessionInfoTest->undoSchemaOverrides() /home/travis/build/wikimedia/mediawiki/tests/phpunit/MediaWikiTestCase.php:1593
PHP  13. MediaWiki\Session\SessionInfoTest->recloneMockTables() /home/travis/build/wikimedia/mediawiki/tests/phpunit/MediaWikiTestCase.php:1579
PHP  14. CloneDatabase->cloneTableStructure() /home/travis/build/wikimedia/mediawiki/tests/phpunit/MediaWikiTestCase.php:1722
PHP  15. Wikimedia\Rdbms\DatabaseMysqli->duplicateTableStructure() /home/travis/build/wikimedia/mediawiki/includes/db/CloneDatabase.php:109
PHP  16. Wikimedia\Rdbms\DatabaseMysqli->query() /home/travis/build/wikimedia/mediawiki/includes/libs/rdbms/database/DatabaseMysqlBase.php:1454
PHP  17. Wikimedia\Rdbms\DatabaseMysqli->doProfiledQuery() /home/travis/build/wikimedia/mediawiki/includes/libs/rdbms/database/Database.php:1143
PHP  18. Wikimedia\Rdbms\DatabaseMysqli->numRows() /home/travis/build/wikimedia/mediawiki/includes/libs/rdbms/database/Database.php:1245
PHP  19. Wikimedia\Rdbms\DatabaseMysqli->mysqlNumRows() /home/travis/build/wikimedia/mediawiki/includes/libs/rdbms/database/DatabaseMysqlBase.php:370

I don't really understand how this notice is possible at all because this function is very clearly wrapped in suppressWarnings/restoreWarnings.

Event Timeline

It seems mysqli_result->num_rows is not set for CREATE statement (or other DDL)

No idea why suppression is not working

The new call of numRows is possible coming from https://gerrit.wikimedia.org/r/#/c/mediawiki/core/+/443405/

Change 452733 had a related patch set uploaded (by Aaron Schulz; owner: Aaron Schulz):
[mediawiki/core@master] Avoid numRows() warnings for mysqli after table creation

https://gerrit.wikimedia.org/r/452733

Change 452733 merged by jenkins-bot:
[mediawiki/core@master] rdbms: Avoid numRows() warnings for mysqli after table creation

https://gerrit.wikimedia.org/r/452733

RazeSoldier subscribed.

The PHP Notice still show in travis-CI and caused the log to exceed the maximum log length, causing the job process to be terminated.

No idea why suppression is not working

Maybe due to the return value of MWExceptionHandler::handleError() ignoring error reporting levels? The logging code does take them into account and does not write to the error log, but the return value determines whether PHP invokes the default error handler (which prints to stdout).

OTOH notices don't normally come with stack traces so that's probably not what's happening.

OTOH notices don't normally come with stack traces so that's probably not what's happening.

MediaWiki's own error handler creates traces, but as far as I know, we only print those to Monolog, not to standard out. So it's probably not ours indeed.

MediaWiki does configure PHPUnit to convert errors to exceptions. And when PHPUnit handles them, they do have stack traces. That might be what's happening here.

If that's the case, it still wouldn't explain why it only happens on PHP 7.2, though. Perhaps something changed in Zend that we have to update our error suppression/restoration logic for?

Krinkle triaged this task as High priority.EditedSep 23 2018, 4:39 AM

Actually, I'm more concerned with the error itself than the fact that it isn't being suppressed.

Given the ternary that was being introduced, this means we have a value being passed here that is "not an object" but also "thruthy". That seems bad, ver bad. In fact, it might mean that this code isn't working and just always returning zero.

In production I see a suspiciously high number of warnings from this code being logged. They are all in the suppressed "error-json" channel. But it's still bad, because that means we're always returning zero instead of the actual number of affected rows, which could be causing silently failures and corruption in many places.

includes/libs/rdbms
class DatabaseMysqlBase {
	/** @param ResultWrapper|resource */
	function numRows( $res ) {
		if ( $res instanceof ResultWrapper ) {
			$res = $res->result;
		}
		Wikimedia\suppressWarnings();
		$n = $res ? $this->mysqlNumRows( $res ) : 0;
		Wikimedia\restoreWarnings();
	}
}
class DatabaseMysqli {
	/** @param mysqli_result */
	function mysqlNumRows( $res ) {
		return $res->num_rows;
	}
}
Logstash error-json channel
{
 "type":"ErrorException",
 "message":"PHP Notice: Cannot access property on non-object",
 "code":0,
 "file":"/srv/mediawiki/php-1.32.0-wmf.22/includes/libs/rdbms/database/DatabaseMysqli.php",
 "line":233,

 "suppressed":true,

 "backtrace":[{
   "file":"/srv/mediawiki/php-1.32.0-wmf.22/includes/libs/rdbms/database/DatabaseMysqli.php","line":233,
   "class":"MWExceptionHandler", "type":"::", "function": "handleError",
   "args":["integer","string","string","integer","array","array"]
  },{
   "file":"/srv/mediawiki/php-1.32.0-wmf.22/includes/libs/rdbms/database/DatabaseMysqlBase.php","line":362,
   "class":"Wikimedia\\Rdbms\\DatabaseMysqli", "type":"->", "function":"mysqlNumRows",
   "args":["boolean"]
  },{
   "file":"/srv/mediawiki/php-1.32.0-wmf.22/includes/libs/rdbms/database/Database.php","line":1257,
    "function":"numRows","class":"Wikimedia\\Rdbms\\DatabaseMysqlBase",
    "type":"->","args":["boolean"]
  },{
   "file":"/srv/mediawiki/php-1.32.0-wmf.22/includes/libs/rdbms/database/Database.php","line":1155,
   "class":"Wikimedia\\Rdbms\\Database","type":"->","function":"doProfiledQuery",
   "args":["string","string","boolean","string"]
  },{
   "file":"/srv/mediawiki/php-1.32.0-wmf.22/includes/libs/rdbms/database/Database.php","line":3893,
   "class":"Wikimedia\\Rdbms\\Database","type":"->","function":"query",

This says Database->doProfiledQuery is calling DatabaseMysqlBase->numRows with a boolean, which then ends up calling DatabaseMysqli->mysqlNumRows().

Given the ternary, this must mean that the boolean value is boolean true.

The value that Database->doProfiledQuery passes to DatabaseMysqlBase->numRows comes from DatabaseMysqli->doQuery. That method is supposed to return mysqli_result or false. It is not expected to return true. Yet, the PHP docs disagree with ours:

Returns false on failure. For successful SELECT.... queries mysqli_query() will return a mysqli_result object. For other successful queries mysqli_query() will return true.

On the other hand, MediaWiki only calls numRows if the query was a read query. If it was a write query, it calls affectedRows instead of numRows.

includes/lib/rdbms
# Database::doProfiledQuery

	$isWrite ? $this->affectedRows() : $this->numRows( $ret )

So, we're looking for a query that is:

  • .. common on Special:Preferences in production,
  • .. common in our unit tests,
  • .. makes mysqli_query() return true (so not an obvious read query),
  • .. makes MediaWiki $isWrite set to false (so not an obvious write query),

And on top of that, this query is happening in a context that in production (on HHVM) emits a PHP Notice that is correctly suppressed, but with PHPUnit on PHP 7.2, is not being suppressed.

Thoughts?

I did a bit of debugging:

diff --git a/includes/libs/rdbms/database/Database.php b/includes/libs/rdbms/database/Database.php
index e276d09992..3a87c46d74 100644
--- a/includes/libs/rdbms/database/Database.php
+++ b/includes/libs/rdbms/database/Database.php
@@ -1250,6 +1250,17 @@ abstract class Database implements IDatabase, IMaintainableDatabase, LoggerAware
                        $this->rttEstimate = $queryRuntime;
                }
 
+               if ( !$isWrite ) {
+                       if ( $ret instanceof ResultWrapper ) {
+                               $realRes = $ret->result;
+                               if ( $realRes === true ) {
+                                       wfDebugLog( 'test', "non-write but \$ret->result is true!: $sql" );
+                               }
+                       } elseif ( $ret === true ) {
+                               wfDebugLog( 'test', "non-write but \$ret is true!: $sql" );
+                       }
+               }
+
                $this->trxProfiler->recordQueryCompletion(
                        $queryProf,
                        $startTime,

Based on the debug logs, my guess is that the issue is relating to temporary tables, in which case $isWrite will be false, since it's not a real write query, but in terms of affected rows / num rows, it *is* a write query.

T202806 does not involve any temporary tables, though.

T202806 does not involve any temporary tables, though.

In that case, the query triggering the error seems to be COMMIT.

Change 463577 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] rdbms: Document a bunch of stuff about query verbs

https://gerrit.wikimedia.org/r/463577

Change 463577 merged by jenkins-bot:
[mediawiki/core@master] rdbms: Document a bunch of stuff about query verbs

https://gerrit.wikimedia.org/r/463577

Change 463822 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/core@master] Don't run numRows on a "true" result

https://gerrit.wikimedia.org/r/463822

Change 463871 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/core@master] Database result type is mostly opaque to our classes

https://gerrit.wikimedia.org/r/463871

Change 465114 had a related patch set uploaded (by Tim Starling; owner: Tim Starling):
[mediawiki/core@master] Fix guarding of MySQL's numRows()

https://gerrit.wikimedia.org/r/465114

Change 465114 merged by jenkins-bot:
[mediawiki/core@master] Fix guarding of MySQL's numRows()

https://gerrit.wikimedia.org/r/465114

Change 465295 had a related patch set uploaded (by Krinkle; owner: Tim Starling):
[mediawiki/core@wmf/1.32.0-wmf.24] Fix guarding of MySQL's numRows()

https://gerrit.wikimedia.org/r/465295

Change 465295 merged by jenkins-bot:
[mediawiki/core@wmf/1.32.0-wmf.24] Fix guarding of MySQL's numRows()

https://gerrit.wikimedia.org/r/465295

Mentioned in SAL (#wikimedia-operations) [2018-10-09T00:04:00Z] <krinkle@deploy1001> Synchronized php-1.32.0-wmf.24/includes/libs/rdbms/database: T201900 - I8ae754a2518 (duration: 00m 59s)

Krinkle reassigned this task from Krinkle to tstarling.
Krinkle removed a project: Patch-For-Review.

Change 463871 abandoned by Aaron Schulz:
Database result type is mostly opaque to our classes

Reason:
Superseded by another patch

https://gerrit.wikimedia.org/r/463871

Change 463822 abandoned by Aaron Schulz:
Guard numRows, not appropriate for some queries

Reason:
Superseded by another patch

https://gerrit.wikimedia.org/r/463822

Change 463871 restored by Aaron Schulz:
Database result type is mostly opaque to our classes

Reason:
Gah, I meant the one in another browser tab

https://gerrit.wikimedia.org/r/463871

Change 463871 abandoned by Awight:
Database result type is mostly opaque to our classes

https://gerrit.wikimedia.org/r/463871

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:09 PM