Page MenuHomePhabricator

Intermittent PHPUnit test failures when running RevisionCollectionPermissionsTest and others on 12.04
Closed, ResolvedPublic

Description

22:42:22 There was 1 failure:
22:42:22
22:42:22 1) Flow\Tests\Collection\RevisionCollectionPermissionsTest::testPermissions with data set #4 (UTFlowConfirmed, 'history', array(array(false), array(false)))
22:42:22 User UTFlowConfirmed should not be allowed action history on revision new-post : new-post:false suppress-post:false : {"rev_id":"s4mvjk0sgmczqez5","rev_user_id":1,"rev_user_ip":null,"rev_user_wiki":"build838-unittest_","rev_parent_id":null,"rev_change_type":"new-post","rev_type":"post","rev_type_id":"s4mvjjz9wb5avvwx","rev_content":"test content","rev_content_url":null,"rev_flags":"html","rev_mod_state":"","rev_mod_user_id":null,"rev_mod_user_ip":null,"rev_mod_user_wiki":null,"rev_mod_timestamp":null,"rev_mod_reason":null,"rev_last_edit_id":null,"rev_edit_user_id":null,"rev_edit_user_ip":null,"rev_edit_user_wiki":null,"tree_parent_id":null,"tree_rev_descendant_id":"s4mvjjz9wb5avvwx","tree_rev_id":"s4mvjk0sgmczqez5","tree_orig_user_id":1,"tree_orig_user_ip":null,"tree_orig_user_wiki":"build838-unittest_"}
22:42:22 Failed asserting that true matches expected false.
22:42:22
22:42:22 /srv/ssd/jenkins-slave/workspace/mwext-Flow-testextension/src/extensions/Flow/tests/phpunit/Collection/RevisionCollectionPermissionsTest.php:179
22:42:22 /srv/ssd/jenkins-slave/workspace/mwext-Flow-testextension/src/tests/phpunit/MediaWikiTestCase.php:141


Erik told me this is an intermittent failure, and we're not sure the cause yet.


Version: unspecified
Severity: normal

Details

Reference
bz72367

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 3:57 AM
bzimport set Reference to bz72367.
bzimport added a subscriber: Unknown Object (MLST).

Also seen at https://integration.wikimedia.org/ci/job/mwext-Flow-testextension/965/console

01:51:47 1) Flow\Tests\Collection\PostCollectionTest::testGetFirstRevision
01:51:47 Failed asserting that false is true.
01:51:47
01:51:47 /srv/ssd/jenkins-slave/workspace/mwext-Flow-testextension/src/extensions/Flow/tests/phpunit/Collection/PostCollectionTest.php:90
01:51:47 /srv/ssd/jenkins-slave/workspace/mwext-Flow-testextension/src/tests/phpunit/MediaWikiTestCase.php:141
01:51:47


The exact failure point changes, but is always about the collection's in one way or another

Was able to reproduce by booting a 12.04 instance in labs and using the jenkins repositories. process is roughly(from a newly booted 12.04 labs instance) as root:

mkdir -p /srv/deployment/integration/
cd /srv/deployment/integration/
git clone https://gerrit.wikimedia.org/r/p/integration/jenkins slave-scripts
git clone https://gerrit.wikimedia.org/r/p/integration/phpunit
mkdir -p /srv/ssd/jenkins-slave/tmpfs/mwext-Flow-testextension
cd /srv/ssd/jenkins-slave/tmpfs/mwext-Flow-testextension
git clone https://gerrit.wikimedia.org/r/p/mediawiki/core src
cd src
git clone https://gerrit.wikimedia.org/r/p/mediawiki/vendor
cd extensions
git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/Flow
git clone https://gerrit.wikimedia.org/r/p/mediawiki/extensions/Mantle
cd /srv/ssd/jenkins-slave/tmpfs/mwext-Flow-testextension
export WORKSPACE=/srv/ssd/jenkins-slave/tmpsfs/mwext-Flow-testextension
/srv/deployment/integration/slave-scripts/bin/mw-install-sqlite.sh
/srv/deployment/integration/slave-scripts/bin/mw-apply-settings.sh
/srv/deployment/integration/slave-scripts/bin/mw-run-update-script.sh

Finally you can run the tests a few times, you could use this script:

/srv/deployment/integration/slave-scripts/bin/mw-run-phpunit-allexts.sh

But to replicate you only need to run the tests:

cd /srv/ssd/jenkins-slave/tmpfs/mwext-Flow-testextension/src/tests/phpunit
php phpunit.php \
    --with-phpunitdir /srv/deployment/integration/phpunit/vendor/phpunit/phpunit \
    --testsuite extensions \
    --filter Permission

Since its an intermittent failure you need to run it in a loop

for i in {1..50};do php phpunit.php ...;done

Tried same process on 14.04 and this problem doesn't occur, almost certainly this will all happen on 12.04 without the jenkins scripts to set things up, but i havn't tested as they are rather convenient.

Still not sure whats causing the failure, but reproduction should help

When AbstractCollection::getAllRevisions() is called it runs a find query with sorting options. On the failed tests these sorting options are not being respected and the revisions are in the wrong order.

Still not sure how to fix, getting closer but its bed time :)

Heres a narrower reproduction case:

<?php 

use Flow\Model\UUID;

include __DIR__ . '/../../maintenance/commandLine.inc';
include __DIR__ . '/includes/Model/UUID.php';

$dbw = wfGetDB( DB_MASTER );
$dbw->query( 'CREATE TEMP TABLE some_test ( workflow_id binary(11) not null, PRIMARY KEY (workflow_id))' );

$dbw->insert( 'some_test', array(
  array (
    'workflow_id' => UUID::hex2bin( UUID::alnum2hex( 's57ovj7y1hbificr' ) ),
  ),
  array (
    'workflow_id' => UUID::hex2bin( UUID::alnum2hex( 's57ovj88ycd1std7' ) ),
  ),
) );
$res = $dbw->select(
    'some_test',
    array( '*' ),
    array(),
    __METHOD__,
    array( 'LIMIT' => 100, 'ORDER BY' => 'workflow_id DESC' )
);
foreach ( $res as $row ) {
    $row = (array)$row;
    $result[UUID::hex2alnum(UUID::bin2hex($row['workflow_id']))] = $row;
}
echo implode( ',', array_keys( $result ) ), "\n";
$sorted = $result;
krsort( $sorted );
if ( array_keys( $result ) !== array_keys( $sorted ) ) {
    echo "\nWrong order, received ascending workflow ids\n";
}

On 12.04 with sqlite it returns this:

s57ovj7y1hbificr,s57ovj88ycd1std7

Wrong order, received ascending workflow ids

On a vagrant instance(14.04 w/mysql):

s57ovj88ycd1std7,s57ovj7y1hbificr

So the problem is that for some set of ID's, the sqlite in 12.04 fails to sort them properly

So it turns out this bug only happens when our UUID's contain a null byte. The full and complete value is returned from a query, but it gets sorted wrong.

There is a bugfix in DatabaseSqlite::addQuotes() that says sqlite truncates data with null bytes unless quoted properly. My best guess is that although we inserted the data correctly(thanks to that bugfix), the truncation was still occurring within sqlite's sorting algo.

gerritadmin wrote:

Change 170356 had a related patch set uploaded by EBernhardson:
Fix intermittent test failures on 12.04

https://gerrit.wikimedia.org/r/170356

gerritadmin wrote:

Change 170396 had a related patch set uploaded by EBernhardson:
Always decode Blob objects from Database::decodeBlob

https://gerrit.wikimedia.org/r/170396

gerritadmin wrote:

Change 170356 merged by jenkins-bot:
Fix intermittent test failures on 12.04

https://gerrit.wikimedia.org/r/170356

gerritadmin wrote:

Change 171999 had a related patch set uploaded by EBernhardson:
Doc: Document problem sorting inserted data

https://gerrit.wikimedia.org/r/171999

gerritadmin wrote:

Change 171999 merged by jenkins-bot:
Doc: Document problem sorting inserted data

https://gerrit.wikimedia.org/r/171999

EBernhardson claimed this task.
EBernhardson triaged this task as Medium priority.
EBernhardson set Security to None.

Change 170396 merged by jenkins-bot:
Always decode Blob objects from Database::addQuotes

https://gerrit.wikimedia.org/r/170396