
mediawiki-extensions-hhvm: MessageGroupStatesUpdaterJobTest::testHooks is failing intermittently
Closed, Resolved, Public

Description

The mediawiki-extensions-hhvm test suite is currently failing intermittently; rechecking a change usually makes it pass.

There was 1 failure:

1) MessageGroupStatesUpdaterJobTest::testHooks
in progress after first translation
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'inprogress'
+'ready'

/mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/extensions/Translate/tests/phpunit/MessageGroupStatesUpdaterJobTest.php:127
/mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/tests/phpunit/MediaWikiTestCase.php:141

I have found these occurrences so far in the timeline:

Both (or three, if you also count https://integration.wikimedia.org/ci/job/mediawiki-extensions-hhvm/2200/ ) happened on integration-slave1009.

Event Timeline

Se4598 raised the priority of this task from to Needs Triage.
Se4598 updated the task description.
Se4598 subscribed.

Now (up to build 2230) there are four: 2217, 2224, 2227 and 2230. The latter two also have a mw-error.log (something about an implicit transaction commit). All are still on integration-slave1009.

I have added Translate to the shared job on Feb 4th 7am UTC: https://gerrit.wikimedia.org/r/#/c/188518/ T86930

It seems either some test has a race condition or integration-slave1009 has some unique issue :-/ Maybe take Translate out of the shared job while this is being investigated? Reverting https://gerrit.wikimedia.org/r/#/c/188518/ , refreshing the two jobs and reloading zuul should be sufficient.

I'm going ahead and reverting 5a7beb00907671d35bdc71. This has been causing build failures in unrelated extensions for long enough.

Thanks to @Krinkle, Translate is no longer in the shared job ( https://gerrit.wikimedia.org/r/#/c/188735/ ).

Now we have to figure out what is going on with the Translate test that causes it to fail intermittently :-(

hashar set Security to None.

I have removed the HHVM project tag, but maybe it was happening on Zend PHP as well? It would be nice if anyone could confirm.

This? Cf. T42451

2015-02-04 16:48:35 integration-slave1009 build2230: [d25e942d] [no req]   ErrorException from line 300 of /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/debug/MWDebug.php: PHP Notice: DatabaseBase::begin: Transaction already in progress (from DatabaseUpdater::doUpdates),  performing implicit commit! [Called from DatabaseBase::begin in /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/db/Database.php at line 3555]
#0 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/debug/MWDebug.php(300): MWExceptionHandler::handleError()
#1 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/debug/MWDebug.php(155): MWDebug::sendMessage()
#2 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/GlobalFunctions.php(1157): MWDebug::warning()
#3 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/db/Database.php(3555): wfWarn()
#4 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/extensions/CheckUser/install.inc(40): DatabaseBase->begin()
#5 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/extensions/CheckUser/CheckUser.hooks.php(320): create_cu_changes()
#6 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/installer/DatabaseUpdater.php(443): CheckUserHooks::checkUserCreateTables()
#7 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/installer/DatabaseUpdater.php(408): DatabaseUpdater->runUpdates()
#8 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/maintenance/update.php(163): DatabaseUpdater->doUpdates()
#9 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/maintenance/doMaintenance.php(101): UpdateMediaWiki->execute()
#10 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/maintenance/update.php(207): include()
#11 {main}
2015-02-04 16:48:35 integration-slave1009 build2230: [2dee33d4] [no req]   ErrorException from line 300 of /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/debug/MWDebug.php: PHP Notice: DatabaseUpdater::doUpdates: No transaction to commit, something got out of sync! [Called from DatabaseBase::commit in /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/db/Database.php at line 3640]
#0 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/debug/MWDebug.php(300): MWExceptionHandler::handleError()
#1 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/debug/MWDebug.php(155): MWDebug::sendMessage()
#2 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/GlobalFunctions.php(1157): MWDebug::warning()
#3 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/db/Database.php(3640): wfWarn()
#4 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/includes/installer/DatabaseUpdater.php(423): DatabaseBase->commit()
#5 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/maintenance/update.php(163): DatabaseUpdater->doUpdates()
#6 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/maintenance/doMaintenance.php(101): UpdateMediaWiki->execute()
#7 /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/maintenance/update.php(207): include()
#8 {main}

all still on integration-slave1009
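
For readers unfamiliar with these two notices: they are what you would expect when code that is already inside a transaction calls begin() again on a connection without nested transactions. The following is only a toy model in Python, not MediaWiki's code. `ToyConnection`, `run_updates` and the message strings are made up for illustration, and it assumes the hook also commits the transaction it opened.

```python
class ToyConnection:
    """Toy stand-in for a DB handle without nested transactions (hypothetical)."""

    def __init__(self):
        self.in_transaction = False
        self.notices = []

    def begin(self, caller):
        if self.in_transaction:
            # Mirrors the first notice: the open transaction is committed implicitly.
            self.notices.append(f"{caller}: transaction already in progress, implicit commit")
        self.in_transaction = True

    def commit(self, caller):
        if not self.in_transaction:
            # Mirrors the second notice: nothing is left to commit.
            self.notices.append(f"{caller}: no transaction to commit")
            return
        self.in_transaction = False


def run_updates(db):
    db.begin("updater")          # outer transaction opened by the updater
    db.begin("extension hook")   # hook opens its own: outer one is committed implicitly
    db.commit("extension hook")  # assumed: the hook closes its own transaction
    db.commit("updater")         # outer commit now finds nothing open
    return db.notices
```

In this model one nested begin() produces both notices, which matches the pair seen in mw-error.log. Whether that has anything to do with the Translate failure is a separate question.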

https://wikitech.wikimedia.org/wiki/Nova_Resource:I-000006a4.eqiad.wmflabs says "building" (since October?). Slow disk?

The sqlite database is on a tmpfs (RAM disk), so it is fast enough. In any case, the test should take care of it.

I have no idea what the MessageGroupStatesUpdaterJobTest is doing but it seems some edit does not register properly :-(

I ran Translate phpunit tests locally in a loop for 10 minutes or so. This failure did not show up.
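
A loop like that can be scripted so the failure count is recorded. This is a minimal sketch in Python, not the script actually used here. `run_repeatedly` is a hypothetical helper, and the command passed to it is whatever runs the suite on your checkout.

```python
import subprocess
import sys
import time


def run_repeatedly(cmd, duration_s):
    """Run cmd over and over until duration_s has elapsed; return (runs, failures)."""
    deadline = time.monotonic() + duration_s
    runs = failures = 0
    # Always run at least once, then keep going until the deadline passes.
    while runs == 0 or time.monotonic() < deadline:
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        runs += 1
        if result.returncode != 0:  # a non-zero exit status counts as a failed run
            failures += 1
    return runs, failures


# Example (the command is an assumption, adjust it to your setup):
# runs, failures = run_repeatedly(["php", "tests/phpunit/phpunit.php", "--group", "Translate"], 600)
```

Zero failures in such a loop only says the test is stable in isolation. It says nothing about runs where the other extensions are loaded alongside it.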

@Nikerabbit that must be a bad interaction with one of the other extensions :-/ For the record, the job had the following extensions cloned together:

mediawiki/extensions/AbuseFilter
mediawiki/extensions/Babel
mediawiki/extensions/CheckUser
mediawiki/extensions/cldr
mediawiki/extensions/ConfirmEdit
mediawiki/extensions/Echo
mediawiki/extensions/EventLogging
mediawiki/extensions/Flow
mediawiki/extensions/JsonConfig
mediawiki/extensions/Mantle
mediawiki/extensions/MobileApp
mediawiki/extensions/MobileFrontend
mediawiki/extensions/SpamBlacklist
mediawiki/extensions/Thanks
mediawiki/extensions/Translate
mediawiki/extensions/UniversalLanguageSelector
mediawiki/extensions/VisualEditor
mediawiki/extensions/WikiGrok
mediawiki/extensions/ZeroBanner
mediawiki/extensions/ZeroPortal

The extensions are listed in a file named extensions_load.txt, which the CI autoloader uses to include() them, so the UnitTestList hook array should have the same order. That does not explain why it would fail only on integration-slave1009, though :-/
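
To make the ordering argument concrete, here is a small Python sketch, not the CI autoloader itself. It assumes extensions_load.txt holds one entry per line. `read_load_order` and `register_test_hooks` are hypothetical names, and the hook name is the one used in the comment above.

```python
def read_load_order(text):
    """Return the entries of extensions_load.txt-style content, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def register_test_hooks(extensions):
    """Register one handler per extension, in the order the extensions are loaded."""
    hooks = {"UnitTestList": []}
    for name in extensions:
        # Handlers are appended as each extension is included, so the hook
        # array ends up in the same order as the load file.
        hooks["UnitTestList"].append(name)
    return hooks
```

With the same file on every slave, the handler order (and so the order in which the extensions' tests are collected) would be identical everywhere. That is why load order alone does not account for a failure confined to one machine.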

The test was recently marked as @ broken ( https://gerrit.wikimedia.org/r/#/c/203455/ ) because it was failing too often. I was unable to figure out why it fails, even with some debug output added.

Nikerabbit lowered the priority of this task from High to Low. May 18 2015, 12:50 PM
hashar claimed this task.

For anyone reading this, the test is still marked as broken and not fixed.