
Wikibase QUnit failure blocks Content translation CI
Closed, Resolved (Public)

Description

Started from https://integration.wikimedia.org/ci/job/mwext-qunit/8665/console

05:55:30 Chromium 45.0.2454 (Ubuntu 0.0.0) wikibase.getLanguageNameByCode wikibase.getLanguageNameByCode() FAILED
05:55:30 	getLanguageNameByCode() returns language name.
05:55:30 	Expected: Deutsch
05:55:30 	Actual: German

https://gerrit.wikimedia.org/r/256895 and changes after that are affected

Event Timeline

santhosh set the priority of this task to Needs Triage.
santhosh updated the task description.
santhosh added subscribers: santhosh, KartikMistry.
santhosh set Security to None.

Looking at the mwext-qunit console log for change https://gerrit.wikimedia.org/r/#/c/256895/ , we do query Special:BlankPage for debugging purposes: curl --include http://localhost:9412/jenkins-mwext-qunit-8665/index.php/Special:BlankPage

That yields:

Notice: Cannot find site jenkins_u2_mw in sites table
   [Called from Wikibase\Client\WikibaseClient::newSiteGroup
    in extensions/Wikidata/extensions/Wikibase/client/includes/WikibaseClient.php at line 546]
   in includes/debug/MWDebug.php on line 300

jenkins_u2_mw is the $wgDBName forged by the job.

MediaWiki Debug and error logs are attached to the build report https://integration.wikimedia.org/ci/job/mwext-qunit/8665/ . The error log has a couple stack traces:

2015-12-04 05:54:21 integration-slave-trusty-1012 jenkins_u2_mw:
[eda5a40d] /jenkins-mwext-qunit-8665/index.php/Special:BlankPage
   ErrorException from line 1578 of skins/BlueSky/BlueSky.skin.php: PHP Warning: Invalid argument supplied for foreach()
#0 skins/BlueSky/BlueSky.skin.php(1578): MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 includes/skins/SkinTemplate.php(242): BlueSkyTemplate->execute()
#2 includes/OutputPage.php(2321): SkinTemplate->outputPage()
#3 includes/MediaWiki.php(690): OutputPage->output()
#4 includes/MediaWiki.php(474): MediaWiki->main()
#5 index.php(43): MediaWiki->run()
#6 {main}

2015-12-04 05:54:21 integration-slave-trusty-1012 jenkins_u2_mw:
[ac50f548] /jenkins-mwext-qunit-8665/index.php/Special:BlankPage
  ErrorException from line 593 of includes/skins/BaseTemplate.php: PHP Notice: Undefined index: copyright
#0 includes/skins/BaseTemplate.php(593): MWExceptionHandler::handleError(integer, string, string, integer, array)
#1 skins/BlueSky/BlueSky.skin.php(2138): BaseTemplate->getFooterIcons(string)
#2 includes/skins/SkinTemplate.php(242): BlueSkyTemplate->execute()
#3 includes/OutputPage.php(2321): SkinTemplate->outputPage()
#4 includes/MediaWiki.php(690): OutputPage->output()
#5 includes/MediaWiki.php(474): MediaWiki->main()
#6 index.php(43): MediaWiki->run()
#7 {main}

The BlueSky skin should really not be around. That is a bug in CI, which populates the skin and leaves it behind between unrelated jobs. Since MediaWiki autoloads skins, subsequent runs of the job on that slave end up having BlueSky loaded :(

I could be wrong, but these all appear to be notices and warnings, and I think they are probably not the problem. Of course, it would be nice not to have these issues during the tests.

I have cleaned the workspaces with:

salt --show-timeout '*' cmd.run 'rm -fR /mnt/jenkins-workspace/workspace/mwext-qunit/src/skins/*'

I retriggered a build ( https://integration.wikimedia.org/ci/job/mwext-qunit/8672/ ) and Special:BlankPage yields a warning potentially related to DELETE FROM `msg_resource`. The HTML output shows a warning box, and the server-side log has:

0.0158   1.8M  Start request GET /jenkins-mwext-qunit-8672/index.php/Special:BlankPage
HTTP HEADERS:
USER-AGENT: curl/7.35.0
HOST: localhost:9412
ACCEPT: */*
[caches] cluster: EmptyBagOStuff, WAN: mediawiki-main-default, stash: db-replicated, message: SqlBagOStuff, parser: SqlBagOStuff
[caches] LocalisationCache: using store LCStoreCDB
0.0229   2.2M  Fully initialised
0.0328   4.2M  Dependency triggered: /mnt/jenkins-workspace/workspace/mwext-qunit/src/skins/Example/i18n/en.json deleted.
0.0330   4.2M  LocalisationCache::isExpired(en): cache for en expired due to FileDependency
0.0371   4.0M  LocalisationCache::recache: got localisation for en from source
0.1855   5.8M  IP: 127.0.0.1
[connect] Connected to database 0 at 127.0.0.1:3306
[DBPerformance] Expectation (masterConns <= 0) by MediaWiki::main not met:
[connect to 127.0.0.1:3306 (jenkins_u3_mw)]
TransactionProfiler.php line 311 calls wfBacktrace()
TransactionProfiler.php line 146 calls TransactionProfiler->reportExpectationViolated()
LoadBalancer.php line 573 calls TransactionProfiler->recordConnection()
GlobalFunctions.php line 3564 calls LoadBalancer->getConnection()
MessageBlobStore.php line 259 calls wfGetDB()
LocalisationCache.php line 1032 calls MessageBlobStore->clear()
LocalisationCache.php line 463 calls LocalisationCache->recache()
LocalisationCache.php line 337 calls LocalisationCache->initLanguage()
LocalisationCache.php line 274 calls LocalisationCache->loadItem()
Language.php line 3344 calls LocalisationCache->getItem()
SpecialPageFactory.php line 274 calls Language->getSpecialPageAliases()
SpecialPageFactory.php line 334 calls SpecialPageFactory::getAliasList()
Title.php line 1092 calls SpecialPageFactory::resolveAlias()
MediaWiki.php line 182 calls Title->isSpecial()
MediaWiki.php line 682 calls MediaWiki->performRequest()
MediaWiki.php line 474 calls MediaWiki->main()
index.php line 43 calls MediaWiki->run()

[DBPerformance] Expectation (writes <= 0) by MediaWiki::main not met:
query-m: DELETE FROM `msg_resource` [TRX#760bec2d59fc]
TransactionProfiler.php line 311 calls wfBacktrace()
TransactionProfiler.php line 200 calls TransactionProfiler->reportExpectationViolated()
Database.php line 1006 calls TransactionProfiler->recordQueryCompletion()
Database.php line 2951 calls DatabaseBase->query()
MessageBlobStore.php line 260 calls DatabaseBase->delete()
LocalisationCache.php line 1032 calls MessageBlobStore->clear()
LocalisationCache.php line 463 calls LocalisationCache->recache()
LocalisationCache.php line 337 calls LocalisationCache->initLanguage()
LocalisationCache.php line 274 calls LocalisationCache->loadItem()
Language.php line 3344 calls LocalisationCache->getItem()
SpecialPageFactory.php line 274 calls Language->getSpecialPageAliases()
SpecialPageFactory.php line 334 calls SpecialPageFactory::getAliasList()
Title.php line 1092 calls SpecialPageFactory::resolveAlias()
MediaWiki.php line 182 calls Title->isSpecial()
MediaWiki.php line 682 calls MediaWiki->performRequest()
MediaWiki.php line 474 calls MediaWiki->main()
index.php line 43 calls MediaWiki->run()

[connect] Connected to database 0 at 127.0.0.1:3306
[SQLBagOStuff] Connection 101415 will be used for SqlBagOStuff
[MessageCache] MessageCache::load: Loading en... local cache is empty, global cache is expired/volatile, loading from database
0.2094   6.5M  Unstubbing $wgParser on call of $wgParser::firstCallInit from MessageCache->getParser
0.2101   6.5M  Parser: using preprocessor: Preprocessor_DOM
0.2137   6.8M  Unstubbing $wgLang on call of $wgLang::_unstub from ParserOptions->__construct
0.2228   7.0M  MediaWiki::preOutputCommit completed; all transactions committed
[Preprocessor] Cached preprocessor output (key: jenkins_u3_mw:preprocess-xml:9ce65e223f47a410f511ade31138f211:0)
[Preprocessor] Cached preprocessor output (key: jenkins_u3_mw:preprocess-xml:9ce65e223f47a410f511ade31138f211:0)
0.3103   8.0M  OutputPage::sendCacheControl: no caching **
0.3134   8.0M  LoadBalancer::reuseConnection: this connection was not opened as a foreign connection
0.3138   8.0M  Request ended normally

I reused a dummy change against ContentTranslation ( https://gerrit.wikimedia.org/r/#/c/248855/ ) to retry the build. The failed build is https://integration.wikimedia.org/ci/job/mwext-qunit/8673/ and has:

00:01:21.503 	getLanguageNameByCode() returns language name.
00:01:21.503 	Expected: Deutsch
00:01:21.504 	Actual: German

The debug log still shows BlueSky, grrr: https://integration.wikimedia.org/ci/job/mwext-qunit/8673/artifact/log/mw-debug-www.log/*view*/

So we have:

Dependency triggered: /mnt/jenkins-workspace/workspace/mwext-qunit/src/skins/BlueSky/i18n/en.json deleted.

This suggests the LocalisationCache (using store LCStoreCDB) is not properly cleared out at the end of, or before, the build, even though we do clear out the database via /srv/deployment/integration/slave-scripts/bin/mw-teardown-mysql.sh.
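To illustrate why a deleted en.json makes a cached localisation count as expired, here is a minimal Python sketch of a file-dependency check. It is not MediaWiki's LocalisationCache code; the function names `record_dependencies` and `is_expired` are hypothetical, and the only idea borrowed from the log is "cache expired because a file it was built from was deleted".

```python
import os
import tempfile


def record_dependencies(paths):
    """Snapshot the modification time of every source file the cache was built from."""
    return {path: os.path.getmtime(path) for path in paths}


def is_expired(dependencies):
    """Return (True, reason) if a recorded file was deleted or modified, else (False, "")."""
    for path, mtime in dependencies.items():
        if not os.path.exists(path):
            return True, f"{path} deleted"
        if os.path.getmtime(path) != mtime:
            return True, f"{path} changed"
    return False, ""


# Create a throwaway message file, snapshot it, then delete it,
# mimicking a skin that disappears from the workspace between builds.
workdir = tempfile.mkdtemp()
message_file = os.path.join(workdir, "en.json")
with open(message_file, "w", encoding="utf-8") as handle:
    handle.write("{}")

deps = record_dependencies([message_file])
fresh_before = is_expired(deps)   # file still present and unchanged
os.remove(message_file)
expired_after = is_expired(deps)  # file gone, so the cache is stale
print(fresh_before, expired_after)
```

A cache that outlives the build which wrote it keeps pointing at files from that older workspace, which would be consistent with the "Dependency triggered ... deleted" line showing up in a later, unrelated build.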

It seems the qunit job writes the localisation cache to /tmp, in files belonging to www-data:

/tmp/l10n_cache-ar.cdb
/tmp/l10n_cache-en.cdb
/tmp/l10n_cache-fr.cdb
/tmp/l10n_cache-hi.cdb
/tmp/l10n_cache-ml.cdb
/tmp/l10n_cache-nl.cdb
/tmp/l10n_cache-zh.cdb
/tmp/l10n_cache-zh-hans.cdb

Blocks CX code merge, so High priority!

With the l10n cache issue solved, I tried again on a dummy change for ContentTranslation: https://gerrit.wikimedia.org/r/#/c/248855/

The job mwext-qunit still fails. It has the following extensions:

cldr
ContentTranslation
Echo
Elastica
EventLogging
GeoData
GuidedTour
Scribunto
UniversalLanguageSelector
Wikidata

mediawiki-extensions-qunit is a job shared by some other extensions, but it does not include ContentTranslation. Commenting 'check experimental' on a ContentTranslation change injects the extension in addition to the others. The job passes, and in this context it had the following extensions:

AbuseFilter
Babel
Cards
CheckUser
Cite
cldr
ConfirmEdit
ContentTranslation
Echo
Elastica
EventLogging
Flow
Gather
GlobalCssJs
GuidedTour
JsonConfig
MobileApp
MobileFrontend
MwEmbedSupport
ParserFunctions
SandboxLink
SpamBlacklist
Thanks
TimedMediaHandler
UniversalLanguageSelector
VisualEditor
ZeroBanner
ZeroPortal

Note how Wikidata is NOT included.
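The comparison between the two jobs can be made mechanical with a set difference. The sketch below copies the two extension lists from this task as-is; the variable names are just for illustration.

```python
# Extensions loaded by the failing mwext-qunit job.
FAILING_JOB = {
    "cldr", "ContentTranslation", "Echo", "Elastica", "EventLogging",
    "GeoData", "GuidedTour", "Scribunto", "UniversalLanguageSelector",
    "Wikidata",
}

# Extensions loaded by the passing mediawiki-extensions-qunit job.
PASSING_JOB = {
    "AbuseFilter", "Babel", "Cards", "CheckUser", "Cite", "cldr",
    "ConfirmEdit", "ContentTranslation", "Echo", "Elastica", "EventLogging",
    "Flow", "Gather", "GlobalCssJs", "GuidedTour", "JsonConfig", "MobileApp",
    "MobileFrontend", "MwEmbedSupport", "ParserFunctions", "SandboxLink",
    "SpamBlacklist", "Thanks", "TimedMediaHandler",
    "UniversalLanguageSelector", "VisualEditor", "ZeroBanner", "ZeroPortal",
}

# Candidates for the failure: present in the failing job only.
only_in_failing = sorted(FAILING_JOB - PASSING_JOB)
print(only_in_failing)  # ['GeoData', 'Scribunto', 'Wikidata']
```

The difference narrows the suspects to three extensions. UniversalLanguageSelector is present in both jobs, so on its own it does not break the test.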

On IRC, @aude pointed at T117886: [Task] Don't fail when running job mediawiki-extensions-qunit with Wikidata

@adrianheine said that these are probably the same failures that happen in Wikidata related qunit tests when UniversalLanguageSelector is enabled.

For ContentTranslation, the mwext-qunit job has both ULS and Wikidata and that ends up causing the failure.

Unrelated issues were:

  • the BlueSky skin being left in the workspace (manually cleaned up)
  • the localisation cache being shared between all builds of any job on a given instance (fixed by setting $wgTmpDirectory properly)

Then the Deutsch vs German issue is due to a bad interaction between Wikidata and ULS. The Wikidata Jenkins jobs do not have ULS, but ContentTranslation has both, hence this bug.

The issue had already been identified in T92532.

@aude kindly made a monkey patch that works around the Wikidata / ULS incompatibility:

Make getLanguageNameByCode qunit test work with ULS + cldr enabled
https://gerrit.wikimedia.org/r/#/c/256936/2

A new Wikidata build (b781d360c199941d8fcfd0b0aa084de8982fba83) has been created, which incorporates the monkey patch.

A ContentTranslation patch https://gerrit.wikimedia.org/r/#/c/256895/ has successfully passed the qunit job.

So this task is complete. Polishing up the Wikidata/ULS hack will be done as part of T92532.

hashar added a project: Essential-Work.