
Scap mwscript rebuildLocalisationCache.php fails
Closed, Resolved · Public

Description

The error message is "CentralIdLookup::resetCache may only be called during bootstrapping unit tests".

The past few beta-scap-eqiad builds have failed with this message: https://integration.wikimedia.org/ci/job/beta-scap-eqiad/97768/console

Complete stack trace: P2884

Event Timeline

thcipriani renamed this task from "beta-scap-eqiad scap fails with CentralIdLookup::resetCache may only be called during bootstrapping unit tests" to "Scap mwscript rebuildLocalisationCache.php fails". Apr 11 2016, 7:11 PM
thcipriani updated the task description.

Change 282749 had a related patch set uploaded (by Thcipriani):
Fix RebuildLocalisationCache bug from MediaWikiServices

https://gerrit.wikimedia.org/r/282749

Change 282749 merged by jenkins-bot:
Fix RebuildLocalisationCache bug from MediaWikiServices

https://gerrit.wikimedia.org/r/282749

thcipriani assigned this task to bd808.

Sorry for the oversight regarding MW_SERVICE_BOOTSTRAP_COMPLETE.

I don't quite understand the assertion that "forked children do not need separate service connections". My understanding is that forked children should *never* share a network connection (socket) with the parent process (or would need to implement locking for access to the socket). That means that after forking, the network connection, and any connections to other services like memcached, need to be closed and re-created. If we don't do this, the protocol on these connections may get completely garbled. We may often get lucky and not have problems, but things would just blow up randomly.

It seems to me that after forking, we should indeed establish "separate service connections". Am I missing something here?

I re-submitted the offending patch and will work on it to fix the issue with MW_SERVICE_BOOTSTRAP_COMPLETE.

See https://gerrit.wikimedia.org/r/#/c/283462/1

@daniel my statements about rebuildLocalisationCache.php may be limited to the WMF production cluster usage. In the WMF environment we only read from disk and write to CDB so there is no shared connection to be garbled. If you were targeting a different backend then I guess you may need to create a separate database connection or something. It is not entirely clear to me how the environment of the parent propagates to the forked child in PHP with respect to resource handles and other objects that are in scope in the parent prior to forking. There probably was a time that I understood this, but pcntl_fork() is little enough used and dark enough magic that I haven't held on to the details.
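On the question of how open resource handles propagate across a fork: at the operating-system level, fork() gives the child copies of the parent's file descriptors, and both copies refer to the same underlying open socket. The following is a minimal sketch of that behaviour in Python rather than PHP (os.fork() is used here as a stand-in for pcntl_fork(); the function name demo_shared_socket and the use of a Unix datagram socketpair are assumptions made purely for illustration, not anything from the scap or MediaWiki code).

```python
import os
import socket


def demo_shared_socket():
    """Fork while holding an open socket; return what the peer receives."""
    # A socketpair stands in for "a connection to some service".
    # Datagram mode keeps each send() as one distinct message.
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

    pid = os.fork()
    if pid == 0:
        # Child: `client` is a copy of the parent's descriptor and refers
        # to the very same open socket, not to a new connection.
        client.send(b"child")
        os._exit(0)  # leave immediately, skipping the parent's cleanup

    # Parent: wait for the child, then write on the same connection.
    os.waitpid(pid, 0)
    client.send(b"parent")

    # The peer sees both messages arrive on the one connection.
    received = [server.recv(64), server.recv(64)]
    client.close()
    server.close()
    return received
```

Here the parent waits for the child before writing, so the two messages arrive in a predictable order. Without that kind of coordination, both processes could write to the same stream concurrently, which is the garbling scenario described above.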

@bd808 Ok. But doing a service reset isn't going to do any harm either. The children will just never use any of these services, so they never get re-created. So the issue was just my over-zealous check against MW_SERVICE_BOOTSTRAP_COMPLETE. This is fixed now in the new version of the patch which I linked above.
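The reasoning that a reset is harmless when the children never touch the services can be sketched with a lazily instantiating container. This is a hypothetical illustration in Python, not MediaWikiServices' actual API: the class LazyServiceContainer, its methods, and the "db" and "cache" service names are all invented for the example.

```python
class LazyServiceContainer:
    """Hypothetical container that builds services only on first use."""

    def __init__(self, factories):
        self._factories = dict(factories)  # name -> zero-argument factory
        self._instances = {}               # services that currently exist
        self.created = []                  # log of every instantiation

    def get(self, name):
        # Instantiate on demand; reuse the instance on later calls.
        if name not in self._instances:
            self._instances[name] = self._factories[name]()
            self.created.append(name)
        return self._instances[name]

    def reset(self):
        # Drop all instances. Nothing is rebuilt until get() is called.
        self._instances.clear()

    def live_services(self):
        return sorted(self._instances)


container = LazyServiceContainer({"db": dict, "cache": list})
container.get("db")   # parent uses a service before forking
container.reset()     # what a forked child would do on startup
```

After the reset, live_services() is empty and stays empty unless something calls get() again, so a child that never asks for a service never pays for re-creating it.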