
Archive reindex gets stuck
Closed, Resolved · Public

Description

After reindexing a wiki's archive, the forceSearchIndex.php script gets stuck. For example, after running this command on terbium:

mwscript extensions/CirrusSearch/maintenance/forceSearchIndex.php --wiki arwiki --cluster eqiad --archive

the script reaches the end with:

[              arwiki] Archived 100 pages ending at 2014-12-11T00:14:02Z at 887/second
[              arwiki] Archived 83 pages ending at 2017-03-15T21:53:11Z at 887/second
Archived a total of 2186283 pages at 887/second

and then the script just sits there doing nothing, without exiting.

Details

Related Gerrit Patches:
mediawiki/extensions/CirrusSearch : master : Disable stats collection for maintenance scripts.
mediawiki/extensions/CirrusSearch : wmf/1.30.0-wmf.2 : Disable stats collection for maintenance scripts.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Smalyshev triaged this task as High priority. May 12 2017, 9:41 PM
EBernhardson added a comment. Edited May 12 2017, 10:00 PM

My first thought was that the hang has to come from the line right after the printing, $this->waitForQueueToDrain( $wiki );, except that it prints a status report before it starts waiting. Nothing else really happens after that: the execute() function finishes and the script should exit. Very odd.

Yeah, odd. I suspect it is related to the change we made to indexing, because it didn't happen before, but I can't figure out why it would happen.

Smalyshev added a comment. Edited May 24 2017, 7:21 PM

Collected this backtrace from the stuck indexing process:

bt
#0  array_reduce (Array
(
    [0] => "MediaWiki.CirrusSearch.connectionPool.initMs:0|ms"
    [1] = ...(omitted), "self::doReduce", Array
(
    [0] => "MediaWiki.CirrusSearch.connectionPool.initMs:0|ms\nMediaWiki ...(omitted))
    at /srv/mediawiki/php-1.30.0-wmf.1/vendor/liuggio/statsd-php-client/src/Liuggio/StatsdClient/StatsdClient.php:94
#1  Liuggio\StatsdClient\StatsdClient::reduceCount (Array
(
    [0] => "MediaWiki.CirrusSearch.connectionPool.initMs:0|ms"
    [1] = ...(omitted))
    at /srv/mediawiki/php-1.30.0-wmf.1/includes/libs/stats/SamplingStatsdClient.php:104
#2  SamplingStatsdClient::send (Array
(
    [0] => "MediaWiki.CirrusSearch.connectionPool.initMs:0|ms"
    [1] = ...(omitted))
    at /srv/mediawiki/php-1.30.0-wmf.1/includes/GlobalFunctions.php:1204
#3  wfLogProfilingData ()
    at /srv/mediawiki/php-1.30.0-wmf.1/maintenance/doMaintenance.php:122
#4  include ("/srv/mediawiki/php-1.30.0-wmf.1/maintenance/doMaintenance.php")
    at /srv/mediawiki/php-1.30.0-wmf.1/extensions/CirrusSearch/maintenance/forceSearchIndex.php:609
#5  include ("/srv/mediawiki/php-1.30.0-wmf.1/extensions/CirrusSearch/maintenance/forceSearch ...(omitted))
    at /srv/mediawiki/multiversion/MWScript.php:99

Looks like it has something to do with stats, which I'm not sure should be collected at all during archive indexing. Something there seems to take a huge amount of time. It happens only on big reindexes, not on e.g. testwiki.

Checking the arguments passed to StatsdClient, the data array there has 2092917 elements. Not sure what's going on, but this probably should not be happening.
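
The pattern the backtrace points at can be sketched roughly as follows: if a collector buffers one sample per operation and is only flushed at shutdown, a long maintenance run ends up holding millions of entries, and the final flush has to process all of them at once. The sketch below is an illustration only, not the MediaWiki or statsd-php-client code; `BufferingCollector`, `$maxBuffer`, the `$sink` callback and the metric key are hypothetical names. It shows one way to keep such a buffer bounded by flushing whenever it reaches a size cap.

```php
<?php
// Hypothetical sketch: a metrics buffer that is flushed whenever it reaches
// a size cap, so a long-running script never holds millions of samples.
final class BufferingCollector
{
    /** @var string[] Pending statsd-style lines, e.g. "some.key:0|ms" */
    private $buffer = [];

    /** @var int Maximum number of lines held before an automatic flush */
    private $maxBuffer;

    /** @var callable Receives the array of pending lines on each flush */
    private $sink;

    /** @var int Number of flushes performed so far */
    public $flushes = 0;

    public function __construct(callable $sink, $maxBuffer = 1000)
    {
        $this->sink = $sink;
        $this->maxBuffer = $maxBuffer;
    }

    // Record one timing sample and flush if the cap has been reached.
    public function timing($key, $ms)
    {
        $this->buffer[] = $key . ':' . $ms . '|ms';
        if (count($this->buffer) >= $this->maxBuffer) {
            $this->flush();
        }
    }

    // Hand all pending lines to the sink and empty the buffer.
    public function flush()
    {
        if ($this->buffer === []) {
            return;
        }
        call_user_func($this->sink, $this->buffer);
        $this->buffer = [];
        $this->flushes++;
    }

    public function pending()
    {
        return count($this->buffer);
    }
}

// Usage: 2500 samples with a cap of 1000 lines per flush.
$sent = 0;
$collector = new BufferingCollector(function (array $lines) use (&$sent) {
    $sent += count($lines); // stand-in for actually sending the lines
}, 1000);

for ($i = 0; $i < 2500; $i++) {
    $collector->timing('example.connectionPool.initMs', 0);
}
$collector->flush(); // send whatever is left at the end of the run
```

With a cap like this, the work done at shutdown is limited to the last partial batch, whatever the length of the run.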

Change 355485 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/CirrusSearch@master] Disable stats collection for maintenance scripts.

https://gerrit.wikimedia.org/r/355485

Change 355485 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Disable stats collection for maintenance scripts.

https://gerrit.wikimedia.org/r/355485

Change 355494 had a related patch set uploaded (by Smalyshev; owner: Smalyshev):
[mediawiki/extensions/CirrusSearch@wmf/1.30.0-wmf.2] Disable stats collection for maintenance scripts.

https://gerrit.wikimedia.org/r/355494

Change 355494 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@wmf/1.30.0-wmf.2] Disable stats collection for maintenance scripts.

https://gerrit.wikimedia.org/r/355494

Unfortunately, the approach in the patch does not work:

PHP Fatal error:  Call to undefined method NullStatsdDataFactory::getBuffer() in /srv/mediawiki/php-1.30.0-wmf.2/includes/GlobalFunctions.php on line 1203
Fatal error: Call to undefined method NullStatsdDataFactory::getBuffer() in /srv/mediawiki/php-1.30.0-wmf.2/includes/GlobalFunctions.php on line 1203

Will need to find a better way.
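
The fatal error shows a caller invoking getBuffer() on an object whose class does not define it. One common way to make such a call site tolerate a null implementation is to check for the method before calling it. The sketch below uses hypothetical stand-in classes (`ExampleStatsFactory`, `ExampleBufferingFactory`, `ExampleNullFactory`) and a hypothetical helper (`examplePendingStats`), not the real MediaWiki classes, and it is not necessarily the fix that was eventually applied.

```php
<?php
// Hypothetical interface: note that getBuffer() is NOT part of it,
// so implementations are free to omit it.
interface ExampleStatsFactory
{
    public function increment($key);
}

// Buffering implementation: exposes its pending lines via getBuffer().
final class ExampleBufferingFactory implements ExampleStatsFactory
{
    private $buffer = [];

    public function increment($key)
    {
        $this->buffer[] = $key . ':1|c';
    }

    public function getBuffer()
    {
        return $this->buffer;
    }
}

// Null implementation: discards everything and has no getBuffer().
final class ExampleNullFactory implements ExampleStatsFactory
{
    public function increment($key)
    {
        // Intentionally a no-op.
    }
}

// Return pending stats lines, or an empty array when the factory
// does not buffer anything (avoids "Call to undefined method").
function examplePendingStats(ExampleStatsFactory $stats)
{
    if (!method_exists($stats, 'getBuffer')) {
        return [];
    }
    return $stats->getBuffer();
}
```

An alternative design is to add the buffer accessor to the shared interface so that the null implementation must return an empty array, which removes the need for the runtime check.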

Smalyshev closed this task as Resolved. Jun 1 2017, 6:55 PM
Smalyshev claimed this task.