Fix completion suggester scoring method for very small wikis
Closed, Resolved (Public)

Description

The score is completely wrong for some wikis, most likely those that have 0 docs in the content index but a cross-namespace redirect in the general index.
In that case max_docs is presumably set to 0, which breaks the scoring formula and yields a weight outside the range Elasticsearch accepts.
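
As an illustration of the kind of guard that would avoid this, here is a minimal Python sketch. It is not the CirrusSearch code (which is PHP) and does not reproduce the popqual formula; `normalize` and `clamp_weight` are hypothetical helpers. The only value taken from this task is the allowed weight interval [0..2147483647] reported by Elasticsearch.

```python
import math

# Upper bound taken from the Elasticsearch error message in this task.
MAX_WEIGHT = 2147483647


def normalize(value: float, max_docs: int) -> float:
    """Hypothetical normalization of a raw value by max_docs into [0, 1].

    Assumption: the real formula divides (or log-scales) by max_docs in
    some way, so max_docs == 0 has to be handled explicitly.
    """
    if max_docs <= 0:
        # Empty content index: nothing to normalize against.
        return 0.0
    return min(max(value / max_docs, 0.0), 1.0)


def clamp_weight(weight: float) -> int:
    """Force a computed weight into the interval Elasticsearch accepts."""
    if math.isnan(weight):
        return 0
    return int(min(max(weight, 0), MAX_WEIGHT))


if __name__ == "__main__":
    print(normalize(1, 0))
    print(clamp_weight(-46116860184273880))
```

Handling max_docs == 0 at normalization time addresses the suspected cause, while clamping the final weight is a second line of defense so that a bad score degrades ranking for one document instead of failing the whole bulk request.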

Affected wikis:

  • bowiktionary_titlesuggest
  • rmwiktionary_titlesuggest

Logs from elastic:

[2016-06-03 13:31:51,492][DEBUG][action.bulk              ] [elastic2012] [bowiktionary_titlesuggest_1464960399][0] failed to execute bulk item (index) index {[bowiktionary_titlesuggest_1464960399][titlesuggest][t1954], source[{"batch_id":1464960407,"suggest":{"input":["Main Page"],"output":"1954:t:Main Page","weight":-46116860184273880},"suggest-stop":{"input":["Main Page"],"output":"1954:t:Main Page","weight":-46116860184273880}}]}
MapperParsingException[failed to parse]; nested: IllegalArgumentException[Weight must be in the interval [0..2147483647], but was [-46116860184273880]];

Logs from terbium:

Scanning available plugins...
	analysis-icu, experimental-highlighter, extra, swift-repository
Picking analyzer...default
Fetching Elasticsearch version...2.3.3...ok
2016-06-03 13:26:31 Deleting broken index bowiktionary_titlesuggest_1464960348
Inferring index identifier...bowiktionary_titlesuggest_1458272936
Refresh interval is not -1, cannot recycle.
Inferring index identifier...bowiktionary_titlesuggest_1458272936
Setting index identifier...bowiktionary_titlesuggest_1464960399
2016-06-03 13:26:41 Waiting for the index to go green...
	Index is red retrying...
	Green!
2016-06-03 13:26:47 Setting max_docs to 0
2016-06-03 13:26:47 Indexing 0 documents from content (0 in the index) with batchId: 1464960407 and scoring method: popqual
2016-06-03 13:26:47 Indexing from content index done.
2016-06-03 13:26:47 Indexing 1 documents from general (1 in the index) with batchId: 1464960407 and scoring method: popqual
	100% done...


2016-06-03 13:31:51 
Unexpected Elasticsearch failure.
Elasticsearch failed in an unexpected way.  This is always a bug in CirrusSearch.
Error type: Elastica\Exception\Bulk\ResponseException
Message: unknown: Error in one or more bulk request actions:

index: /bowiktionary_titlesuggest_1464960399/titlesuggest/t1954 caused failed to parse

Trace:
#0 /srv/mediawiki/php-1.28.0-wmf.4/vendor/ruflin/elastica/lib/Elastica/Bulk.php(360): Elastica\Bulk->_processResponse(Object(Elastica\Response))
#1 /srv/mediawiki/php-1.28.0-wmf.4/vendor/ruflin/elastica/lib/Elastica/Client.php(320): Elastica\Bulk->send()
#2 /srv/mediawiki/php-1.28.0-wmf.4/vendor/ruflin/elastica/lib/Elastica/Index.php(140): Elastica\Client->addDocuments(Array)
#3 /srv/mediawiki/php-1.28.0-wmf.4/vendor/ruflin/elastica/lib/Elastica/Type.php(199): Elastica\Index->addDocuments(Array)
#4 /srv/mediawiki/php-1.28.0-wmf.4/extensions/CirrusSearch/maintenance/updateSuggesterIndex.php(597): Elastica\Type->addDocuments(Array)
#5 /srv/mediawiki/php-1.28.0-wmf.4/extensions/Elastica/ElasticaConnection.php(300): CirrusSearch\Maintenance\UpdateSuggesterIndex->CirrusSearch\Maintenance\{closure}()
#6 /srv/mediawiki/php-1.28.0-wmf.4/extensions/CirrusSearch/maintenance/updateSuggesterIndex.php(599): MWElasticUtils::withRetry(5, Object(Closure))
#7 /srv/mediawiki/php-1.28.0-wmf.4/extensions/Elastica/ElasticaConnection.php(256): CirrusSearch\Maintenance\UpdateSuggesterIndex->CirrusSearch\Maintenance\{closure}(Array)
#8 /srv/mediawiki/php-1.28.0-wmf.4/extensions/CirrusSearch/maintenance/updateSuggesterIndex.php(600): MWElasticUtils::iterateOverScroll(Object(Elastica\Index), 'c2NhbjsxOzE2MjY...', '15m', Object(Closure), 0, 5)
#9 /srv/mediawiki/php-1.28.0-wmf.4/extensions/CirrusSearch/maintenance/updateSuggesterIndex.php(304): CirrusSearch\Maintenance\UpdateSuggesterIndex->indexData()
#10 /srv/mediawiki/php-1.28.0-wmf.4/extensions/CirrusSearch/maintenance/updateSuggesterIndex.php(232): CirrusSearch\Maintenance\UpdateSuggesterIndex->rebuild()
#11 /srv/mediawiki/php-1.28.0-wmf.4/maintenance/doMaintenance.php(103): CirrusSearch\Maintenance\UpdateSuggesterIndex->execute()
#12 /srv/mediawiki/php-1.28.0-wmf.4/extensions/CirrusSearch/maintenance/updateSuggesterIndex.php(808): require_once('/srv/mediawiki/...')
#13 /srv/mediawiki/multiversion/MWScript.php(97): require_once('/srv/mediawiki/...')
#14 {main}