Page MenuHomePhabricator

Spike of fatal timeouts from API search suggestions (prefixsearch)
Closed, ResolvedPublicPRODUCTION ERROR

Description

Error (1)

Request ID: XBvK6gpAMEMAABshWO8AAADS
Request URL: enwiki /w/api.php?action=query&list=search&format=json& .. srsearch= .. &srlimit=5

trace
PHP Fatal Error: entire web request took longer than 60 seconds and timed out

#0 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Profile/SearchProfileService.php(345): NO_FUNCTION_GIVEN()
#1 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Profile/SearchProfileServiceFactory.php(178): CirrusSearch\Profile\SearchProfileService->registerConfigOverride(string, string, CirrusSearch\HashSearchConfig, string)
#2 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Profile/SearchProfileServiceFactory.php(112): CirrusSearch\Profile\SearchProfileServiceFactory->loadRescoreProfiles(CirrusSearch\Profile\SearchProfileService, CirrusSearch\HashSearchConfig)
#3 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/SearchConfig.php(311): CirrusSearch\Profile\SearchProfileServiceFactory->loadService(CirrusSearch\HashSearchConfig)
#4 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(277): CirrusSearch\SearchConfig->getProfileService()
#5 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(339): CirrusSearch\Searcher->buildFullTextSearch(string)
#6 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(172): CirrusSearch\Searcher->searchTextInternal(string)
#7 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Fallbacks/LangDetectFallbackMethod.php(152): CirrusSearch\Searcher->search(CirrusSearch\Search\SearchQuery)
#8 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Fallbacks/FallbackRunner.php(79): CirrusSearch\Fallbacks\LangDetectFallbackMethod->rewrite(CirrusSearch\Search\ResultSet, CirrusSearch\Search\ResultSet)
#9 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Fallbacks/FallbackRunner.php(67): CirrusSearch\Fallbacks\FallbackRunner->execute(CirrusSearch\Fallbacks\LangDetectFallbackMethod, CirrusSearch\Search\ResultSet, CirrusSearch\Search\ResultSet)
#10 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(175): CirrusSearch\Fallbacks\FallbackRunner->run(CirrusSearch\Search\ResultSet)
#11 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/CirrusSearch.php(215): CirrusSearch\Searcher->search(CirrusSearch\Search\SearchQuery)
#12 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/CirrusSearch.php(188): CirrusSearch->searchTextReal(CirrusSearch\Search\SearchQuery)
#13 /srv/mediawiki/php-1.33.0-wmf.9/includes/search/SearchEngine.php(83): CirrusSearch->doSearchText(string)
#14 /srv/mediawiki/php-1.33.0-wmf.9/includes/search/SearchEngine.php(169): Closure$SearchEngine::searchText()
#15 /srv/mediawiki/php-1.33.0-wmf.9/includes/search/SearchEngine.php(84): SearchEngine->maybePaginate(Closure$SearchEngine::searchText;2942)
#16 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQuerySearch.php(107): SearchEngine->searchText(string)
#17 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQuerySearch.php(41): ApiQuerySearch->run()
#18 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQuery.php(249): ApiQuerySearch->execute()
#19 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiMain.php(1595): ApiQuery->execute()

Error (2)

  • Request ID: XBvLBApAMEMAABshWbIAAADA
  • Request URL de.wikipedia.org /w/api.php?format=json&formatversion=2&action=query& .. prop=description|pageimages&piprop=thumbnail& .. generator=prefixsearch& .. gpssearch= .. &gpslimit=20&srsearch= ..
trace
PHP Fatal Error: entire web request took longer than 60 seconds and timed out

#0 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Profile/SearchProfileService.php(345): NO_FUNCTION_GIVEN()
#1 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Profile/SearchProfileServiceFactory.php(178): CirrusSearch\Profile\SearchProfileService->registerConfigOverride(string, string, CirrusSearch\HashSearchConfig, string)
#2 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Profile/SearchProfileServiceFactory.php(112): CirrusSearch\Profile\SearchProfileServiceFactory->loadRescoreProfiles(CirrusSearch\Profile\SearchProfileService, CirrusSearch\HashSearchConfig)
#3 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/SearchConfig.php(311): CirrusSearch\Profile\SearchProfileServiceFactory->loadService(CirrusSearch\HashSearchConfig)
#4 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(277): CirrusSearch\SearchConfig->getProfileService()
#5 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(339): CirrusSearch\Searcher->buildFullTextSearch(string)
#6 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(172): CirrusSearch\Searcher->searchTextInternal(string)
#7 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Fallbacks/LangDetectFallbackMethod.php(152): CirrusSearch\Searcher->search(CirrusSearch\Search\SearchQuery)
#8 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Fallbacks/FallbackRunner.php(79): CirrusSearch\Fallbacks\LangDetectFallbackMethod->rewrite(CirrusSearch\Search\ResultSet, CirrusSearch\Search\ResultSet)
#9 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Fallbacks/FallbackRunner.php(67): CirrusSearch\Fallbacks\FallbackRunner->execute(CirrusSearch\Fallbacks\LangDetectFallbackMethod, CirrusSearch\Search\ResultSet, CirrusSearch\Search\ResultSet)
#10 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/Searcher.php(175): CirrusSearch\Fallbacks\FallbackRunner->run(CirrusSearch\Search\ResultSet)
#11 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/CirrusSearch.php(215): CirrusSearch\Searcher->search(CirrusSearch\Search\SearchQuery)
#12 /srv/mediawiki/php-1.33.0-wmf.9/extensions/CirrusSearch/includes/CirrusSearch.php(188): CirrusSearch->searchTextReal(CirrusSearch\Search\SearchQuery)
#13 /srv/mediawiki/php-1.33.0-wmf.9/includes/search/SearchEngine.php(83): CirrusSearch->doSearchText(string)
#14 /srv/mediawiki/php-1.33.0-wmf.9/includes/search/SearchEngine.php(169): Closure$SearchEngine::searchText()
#15 /srv/mediawiki/php-1.33.0-wmf.9/includes/search/SearchEngine.php(84): SearchEngine->maybePaginate(Closure$SearchEngine::searchText;2942)
#16 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQuerySearch.php(85): SearchEngine->searchText(string)
#17 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQuerySearch.php(41): ApiQuerySearch->run()
#18 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiQuery.php(249): ApiQuerySearch->execute()
#19 /srv/mediawiki/php-1.33.0-wmf.9/includes/api/ApiMain.php(1595): ApiQuery->execute()

Impact

A number of user queriees stalled for 60 seconds and then received a generic error page without any results.

Notes

Today around 17:05 UTC there was a spike of 350 fatal errors as result of the 60 seconds execution timeout being reached.

This is highly unusual for API requests that query for search suggestions which is generally very quick and doesn't involve (afaik) slow DB queries or CPU intensive operations (like wikitext parsing).

Event Timeline

Restricted Application added a subscriber: Aklapper. · View Herald Transcript

Change 481101 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/CirrusSearch@master] Language detection should not be run when rewriting is not enabled

https://gerrit.wikimedia.org/r/481101

There are actually 2 search queries in these API requests:

  • generator=prefixsearch gpslimit=20 for autocomplete
  • list=search srlimit=1 to fetch the "Did you mean suggestion"

This technique is used by the mobile apps to run a single API request to fetch both autocomplete suggestions + "Did you mean suggestion". The later sends a fulltext query to the backend. The patch (481101) fixes a regression but should not explain "alone" the 60 secs. I doubt that SearchProfileService.php is the culprit unless the server is struggling on mem. Will continue to investigate.

dcausse triaged this task as Medium priority.Dec 21 2018, 7:46 AM
dcausse moved this task from needs triage to Current work on the Discovery-Search board.

Looking at more instances of timeouts raised in the cirrus stack it appears to be rather random.
I believe that this regression exacerbated T204871 as the spike happened during the rollout of 1.33-wmf.9:

Timeouts caused by Cirrus during the rollout

PHP_TIMEOUT_CIRRUS.png (279×1 px, 29 KB)

All timeouts:
PHP_TIMEOUT_NOT_CIRRUS.png (306×1 px, 32 KB)

Cirrus seems to have been the cause of a lot of these timeouts during this deployment. The regression fixed in 481101 caused some CPU intensive code (TextCat) to be run on all autocomplete queries issued from the mobile apps. Another possible reason is that when the language detector worked we may have run a fast but blocking request to the MW API of a another wiki. I'm not sure that we fix timeouts on such internal API requests, I'll adapt this and tentatively close this ticket.

Change 481145 had a related patch set uploaded (by DCausse; owner: DCausse):
[mediawiki/extensions/CirrusSearch@master] InterwikiResolver: set MultiHttpClient timeouts

https://gerrit.wikimedia.org/r/481145

Change 481145 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] InterwikiResolver: set MultiHttpClient timeouts

https://gerrit.wikimedia.org/r/481145

Change 481101 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Language detection should not be run when rewriting is not enabled

https://gerrit.wikimedia.org/r/481101

mmodell changed the subtype of this task from "Task" to "Production Error".Aug 28 2019, 11:08 PM