SqlBlobStore no longer caching blobs (DBConnectionError Too many connections)
Closed, Resolved · Public · PRODUCTION ERROR

Description

Error
  • mwversion: 1.37.0-wmf.1
  • reqId: 93b8ccbd-f566-4569-94f4-7aa3266003cb
normalized_message
[{reqId}] {exception_url}   Wikimedia\Rdbms\DBConnectionError: Cannot access the database: Too many connections (10.64.32.102) (10.64.32.102)
exception.trace
from /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1510)
#0 /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/rdbms/loadbalancer/LoadBalancer.php(997): Wikimedia\Rdbms\LoadBalancer->reportConnectionError()
#1 /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/rdbms/loadbalancer/LoadBalancer.php(962): Wikimedia\Rdbms\LoadBalancer->getServerConnection(integer, string, integer)
#2 /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/rdbms/loadbalancer/LoadBalancer.php(1101): Wikimedia\Rdbms\LoadBalancer->getConnection(integer, array, string, integer)
#3 /srv/mediawiki/php-1.37.0-wmf.1/includes/externalstore/ExternalStoreDB.php(168): Wikimedia\Rdbms\LoadBalancer->getConnectionRef(integer, array, string, integer)
#4 /srv/mediawiki/php-1.37.0-wmf.1/includes/externalstore/ExternalStoreDB.php(312): ExternalStoreDB->getReplica(string)
#5 /srv/mediawiki/php-1.37.0-wmf.1/includes/externalstore/ExternalStoreDB.php(66): ExternalStoreDB->fetchBlob(string, string, boolean)
#6 /srv/mediawiki/php-1.37.0-wmf.1/includes/externalstore/ExternalStoreAccess.php(52): ExternalStoreDB->fetchFromURL(string)
#7 /srv/mediawiki/php-1.37.0-wmf.1/includes/Storage/SqlBlobStore.php(509): ExternalStoreAccess->fetchFromURL(string, array)
#8 /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/objectcache/wancache/WANObjectCache.php(1714): MediaWiki\Storage\SqlBlobStore->MediaWiki\Storage\{closure}(boolean, integer, array, NULL, array)
#9 /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/objectcache/wancache/WANObjectCache.php(1542): WANObjectCache->fetchOrRegenerate(string, integer, Closure, array, array)
#10 /srv/mediawiki/php-1.37.0-wmf.1/includes/Storage/SqlBlobStore.php(513): WANObjectCache->getWithSetCallback(string, integer, Closure, array)
#11 /srv/mediawiki/php-1.37.0-wmf.1/includes/Storage/SqlBlobStore.php(430): MediaWiki\Storage\SqlBlobStore->expandBlob(string, array, string)
#12 /srv/mediawiki/php-1.37.0-wmf.1/includes/Storage/SqlBlobStore.php(286): MediaWiki\Storage\SqlBlobStore->fetchBlobs(array, integer)
#13 /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/objectcache/wancache/WANObjectCache.php(1714): MediaWiki\Storage\SqlBlobStore->MediaWiki\Storage\{closure}(boolean, integer, array, NULL, array)
#14 /srv/mediawiki/php-1.37.0-wmf.1/includes/libs/objectcache/wancache/WANObjectCache.php(1542): WANObjectCache->fetchOrRegenerate(string, integer, Closure, array, array)
#15 /srv/mediawiki/php-1.37.0-wmf.1/includes/Storage/SqlBlobStore.php(291): WANObjectCache->getWithSetCallback(string, integer, Closure, array)
#16 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RevisionStore.php(1191): MediaWiki\Storage\SqlBlobStore->getBlob(string, integer)
#17 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RevisionStore.php(1463): MediaWiki\Revision\RevisionStore->loadSlotContent(MediaWiki\Revision\SlotRecord, NULL, NULL, NULL, integer)
#18 [internal function]: MediaWiki\Revision\RevisionStore->MediaWiki\Revision\{closure}(MediaWiki\Revision\SlotRecord)
#19 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/SlotRecord.php(324): call_user_func(Closure, MediaWiki\Revision\SlotRecord)
#20 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RevisionRecord.php(164): MediaWiki\Revision\SlotRecord->getContent()
#21 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/Parser.php(3697): MediaWiki\Revision\RevisionRecord->getContent(string)
#22 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/Parser.php(3547): Parser->statelessFetchTemplate(Title, Parser)
#23 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/Parser.php(3415): Parser->fetchTemplateAndTitle(Title)
#24 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/Parser.php(3157): Parser->getTemplateDom(Title)
#25 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/PPFrame_Hash.php(263): Parser->braceSubstitution(array, PPFrame_Hash)
#26 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/Parser.php(2879): PPFrame_Hash->expand(PPNode_Hash_Tree, integer)
#27 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/Parser.php(1549): Parser->replaceVariables(string)
#28 /srv/mediawiki/php-1.37.0-wmf.1/includes/parser/Parser.php(639): Parser->internalParse(string)
#29 /srv/mediawiki/php-1.37.0-wmf.1/includes/content/WikitextContent.php(375): Parser->parse(string, Title, ParserOptions, boolean, boolean, integer)
#30 /srv/mediawiki/php-1.37.0-wmf.1/includes/content/AbstractContent.php(591): WikitextContent->fillParserOutput(Title, integer, ParserOptions, boolean, ParserOutput)
#31 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RenderedRevision.php(266): AbstractContent->getParserOutput(Title, integer, ParserOptions, boolean)
#32 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RenderedRevision.php(235): MediaWiki\Revision\RenderedRevision->getSlotParserOutputUncached(WikitextContent, boolean)
#33 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RevisionRenderer.php(217): MediaWiki\Revision\RenderedRevision->getSlotParserOutput(string, array)
#34 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RevisionRenderer.php(154): MediaWiki\Revision\RevisionRenderer->combineSlotOutput(MediaWiki\Revision\RenderedRevision, array)
#35 [internal function]: MediaWiki\Revision\RevisionRenderer->MediaWiki\Revision\{closure}(MediaWiki\Revision\RenderedRevision, array)
#36 /srv/mediawiki/php-1.37.0-wmf.1/includes/Revision/RenderedRevision.php(197): call_user_func(Closure, MediaWiki\Revision\RenderedRevision, array)
#37 /srv/mediawiki/php-1.37.0-wmf.1/includes/poolcounter/PoolWorkArticleView.php(137): MediaWiki\Revision\RenderedRevision->getRevisionParserOutput()
#38 /srv/mediawiki/php-1.37.0-wmf.1/includes/poolcounter/PoolCounterWork.php(162): PoolWorkArticleView->doWork()
#39 /srv/mediawiki/php-1.37.0-wmf.1/includes/page/ParserOutputAccess.php(281): PoolCounterWork->execute()
#40 /srv/mediawiki/php-1.37.0-wmf.1/includes/page/Article.php(749): MediaWiki\Page\ParserOutputAccess->getParserOutput(WikiPage, ParserOptions, MediaWiki\Revision\RevisionStoreCacheRecord, integer)
#41 /srv/mediawiki/php-1.37.0-wmf.1/includes/page/Article.php(561): Article->generateContentOutput(User, ParserOptions, integer, OutputPage, array)
#42 /srv/mediawiki/php-1.37.0-wmf.1/includes/actions/ViewAction.php(74): Article->view()
#43 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(535): ViewAction->show()
#44 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(319): MediaWiki->performAction(Article, Title)
#45 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(916): MediaWiki->performRequest()
#46 /srv/mediawiki/php-1.37.0-wmf.1/includes/MediaWiki.php(550): MediaWiki->main()
#47 /srv/mediawiki/php-1.37.0-wmf.1/index.php(53): MediaWiki->run()
#48 /srv/mediawiki/php-1.37.0-wmf.1/index.php(46): wfIndexMain()
#49 /srv/mediawiki/w/index.php(3): require(string)
#50 {main}
Impact

Low. Not a train blocker at this level, but it looks quite worrying. Occurs on a variety of wikis, including Wikidata.

Notes

Not sure which part of MW is responsible, so my tags may be a little random. Please fix if so. Sorry.

Event Timeline

Restricted Application added a subscriber: Aklapper.
Urbanecm added a subscriber: Urbanecm.

Tagging DBA, as they might be able to offer some guidance on finding the issue here.

jcrespo triaged this task as Unbreak Now! priority. Thu, Apr 29, 3:45 PM
jcrespo added a subscriber: jcrespo.

This should be a blocker: es traffic has grown almost 100x since 14 April, which correlates strongly with the 19h deploy.

ACK, I'll make it a train blocker.

Given we only make requests to external storage when parsercache has a miss, it seemed sensible to look for corresponding patterns in parsercache.

I see we introduced a new category of misses, "miss_absent_metadata", on the same date, which seems related. See https://grafana-rw.wikimedia.org/d/000000106/parser-cache?viewPanel=7&orgId=1&from=now-30d&to=now

That's probably a red herring: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/677346 is a change Daniel Kinzler made, and it looks like a bugfix for the counter.

A better candidate for changing something is probably https://gerrit.wikimedia.org/r/c/mediawiki/core/+/677299. @Pchelolo is looking into it.

This one is a partial revert of a previously added optimization that was not needed, and it fixes PoolCounter: before it, PoolCounter couldn't fetch the parsed content after waiting for a lock. But we can try reverting it if there are no other guesses.

There is definitely something going very wrong with memcached:

https://grafana.wikimedia.org/d/000000316/memcache?viewPanel=60&orgId=1&from=now-30d&to=now

shows misses increasing across the board
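
For illustration only, here is a toy sketch of why cache misses across the board would show up as database load. It mimics the read-through pattern visible in the trace (`WANObjectCache->getWithSetCallback` wrapping the `ExternalStoreDB` fetch), but it is not MediaWiki code: `ReadThroughCache`, `BlobBackend`, `writes_enabled`, and the key `tt:1` are all hypothetical names, and the assumption that cache writes are silently dropped is just one possible failure mode, not a diagnosis.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Toy read-through cache. `writes_enabled` is a hypothetical switch
    that simulates cache writes being silently dropped."""

    def __init__(self, writes_enabled: bool = True) -> None:
        self._store: Dict[str, str] = {}
        self.writes_enabled = writes_enabled
        self.misses = 0

    def get_with_set_callback(self, key: str, regenerate: Callable[[], str]) -> str:
        # Cache hit: the backend is never touched.
        if key in self._store:
            return self._store[key]
        # Cache miss: fall through to the backend via the callback.
        self.misses += 1
        value = regenerate()
        # If the write is dropped, the next read misses again.
        if self.writes_enabled:
            self._store[key] = value
        return value


class BlobBackend:
    """Stand-in for the external storage database; counts fetches."""

    def __init__(self) -> None:
        self.fetches = 0

    def fetch(self, address: str) -> str:
        self.fetches += 1
        return f"blob-for-{address}"


def simulate(writes_enabled: bool, reads: int = 100) -> int:
    """Read the same blob `reads` times and return the backend fetch count."""
    cache = ReadThroughCache(writes_enabled)
    backend = BlobBackend()
    for _ in range(reads):
        cache.get_with_set_callback("tt:1", lambda: backend.fetch("tt:1"))
    return backend.fetches
```

In this sketch, `simulate(True)` results in a single backend fetch for 100 reads, while `simulate(False)` results in 100 fetches, one per read. That is the general shape of the problem: once values stop being stored, every request that would have been a hit opens a backend connection instead.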

Krinkle renamed this task from Cannot access the database: Too many connections to SqlBlobStore no longer caching blobs (DBConnectionError Too many connections). Thu, Apr 29, 5:38 PM
Krinkle edited projects, added MediaWiki-Cache; removed Wikidata, wdwb-tech.

Change 683692 had a related patch set uploaded (by Krinkle; author: Aaron Schulz):

[mediawiki/core@master] objectcache: set ATTR_DURABILITY in MemcachedBagOStuff

https://gerrit.wikimedia.org/r/683692

Change 683629 had a related patch set uploaded (by Krinkle; author: Aaron Schulz):

[mediawiki/core@wmf/1.37.0-wmf.3] objectcache: set ATTR_DURABILITY in MemcachedBagOStuff

https://gerrit.wikimedia.org/r/683629

Change 683630 had a related patch set uploaded (by Krinkle; author: Aaron Schulz):

[mediawiki/core@wmf/1.37.0-wmf.1] objectcache: set ATTR_DURABILITY in MemcachedBagOStuff

https://gerrit.wikimedia.org/r/683630

Change 683629 merged by jenkins-bot:

[mediawiki/core@wmf/1.37.0-wmf.3] objectcache: set ATTR_DURABILITY in MemcachedBagOStuff

https://gerrit.wikimedia.org/r/683629

Change 683692 merged by jenkins-bot:

[mediawiki/core@master] objectcache: set ATTR_DURABILITY in MemcachedBagOStuff

https://gerrit.wikimedia.org/r/683692

Addshore added a project: wdwb-tech.
Addshore removed a project: wdwb-tech.

Mentioned in SAL (#wikimedia-operations) [2021-04-29T18:10:57Z] <krinkle@deploy1002> Synchronized php-1.37.0-wmf.3/includes/libs/objectcache/MemcachedBagOStuff.php: I926797a9d494a31, T281480 (duration: 01m 09s)

Change 683630 merged by jenkins-bot:

[mediawiki/core@wmf/1.37.0-wmf.1] objectcache: set ATTR_DURABILITY in MemcachedBagOStuff

https://gerrit.wikimedia.org/r/683630

Mentioned in SAL (#wikimedia-operations) [2021-04-29T18:38:23Z] <krinkle@deploy1002> Synchronized php-1.37.0-wmf.1/includes/libs/objectcache/MemcachedBagOStuff.php: I926797a9d494a31, T281480 (duration: 01m 08s)

Krinkle assigned this task to aaron.
Krinkle moved this task from Untriaged to libs/objectcache on the MediaWiki-Cache board.
Krinkle edited projects, added Performance-Team; removed Platform Engineering.