
Exception of type Wikimedia\\Rdbms\\DBConnectionError after an API query
Closed, Resolved | Public | PRODUCTION ERROR


The request
causes this error:

    "error": {
        "code": "internal_api_error_DBConnectionError",
        "info": "[62a725b459d8e036f5a4e627] Caught exception of type Wikimedia\\Rdbms\\DBConnectionError"
    },
    "servedby": "labweb1002"
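
For anyone matching a report like this against the server logs, here is a minimal sketch of pulling the request ID out of such a response. It assumes only the shape shown above (an `error` object whose `code` starts with `internal_api_error_` and whose `info` starts with a bracketed ID); the helper name `parse_internal_error` is made up for illustration and is not part of MediaWiki or any client library.

```python
import re

# Matches the bracketed request ID that prefixes the "info" string.
REQUEST_ID = re.compile(r"^\[([0-9a-f]+)\]")


def parse_internal_error(response):
    """Return (code, request_id) for an internal API error, else None.

    Hypothetical helper: it relies only on the response shape quoted
    in this task, not on any documented client API.
    """
    error = response.get("error")
    if not error:
        return None
    code = error.get("code", "")
    if not code.startswith("internal_api_error_"):
        return None
    match = REQUEST_ID.match(error.get("info", ""))
    request_id = match.group(1) if match else None
    return code, request_id


# The response quoted in this task, as a Python dict.
response = {
    "error": {
        "code": "internal_api_error_DBConnectionError",
        "info": "[62a725b459d8e036f5a4e627] Caught exception of type "
                "Wikimedia\\Rdbms\\DBConnectionError",
    },
    "servedby": "labweb1002",
}

print(parse_internal_error(response))
```

The extracted ID is the same string that prefixes the log entry in the timeline below, which is what ties a user-facing error to a specific stack trace.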

Event Timeline

Krinkle added a subscriber: Krinkle.
[62a725b459d8e036f5a4e627] /w/api.php?...

Wikimedia\Rdbms\DBConnectionError: Cannot access the database: Unknown error (

#0 /srv/mediawiki/php-1.33.0-wmf.4/includes/libs/rdbms/loadbalancer/LoadBalancer.php(753): Wikimedia\Rdbms\LoadBalancer->reportConnectionError()
#1 /srv/mediawiki/php-1.33.0-wmf.4/includes/GlobalFunctions.php(2653): Wikimedia\Rdbms\LoadBalancer->getConnection(integer, array, boolean)
#2 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiBase.php(651): wfGetDB(integer, string)
#3 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiPageSet.php(1416): ApiBase->getDB()
#4 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiPageSet.php(805): ApiPageSet->getDB()
#5 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiPageSet.php(229): ApiPageSet->initFromTitles(array)
#6 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiPageSet.php(140): ApiPageSet->executeInternal(boolean)
#7 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiQuery.php(234): ApiPageSet->execute()
#8 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiMain.php(1570): ApiQuery->execute()
#9 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiMain.php(531): ApiMain->executeAction()
#10 /srv/mediawiki/php-1.33.0-wmf.4/includes/api/ApiMain.php(502): ApiMain->executeActionWithErrorHandling()
#11 /srv/mediawiki/php-1.33.0-wmf.4/api.php(87): ApiMain->execute()
#12 /srv/mediawiki/w/api.php(3): include(string)
#13 {main}
ERROR from Wikimedia\Rdbms\DatabaseMysqlBase::open:
  Error connecting to
    Too many connections
Marostegui edited projects, added cloud-services-team (Kanban); removed DBA.
Marostegui added a subscriber: Marostegui.

This is probably another spike similar to T188589 or T209480 hitting db1073 (m5 master)

Which makes it much weirder to me?

Unless this happened only before we ended up way below the max connections.

It could have been a spike fast enough not to be captured by the graphs; it would not be the first time we have seen that happen.
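
To illustrate how a short spike can fall between graph samples, here is a small sketch. All numbers in it are hypothetical (the baseline, the burst height, the burst length and the 60-second sampling interval are assumptions for illustration, not values taken from our monitoring).

```python
def sampled_peak(series, interval):
    """Peak seen when reading only every `interval`-th second."""
    return max(series[::interval])


# Hypothetical per-second connection counts: a flat baseline with a
# 20-second burst starting at second 65. None of these values are measured.
baseline, burst = 370, 600
series = [baseline] * 300
for second in range(65, 85):
    series[second] = burst

true_peak = max(series)
graph_peak = sampled_peak(series, 60)  # samples at 0, 60, 120, 180, 240

print(true_peak)   # 600
print(graph_peak)  # 370
```

With these assumed numbers the burst sits entirely between two samples, so the graph would show a flat line even though the server briefly saw far more connections.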

Current status for the record

root@MISC m5[information_schema]> select user, count(*) as count FROM information_schema.processlist GROUP BY user ORDER BY count DESC;
+-----------------+-------+
| user            | count |
+-----------------+-------+
| nova            |   174 |
| keystone        |    77 |
| neutron         |    48 |
| glance          |    35 |
| designate       |    18 |
| watchdog        |     7 |
| testreduce      |     2 |
| repl            |     2 |
| root            |     2 |
| wikiuser        |     1 |
| event_scheduler |     1 |
| wikiadmin       |     1 |
+-----------------+-------+
12 rows in set (0.00 sec)
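
As a quick sanity check on the output above, the per-user counts can be summed to get the total number of connections at that moment. This is only a sketch: the counts are copied from the table, while `headroom` is a hypothetical helper and the server's `max_connections` value is not stated in this task, so it is left as a parameter (it can be read with `SHOW VARIABLES LIKE 'max_connections'`).

```python
# Per-user connection counts copied from the processlist output above.
counts = {
    "nova": 174,
    "keystone": 77,
    "neutron": 48,
    "glance": 35,
    "designate": 18,
    "watchdog": 7,
    "testreduce": 2,
    "repl": 2,
    "root": 2,
    "wikiuser": 1,
    "event_scheduler": 1,
    "wikiadmin": 1,
}


def total_connections(counts):
    """Total open connections across all users."""
    return sum(counts.values())


def headroom(counts, max_connections):
    """Connections left before the server refuses new ones.

    `max_connections` must come from the server itself; it is not
    given anywhere in this task.
    """
    return max_connections - total_connections(counts)


print(total_connections(counts))  # 368
```

Almost all of those connections belong to the OpenStack service users (nova, keystone, neutron, glance, designate); wikiuser and wikiadmin hold one each.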

This was opened on Saturday, when the issue paged and we fixed it, though. Has it happened since then?

I suppose I'm asking @Krinkle if that's from logs or this just happened today as well.

Ah, I didn't realise it was from Saturday - I got confused with the update at T210332#4786844.

The update from @Krinkle has the same hash as the reporter's: 62a725b459d8e036f5a4e627, so I guess he was expanding on the info.
I think this is fine to close, as it is resolved.

Spiking by over 100 connections would take a serious hit, one that none of our tools is currently able to produce.

I think this is resolved - please reopen if you think otherwise!
Thanks for reporting it!

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:08 PM