Terbium cronjobs attempting to connect to labstestweb2001
Closed, Resolved | Public | PRODUCTION ERROR

Description

The following cron errors have been happening for months from terbium:

/usr/local/bin/foreachwiki maintenance/cleanupUploadStash.php
Wikimedia\Rdbms\DBConnectionError from line 796 of /srv/mediawiki/php-1.30.0-wmf.5/includes/libs/rdbms/database/Database.php: Cannot access the database: Access denied for user 'wikiadmin'@'10.64.32.13' (using password: YES) (208.80.153.14)
Backtrace:
#0 /srv/mediawiki/php-1.30.0-wmf.5/includes/libs/rdbms/loadbalancer/LoadBalancer.php(995): Wikimedia\Rdbms\Database->reportConnectionError(string)
#1 /srv/mediawiki/php-1.30.0-wmf.5/includes/libs/rdbms/loadbalancer/LoadBalancer.php(666): Wikimedia\Rdbms\LoadBalancer->reportConnectionError()
#2 /srv/mediawiki/php-1.30.0-wmf.5/includes/GlobalFunctions.php(3054): Wikimedia\Rdbms\LoadBalancer->getConnection(integer, array, boolean)
#3 /srv/mediawiki/php-1.30.0-wmf.5/includes/filerepo/LocalRepo.php(460): wfGetDB(integer)
#4 /srv/mediawiki/php-1.30.0-wmf.5/maintenance/cleanupUploadStash.php(50): LocalRepo->getReplicaDB()
#5 /srv/mediawiki/php-1.30.0-wmf.5/maintenance/doMaintenance.php(111): UploadStashCleanup->execute()
#6 /srv/mediawiki/php-1.30.0-wmf.5/maintenance/cleanupUploadStash.php(156): require_once(string)
#7 /srv/mediawiki/multiversion/MWScript.php(99): require_once(string)
#8 {main}

And this one as well:

/usr/local/bin/foreachwikiindblist /srv/mediawiki/dblists/echo.dblist extensions/Echo/maintenance/processEchoEmailBatch.php
Wikimedia\Rdbms\DBConnectionError from line 796 of /srv/mediawiki/php-1.30.0-wmf.5/includes/libs/rdbms/database/Database.php: Cannot access the database: Access denied for user 'wikiadmin'@'10.64.32.13' (using password: YES) (208.80.153.14)
Backtrace:
#0 /srv/mediawiki/php-1.30.0-wmf.5/includes/libs/rdbms/loadbalancer/LoadBalancer.php(995): Wikimedia\Rdbms\Database->reportConnectionError(string)
#1 /srv/mediawiki/php-1.30.0-wmf.5/includes/libs/rdbms/loadbalancer/LoadBalancer.php(666): Wikimedia\Rdbms\LoadBalancer->reportConnectionError()
#2 /srv/mediawiki/php-1.30.0-wmf.5/extensions/Echo/includes/EchoDbFactory.php(123): Wikimedia\Rdbms\LoadBalancer->getConnection(integer, array, boolean)
#3 /srv/mediawiki/php-1.30.0-wmf.5/extensions/Echo/includes/EmailBatch.php(340): MWEchoDbFactory::getDB(integer)
#4 /srv/mediawiki/php-1.30.0-wmf.5/extensions/Echo/maintenance/processEchoEmailBatch.php(49): MWEchoEmailBatch::getUsersToNotify(integer, integer)
#5 /srv/mediawiki/php-1.30.0-wmf.5/maintenance/doMaintenance.php(111): ProcessEchoEmailBatch->execute()
#6 /srv/mediawiki/php-1.30.0-wmf.5/extensions/Echo/maintenance/processEchoEmailBatch.php(80): require_once(string)
#7 /srv/mediawiki/multiversion/MWScript.php(99): require_once(string)
#8 {main}

There is no such user on labstestweb2001 - why is terbium trying to connect to labstestweb2001? Is that expected?
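For triaging cron mail like the two errors above, the key fields can be pulled out of the message mechanically. The following is a minimal sketch, not part of the original report: the regular expression and the helper name `parse_access_denied` are hypothetical, and reading the trailing parenthesised address as the database server that was contacted is an assumption based on how the message appears here.

```python
import re

# Hypothetical pattern for the "Access denied" text as it appears in the log above.
# The final parenthesised group is assumed to be the server that was contacted.
ACCESS_DENIED = re.compile(
    r"Access denied for user '(?P<user>[^']+)'@'(?P<client>[^']+)'"
    r" \(using password: (?P<password>YES|NO)\)"
    r"(?: \((?P<server>[^)]+)\))?"
)


def parse_access_denied(line):
    """Return a dict with user, client, password and server, or None if no match."""
    match = ACCESS_DENIED.search(line)
    return match.groupdict() if match else None


# Message text copied from the cron error in this task.
sample = (
    "Cannot access the database: Access denied for user "
    "'wikiadmin'@'10.64.32.13' (using password: YES) (208.80.153.14)"
)
info = parse_access_denied(sample)
print(info)
```

Grouping cron errors by the `server` field would make it easier to spot when a job is reaching a host that has no matching account.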

Thanks!

PS: Feel free to add/remove tags if appropriate

Event Timeline

In general we're trying to make wikitech (and labtestwikitech) more like normal wikis... they're currently updated by the standard deployment train, and running normal up-to-date mediawiki releases.

I can't think of any reason why they wouldn't also be subject to standard purge and cleanup jobs, so probably we can go ahead and add whatever permissions and accounts are needed to make that work. That seems simpler than trying to maintain them in a separate group in mediawiki-config.

<jynus> does labstestweb2001 have echo enabled?
<andrewbogott> Looks like yes, or at least it's installed there.
<jynus> ok, so the echo.dblist is ok
<andrewbogott> yeah, and I see active notifications on labtestwikitech

This is a purely operational error, not a config error.

Should the missing user be created then?

Is there a file where grants for this are tracked? Should it be shared between labswiki and labstestwiki?

The only file where I could see it was wikitech.sql.erb, and if we add it to labstest I would suggest we add it there too, for future migrations...

Change 359149 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] wikitech.sql.erb: Add labtestwiki database

https://gerrit.wikimedia.org/r/359149

Maybe we should use the core grants file? Or does wikitech have grants that differ from core? If so, how should we combine them, given that it is supposedly no different from other core mediawiki hosts?

hehe I sent the patch before reading this.
But I would combine both as part of: T167973 and I would audit the grants first to avoid merging unused stuff.
So for the scope of this and until T167973 is resolved or agreed upon, I would just add it to wikitech.sql.erb for now.

We do not have to wait. Is it easy to see whether everything on core covers wikitech? What about the rest? I saw some connections from horizon; how can those be managed? (That is a question mainly for cloud.)

Apart from oathreader user coming from horizon, the rest should be covered by the %wik% that we have in core.
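To illustrate why a `%wik%` database pattern would cover the wikitech databases, here is a rough sketch of how `%` and `_` wildcards in a grant's database name match. It is an approximation under stated assumptions: the helper `grant_pattern_matches` is hypothetical, it ignores backslash-escaped wildcards and case-sensitivity settings, and the database names are the ones mentioned in this task (`labswiki`, `labtestwiki`).

```python
import re


def grant_pattern_matches(pattern, db_name):
    """Approximate wildcard matching for a grant's database name.

    '%' matches any run of characters and '_' matches exactly one character.
    Escaped wildcards (e.g. '\\_') are not handled in this sketch.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), db_name) is not None


# Database names taken from this task, plus one unrelated name for contrast.
for name in ("labswiki", "labtestwiki", "mysql"):
    print(name, grant_pattern_matches("%wik%", name))
```

Under these assumptions both wikitech databases fall under the `%wik%` pattern, while a user such as oathreader still needs its own grant because it is a different account, not a different database.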

Change 359149 abandoned by Marostegui:
wikitech.sql.erb: Add labtestwiki database

Reason:
Not needed: https://phabricator.wikimedia.org/T167961#3351662

https://gerrit.wikimedia.org/r/359149

Change 359149 restored by Marostegui:
wikitech.sql.erb: Add labtestwiki database

https://gerrit.wikimedia.org/r/359149

Change 359149 abandoned by Marostegui:
wikitech.sql.erb: Add labtestwiki database

Reason:
:-)

https://gerrit.wikimedia.org/r/359149

Change 359152 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] mariadb: wikitech servers to use core grants

https://gerrit.wikimedia.org/r/359152

jcrespo renamed this task from Cronjobs attempting to connect to labstestweb2001 to Terbium cronjobs attempting to connect to labstestweb2001. Jun 19 2017, 8:45 AM

I have added 2 temporary accounts to connect from terbium and wasat as the admin users.

Change 359152 merged by Marostegui:
[operations/puppet@production] mariadb: wikitech servers to use core grants

https://gerrit.wikimedia.org/r/359152

Marostegui assigned this task to jcrespo.

After Jaime added the grants manually, I talked to Andrew and merged the change.
Closing this ticket now, as the grants for terbium to connect to labstestweb2001 have been added.

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:10 PM