Tool weapon-of-mass-description cannot connect to tools.db.svc.eqiad.wmflabs (access denied)
Closed, ResolvedPublic

Description

Hello,

my tool weapon-of-mass-description cannot connect to tools.db.svc.eqiad.wmflabs, see following output. Connecting to database replicas works as it should. For other tools operated by myself this works correctly and as it should.

Thanks in advance for fixing,
Martin Urbanec
(Urbanecm)

tools.weapon-of-mass-description@tools-bastion-03 ~
$ ls -la .my.cnf replica.my.cnf
lrwxrwxrwx 1 tools.weapon-of-mass-description tools.weapon-of-mass-description 14 Dec 23 22:04 .my.cnf -> replica.my.cnf
-r-------- 1 tools.weapon-of-mass-description tools.weapon-of-mass-description 52 Dec 22 20:06 replica.my.cnf
tools.weapon-of-mass-description@tools-bastion-03 ~
$ sql -v local
Connecting to tools.db.svc.eqiad.wmflabs
ERROR 1045 (28000): Access denied for user 's53612'@'10.68.23.58' (using password: YES)
tools.weapon-of-mass-description@tools-bastion-03 ~
$ mysql -h tools-db
ERROR 1045 (28000): Access denied for user 's53612'@'10.68.23.58' (using password: YES)
tools.weapon-of-mass-description@tools-bastion-03 ~
$ sql enwiki
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

seleWelcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 24713711
Server version: 10.1.29-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [enwiki_p]> select page_id, page_title from page limit 10;
+----------+------------------------------+
| page_id  | page_title                   |
+----------+------------------------------+
|  5878274 | !                            |
|  3632887 | !!                           |
|   600744 | !!!                          |
| 34443176 | !!!Fuck_You!!!               |
| 11011780 | !!!Fuck_You!!!_And_Then_Some |
| 34443184 | !!!Fuck_You!!!_and_Then_Some |
| 39401265 | !!!_(Chk_Chk_Chk)            |
|  2556962 | !!!_(album)                  |
| 10065458 | !!!_(band)                   |
| 55029148 | !!!_(disambiguation)         |
+----------+------------------------------+
10 rows in set (0.00 sec)

MariaDB [enwiki_p]> Bye
tools.weapon-of-mass-description@tools-bastion-03 ~
$
Urbanecm created this task.Dec 23 2017, 9:08 PM
Restricted Application added a project: User-Urbanecm. · View Herald TranscriptDec 23 2017, 9:08 PM
Restricted Application added a subscriber: Aklapper. · View Herald Transcript
Urbanecm triaged this task as High priority.Dec 23 2017, 9:08 PM
Urbanecm updated the task description. (Show Details)
Urbanecm updated the task description. (Show Details)Dec 23 2017, 9:13 PM
Urbanecm edited projects, added Data-Services; removed Cloud-Services.
Urbanecm added a project: Toolforge.
bd808 closed this task as Resolved.Dec 24 2017, 12:42 AM
bd808 claimed this task.
bd808 added a subscriber: bd808.

I used the instructions at https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin#Debugging_bad_mysql_credentials to lookup the status for this account. The tracking labsdbaccounts database confirmed that the user was missing credentials on labsdb1005.eqiad.wmnet. I then used the labsdbadmin credentials to connect to labsdb1005.eqiad.wmnet and manually created the missing user and grants:

CREATE USER 's53612'@'%' IDENTIFIED BY PASSWORD 'HASH_FROM_labsdbaccounts.account';
GRANT SELECT, SHOW VIEW ON `%\_p`.* TO 's53612'@'%';
GRANT ALL PRIVILEGES ON `s53612\_\_%`.* TO 's53612'@'%';

The credentials now seem to work:

$ sudo become weapon-of-mass-description
$ sql local
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 31209919
Server version: 10.0.33-MariaDB MariaDB Server

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> show databases;
+---------------------------------------+
| Database                              |
+---------------------------------------+
| information_schema                    |
...
| u2815__p                              |
+---------------------------------------+
72 rows in set (0.06 sec)

MariaDB [(none)]>

Dne ne 24. pro 2017 1:43 uživatel bd808 <no-reply@phabricator.wikimedia.org>
napsal:

bd808 closed this task as "Resolved".
bd808 claimed this task.
bd808 added a comment.

I used the instructions at
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin#Debugging_bad_mysql_credentials
to lookup the status for this account. The tracking labsdbaccounts
database confirmed that the user was missing credentials on
labsdb1005.eqiad.wmnet. I then used the labsdbadmin credentials to
connect to labsdb1005.eqiad.wmnet and manually created the missing user and
grants:

CREATE USER 's53612'@'%' IDENTIFIED BY PASSWORD 'HASH_FROM_labsdbaccounts.account';
GRANT SELECT, SHOW VIEW ON %\_p.* TO 's53612'@'%';
GRANT ALL PRIVILEGES ON s53612\_\_%.* TO 's53612'@'%';

The credentials now seem to work:

$ sudo become weapon-of-mass-description$ sql localWelcome to the MariaDB monitor. Commands end with ; or \g.Your MariaDB connection id is 31209919Server version: 10.0.33-MariaDB MariaDB ServerCopyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.MariaDB [(none)]> show databases;+---------------------------------------+| Database |+---------------------------------------+| information_schema |...| u2815__p |+---------------------------------------+72 rows in set (0.06 sec)MariaDB [(none)]>

*TASK DETAIL*
https://phabricator.wikimedia.org/T183644

*EMAIL PREFERENCES*
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

*To: *bd808
*Cc: *bd808, Urbanecm, Aklapper, 1978Gage2001, aborrero, Chicocvenancio,
Tbscho, Freddy2001, JJMC89, TerraCodes, Luke081515, Gryllida, jayvdb, scfc,
coren, Krenair, chasemp

Thanks a lot and sorry for my previous comment.

Dne ne 24. pro 2017 1:43 uživatel bd808 <no-reply@phabricator.wikimedia.org>
napsal:

bd808 closed this task as "Resolved".
bd808 claimed this task.
bd808 added a comment.

I used the instructions at
https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin#Debugging_bad_mysql_credentials
to lookup the status for this account. The tracking labsdbaccounts
database confirmed that the user was missing credentials on
labsdb1005.eqiad.wmnet. I then used the labsdbadmin credentials to
connect to labsdb1005.eqiad.wmnet and manually created the missing user and
grants:

CREATE USER 's53612'@'%' IDENTIFIED BY PASSWORD 'HASH_FROM_labsdbaccounts.account';
GRANT SELECT, SHOW VIEW ON %\_p.* TO 's53612'@'%';
GRANT ALL PRIVILEGES ON s53612\_\_%.* TO 's53612'@'%';

The credentials now seem to work:

$ sudo become weapon-of-mass-description$ sql localWelcome to the MariaDB monitor. Commands end with ; or \g.Your MariaDB connection id is 31209919Server version: 10.0.33-MariaDB MariaDB ServerCopyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.MariaDB [(none)]> show databases;+---------------------------------------+| Database |+---------------------------------------+| information_schema |...| u2815__p |+---------------------------------------+72 rows in set (0.06 sec)MariaDB [(none)]>

*TASK DETAIL*
https://phabricator.wikimedia.org/T183644

*EMAIL PREFERENCES*
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

*To: *bd808
*Cc: *bd808, Urbanecm, Aklapper, 1978Gage2001, aborrero, Chicocvenancio,
Tbscho, Freddy2001, JJMC89, TerraCodes, Luke081515, Gryllida, jayvdb, scfc,
coren, Krenair, chasemp

bd808 added a comment.Dec 24 2017, 1:03 AM

There are a number of additional accounts which are missing from labsdb1005:

(labsdbaccounts@m5-master.eqiad.wmnet) [labsdbaccounts]> select account_id, hostname from account_host where status = 'absent';
+------------+------------------------+
| account_id | hostname               |
+------------+------------------------+
|      18337 | labsdb1005.eqiad.wmnet |
|      18639 | labsdb1005.eqiad.wmnet |
|      18640 | labsdb1005.eqiad.wmnet |
|      18641 | labsdb1005.eqiad.wmnet |
|      18642 | labsdb1005.eqiad.wmnet |
|      18643 | labsdb1005.eqiad.wmnet |
|      18644 | labsdb1005.eqiad.wmnet |
|      18645 | labsdb1005.eqiad.wmnet |
|      18646 | labsdb1005.eqiad.wmnet |
+------------+------------------------+
9 rows in set (0.02 sec)
$ sudo journalctl -u maintain-dbusers.service -n 100 -a --no-pager|grep -v Skipping
-- Logs begin at Fri 2017-12-15 16:22:09 UTC, end at Sun 2017-12-24 00:54:41 UTC. --
Dec 24 00:54:35 labstore1004 systemd[1]: maintain-dbusers.service: main process exited, code=exited, status=1/FAILURE
Dec 24 00:54:35 labstore1004 systemd[1]: Unit maintain-dbusers.service entered failed state.
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: Traceback (most recent call last):
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/local/sbin/maintain-dbusers", line 565, in <module>
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: create_accounts(config)
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/local/sbin/maintain-dbusers", line 403, in create_accounts
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: max_connections=10,
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 167, in execute
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: result = self._query(query)
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/cursors.py", line 323, in _query
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: conn.query(q)
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 836, in query
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: self._affected_rows = self._read_query_result(unbuffered=unbuffered)
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1020, in _read_query_result
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: result.read()
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 1303, in read
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: first_packet = self.connection._read_packet()
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 982, in _read_packet
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: packet.check_error()
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/connections.py", line 394, in check_error
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: err.raise_mysql_exception(self._data)
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/err.py", line 120, in raise_mysql_exception
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: _check_mysql_exception(errinfo)
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: File "/usr/lib/python3/dist-packages/pymysql/err.py", line 115, in _check_mysql_exception
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: raise InternalError(errno, errorvalue)
Dec 24 00:54:35 labstore1004 maintain-dbusers[39529]: pymysql.err.InternalError: (1396, "Operation CREATE USER failed for 'u2580'@'%'")
Dec 24 00:54:35 labstore1004 systemd[1]: maintain-dbusers.service holdoff time over, scheduling restart.
Dec 24 00:54:35 labstore1004 systemd[1]: Stopping Maintain labsdb accounts...
Dec 24 00:54:35 labstore1004 systemd[1]: Starting Maintain labsdb accounts...
Dec 24 00:54:35 labstore1004 systemd[1]: Started Maintain labsdb accounts.
Dec 24 00:54:37 labstore1004 maintain-dbusers[39536]: Found 0 new tool accounts:

The u2580 record is causing problems. This is a pretty low user id, so this it likely the problem that we have seen before where there is an existing user record for this usr that is not tracked in our tracking database. This can happen when a long standing LDAP account is made a Toolforge member (and needs to be fixed in our credentials creation script).

bd808 added a comment.Dec 24 2017, 1:26 AM

The manual account creation I did wasn't really the right way to fix this. The right way is to delete the untracked credentials that are blocking things from getting created properly and then let the maintain-dbusers script recreate them properly.

This requires doing something like DROP USER 'bad_user'@'%'; on labsdb1005.eqiad.wmnet. I did that for the u2580 account and then again for the s53612 account which I had manually created.

The state of absent accounts according to the tracking db after that was:

(labsdbaccounts@m5-master.eqiad.wmnet) [labsdbaccounts]> SELECT a.mysql_username, h.hostname from account a JOIN account_host h ON a.id=h.account_id where h.status = 'absent';
+----------------+------------------------+
| mysql_username | hostname               |
+----------------+------------------------+
| u1148          | labsdb1005.eqiad.wmnet |
| s53613         | labsdb1005.eqiad.wmnet |
| u18184         | labsdb1005.eqiad.wmnet |
| s53614         | labsdb1005.eqiad.wmnet |
+----------------+------------------------+
4 rows in set (0.03 sec)

The u1148 account is now blocking credential creation. This is another case of the "old user, new to toolforge" problem. I removed the existing account for this user on labsdb1005.eqiad.wmnet as well. This allowed maintain-dbusers to complete processing all pending accounts:

(labsdbaccounts@m5-master.eqiad.wmnet) [labsdbaccounts]> SELECT a.mysql_username, h.hostname from account a JOIN account_host h ON a.id=h.account_id where h.status = 'absent';
Empty set (0.01 sec)