Page MenuHomePhabricator

Investigate MariaDB grant issues with the btmwiki database on an-redacteddb1001
Closed, ResolvedPublic

Description

For the last two months we have had failures from sqoop relating to btmwiki.

This is somehow related to the grants on an-redacted1001

We need to fix this before the next scheduled monthly run of sqoop, which will occur on September 1st.

Related email threads:

Details

Due Date
Aug 29 2024, 11:00 PM

Event Timeline

BTullis triaged this task as High priority.Aug 7 2024, 4:39 PM
BTullis set Due Date to Aug 29 2024, 11:00 PM.

The work to set up the database was done in T368066: Prepare and check storage layer for btmwiki.

It's possible that something went wrong with the grants because at the time we were also carrying out: T365453: Bring an-redacteddb1001 into service to replace clouddb1021 and this had adversely affected some of the tooling used by Data-Persistence to manage grants etc.

The test case that I have to reproduce this error is:

sudo -u analytics bash

export PYTHONPATH=\${PYTHONPATH}:/srv/deployment/analytics/refinery/python

sqoop import -D "mapred.job.name='sqoop-mediawiki-btmwiki-T371991-btmwiki.archive'" -D mapreduce.job.queuename=production --username s53272 --password-file /user/analytics/mysql-analytics-labsdb-client-pw.txt --connect jdbc:mysql://an-redacteddb1001.eqiad.wmnet:3315/btmwiki_p?characterEncoding=UTF-8 --query '\n             select ar_id,\n                    ar_namespace,\n                    convert(ar_title using utf8mb4) ar_title,\n                    null ar_text,\n                    null ar_comment,\n                    null ar_user,\n                    null ar_user_text,\n                    convert(ar_timestamp using utf8mb4) ar_timestamp,\n                    ar_minor_edit,\n                    null ar_flags,\n                    ar_rev_id,\n                    null ar_text_id,\n                    ar_deleted,\n                    ar_len,\n                    ar_page_id,\n                    ar_parent_id,\n                    convert(ar_sha1 using utf8mb4) ar_sha1,\n                    null ar_content_model,\n                    null ar_content_format,\n                    ar_actor,\n                    ar_comment_id\n\n               from archive\n              where $CONDITIONS\n                \n        ' --target-dir /wmf/tmp/analytics/sqoop-mw/user/btullis/T371991/archive/snapshot01/wikidbbtmwiki --num-mappers 32 --as-avrodatafile --boundary-query '\n            SELECT MIN(ar_id),\n                   MAX(ar_id)\n              FROM archive\n             WHERE TRUE\n                 \n        ' --split-by ar_id --class-name archive --jar-file /tmp/sqoop-jars/2024-08-12T11:21:55/mediawiki-tables-sqoop-orm.jar --map-column-java "ar_actor=Long,ar_comment=String,ar_comment_id=Long,ar_content_format=String,ar_content_model=String,ar_deleted=Integer,ar_flags=String,ar_minor_edit=Boolean,ar_text=String,ar_user=Long,ar_user_text=String,ar_text_id=Long"

This results in an error from the sqoop output like this:

ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Access denied for user 's53272'@'%' to database 'btmwiki_p'

I logged in with that user using a mysql cli and I can see that the btmwiki_p does not show up when I issue the show databases command.

image.png (655×730 px, 75 KB)

However, a database that was created afterwards aewikimedia_p does show up.

The database exists in an-redacteddb1001:

root@an-redacteddb1001:s5[(none)]> SHOW DATABASES;
[...]
| btmwiki_p           |

But the grant on that db is missing from the labsdbuser role:

root@an-redacteddb1001:s5[(none)]> SHOW GRANTS FOR 'labsdbuser';

The same grant is present in clouddb1020:

root@clouddb1020:s5[(none)]> SHOW GRANTS FOR 'labsdbuser';
[...]
| GRANT SELECT, SHOW VIEW ON `btmwiki_p`.* TO `labsdbuser`           |

Thanks @fnegri

I have now run the following on an-redacteddb1001 on the s5 section:

oot@an-redacteddb1001:s5[(none)]> GRANT SELECT, SHOW VIEW ON `btmwiki_p`.* TO `labsdbuser`;
Query OK, 0 rows affected (0.004 sec)

root@an-redacteddb1001:s5[(none)]> flush privileges;
Query OK, 0 rows affected (0.212 sec)

I can confirm that the number of visible databases that the user can see has increased from 86 to 87 and that btmwiki_p is now shown.
I don't believe that any other databases are affected.

This should fix our issue, but it would still be good to find out where this grant is set during the process of adding a new wiki, so that we can understand why it didn't work in this case.
I'm sure that it probably happened during the work on T365453: Bring an-redacteddb1001 into service to replace clouddb1021.

We have found out where the grants are usually applied:
https://github.com/wikimedia/operations-puppet/blob/production/modules/profile/files/wmcs/db/wikireplicas/views/maintain-views.py#L469-L476

That is at the same time as the creation of the database. However, there is a conditional to skip both steps if the database already exists.

In this case, we don't know why the database might have existed but the grant didn't exist.
Given that the process has worked for subsequent new databases, I don't think it's worth exploring further.

We could potentially improve the script so that it checks for the correct labsdbuser grant as well, but I think we're unlikely to need it very much, given that this seems to have been a one-off incident, related to a hardware refresh and rename.
I'm happy to resolve the ticket.