
Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts)
Closed, Resolved (Public)

Description

Hello @bd808

We saw this commit from you a few days ago: https://gerrit.wikimedia.org/r/#/c/360564/

Move ukwikimedia to deleted.dblist

This wiki has been hosted offsite for years, but was apparently never
placed in deleted.dblist. The site is actually redirected to
wikimedia.org.uk at the Apache redirects.conf config that is managed in
operations/puppet.git.

We have a process that checks for data that isn't supposed to be on labsdb servers (https://github.com/wikimedia/puppet/blob/8d9c58025be6b41211ec4c7ee7c590dbcc078a95/modules/role/files/mariadb/check_private_data.py). It alerted us that those databases are still there, and it reported:

-- Non-public databases that are present:
DROP DATABASE IF EXISTS `ukwikimedia`;
DROP DATABASE IF EXISTS `ukwikimedia_p`;

That is because of your commit, so we were wondering how to proceed with it. Shall we delete its view and the database from labs hosts?
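For readers unfamiliar with this kind of check, the report quoted above can be sketched roughly as follows. This is only an illustration and not the logic of check_private_data.py: the function name `drop_statements`, the sample database lists, and the assumption that each non-public database is checked together with its `_p` counterpart are all hypothetical.

```python
def drop_statements(present_dbs, private_dbs):
    """Return DROP statements for non-public databases found on a host.

    Assumption: a non-public wiki may be present both as the raw
    database and as its "_p" view database, so both names are checked.
    """
    present = set(present_dbs)
    statements = []
    for db in sorted(private_dbs):
        for name in (db, db + "_p"):
            if name in present:
                statements.append("DROP DATABASE IF EXISTS `{}`;".format(name))
    return statements


if __name__ == "__main__":
    # Hypothetical inputs: what the host has vs. what must not be there.
    present = ["enwiki", "enwiki_p", "ukwikimedia", "ukwikimedia_p"]
    private = ["ukwikimedia"]
    print("-- Non-public databases that are present:")
    for line in drop_statements(present, private):
        print(line)
```

In this sketch, adding a wiki to the non-public list is enough to make databases that were previously acceptable show up in the report, which matches what happened after the deleted.dblist change.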

Event Timeline

Framawiki renamed this task from ukwikimedia still present on labs hosts to ukwikimedia still present on replicas dbs on labs hosts. Jul 3 2017, 9:31 AM

@Marostegui I think it would be fine to drop the replica db copies of that database. As mentioned in the commit message that added it to deleted.dblist, ukwikimedia has been an external redirect since at least mid-2014.

jcrespo moved this task from Triage to Pending comment on the DBA board.

Thanks @bd808!
Can you clean up the views and I will take care of removing the db?

jcrespo renamed this task from ukwikimedia still present on replicas dbs on labs hosts to Drop ukwikimedia from labsdb hosts (was: ukwikimedia still present on replicas dbs on labs hosts). Jul 3 2017, 4:31 PM

Yes, we may need your help to update the meta database, maybe? I can take care of the actual data deletion.

Self-reminder, reload the replication filters, too, just in case.

bah. !log fail: [16:55] < bd808> !log Running maintain-views --all-databases --clean --replace-all --debug on labsdb1001

This took right at 3 hours to run and seemed to add quite a large number of missing views. It did not however seem to change anything about the ukwikimedia_p database that I can see. Limiting to ukwikimedia does not work:

$ sudo maintain-views --databases ukwikimedia --clean --replace-all --debug
2017-07-03 20:27:45,332 DEBUG Removing 1 dbs as sensitive
2017-07-03 20:27:45,332 ERROR None of the specified dbs are allowed

This is because ukwikimedia is now in deleted.dblist (which was the reason for this ticket). Is dropping an old db a common enough action that we should add a special mode for it in maintain-views? Or would it be sufficient and reasonable to just document on wikitech somewhere the manual steps needed to clean up after a wiki is decommissioned?
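The behaviour seen in the log above can be sketched as a filter step. This is a minimal illustration, not the maintain-views source: the function name `filter_allowed` and the choice of `ValueError` are assumptions, and only the two log messages are taken from the output quoted above.

```python
import logging


def filter_allowed(requested, sensitive):
    """Remove sensitive databases from the requested list.

    Hypothetical helper: raises ValueError when every requested
    database is sensitive, mirroring the ERROR line in the log.
    """
    sensitive = set(sensitive)
    removed = [db for db in requested if db in sensitive]
    allowed = [db for db in requested if db not in sensitive]
    logging.debug("Removing %d dbs as sensitive", len(removed))
    if not allowed:
        logging.error("None of the specified dbs are allowed")
        raise ValueError("None of the specified dbs are allowed")
    return allowed
```

With a filter like this, a database that has been moved to a sensitive list can never be selected, so its existing views can never be cleaned up by the same tool.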

seemed to add quite a large number of missing views

Not necessarily true. I used --replace-all.

The reason I suggested using the script is that it was pointed out to me that it could be easier than doing it manually. If the script still needs some tweaking, that is totally fine; we can just drop them in sanitarium and the labs hosts manually. It is not a big deal :)

Claiming this for the actual data drop bits. I will restart all sanitarium hosts' instances, which will create some temporary replication lag.

Actually, no reloading was needed. The replication filters don't change because we do not include deleted wikis in the list of "private" ones (maybe we should, to avoid replication problems), so there are no changes on the replicas. This is not a privacy problem: not only would the alarms go off again, but if writes were enabled again on that host, replication would break immediately.

Note labsdb1003 has some replication lag, so it may take a while to take effect there.

This was done at the same time as T169661.

Dropped everywhere on labs- only waiting for replication to catch up on labsdb1003, and checking with you if meta_p database needs updating or anything else (I think you had some ongoing conversation).

jcrespo triaged this task as Medium priority. Jul 4 2017, 5:13 PM
jcrespo moved this task from Pending comment to Done on the DBA board.

Yes, labsdb1003 is delayed due to a big alter table going on, which I think will be finished in the next 24h more or less.
Thanks for taking care of this Jaime!

The reason I suggested using the script is that it was pointed out to me that it could be easier than doing it manually. If the script still needs some tweaking, that is totally fine; we can just drop them in sanitarium and the labs hosts manually. It is not a big deal :)

Thanks @Marostegui, sorry that didn't work out. It seems like we only thought about cleaning up specific table views and not entire DBs at the time. Should we rethink that? How common is dropping a wiki DB?

It is not super common, but if a wiki is moved to deleted.dblist, then our check_private_data script will complain and we will need to delete it. As I said, it is not a big deal if we cannot delete it with the script; we can do it manually.
Eventually, it would be nice to get it integrated in the script just for the sake of having it, but I don't think it should be a priority :-)

I think the only thing pending from this task is what Jaime asked about the meta_p database.

bd808 assigned this task to jcrespo.
bd808 edited projects, added Data-Services; removed Cloud-Services.

I ran sudo /usr/local/sbin/maintain-meta_p --all-databases --debug on:

  • labsdb1001
  • labsdb1003
  • labsdb1009
  • labsdb1010
  • labsdb1011

Our sanitarium host (db1069) got replication broken with:

Error 'Table 'ukwikimedia.site_stats' doesn't exist' on query. Default database: 'ukwikimedia'. Query: 'UPDATE /* SiteStatsUpdate::cacheUpdate www-data@terbiu... */  `site_stats` SET ss_active_users = '1' WHERE ss_row_id = '1''

I have skipped it, but if that wiki was moved to deleted, how is it getting writes?

CC @bd808 ^ Maybe maintenance was not updated, but something else may be wrong there.

Change 363640 had a related patch set uploaded (by BryanDavis; owner: Bryan Davis):
[operations/mediawiki-config@master] Remove ukwikimedia from config

https://gerrit.wikimedia.org/r/363640

When I added ukwikimedia to deleted.dblist I didn't remove it from all of the other dblist files and the rest of wm-config. https://gerrit.wikimedia.org/r/#/c/363640/ should take care of things so that no other maintenance scripts running from cron touch it. Sorry for not doing this properly the first time around.

Change 363640 merged by jenkins-bot:
[operations/mediawiki-config@master] Remove ukwikimedia from config

https://gerrit.wikimedia.org/r/363640

bd808 moved this task from Inbox to Done on the cloud-services-team (Kanban) board.

The config change was synced out. ukwikimedia is no longer in any of the dblist files other than deleted.dblist. This should keep any maintenance crons or manual foreachwiki runs from touching it.

Hi,

The script that checks for data not supposed to be on labs hosts reported that ukwikimedia_p view is still present on labs hosts.
I checked and it is indeed present on all the hosts:

  • labsdb1001
  • labsdb1003
  • labsdb1009
  • labsdb1010
  • labsdb1011

The database itself doesn't exist, but the view does.

Did the views come back after @jcrespo dropped them or were the backing tables dropped but not the views?

As I noted in T169488#3402456 our existing maintain-views script doesn't support cleaning up views once a db has been put into one of the "sensitive db lists". This either needs to be handled manually by someone who actually has access to privileged database passwords (I do not) or we need to make some code changes or write a new script to handle this type of cleanup.

If we changed the maintain-views code I think that adding a --drop-sensitive mode would probably be the easiest way to go. This mode would be similar to the current --all-databases + --clean options, but would create a list of only sensitive dbs and drop all of their views that exist.
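A minimal sketch of what such a mode might compute is shown below. It is an assumption-laden illustration, not a patch for maintain-views: the function name `plan_drop_sensitive`, the `existing_views` mapping, and the convention that views for a wiki live in a `<dbname>_p` database are hypothetical inputs chosen for the example.

```python
def plan_drop_sensitive(sensitive_dbs, existing_views):
    """Build DROP VIEW statements for views of sensitive databases only.

    existing_views is assumed to map a view database name
    (for example "ukwikimedia_p") to the view names found in it.
    Non-sensitive databases are never touched.
    """
    statements = []
    for db in sorted(sensitive_dbs):
        view_db = db + "_p"
        for view in sorted(existing_views.get(view_db, [])):
            statements.append(
                "DROP VIEW IF EXISTS `{}`.`{}`;".format(view_db, view)
            )
    return statements
```

Generating the statements separately from executing them would let someone with the privileged credentials review the plan before it runs.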

I think Jaime dropped the databases and not the views. If that is the case, then that is why they are still there. @jcrespo can you confirm you only dropped the databases?
If so, I will drop the views and get this over with :-)

Yes, I only dropped the databases- I thought the script run took care of the views before they were dropped. We may need to drop the database from the role labsdbuser, too.

Mentioned in SAL (#wikimedia-operations) [2017-07-10T15:14:46Z] <marostegui> Drop ukwikimedia_p views from labsdb hosts - T169488

I have dropped the views from all the hosts.
The role doesn't need to be updated, as it looks like it automatically removes the databases if they do not exist:

mysql:root@localhost [(none)]> SET ROLE labsdbuser;
Query OK, 0 rows affected (0.00 sec)
mysql:root@localhost [(none)]> show databases like '%ukwikimedia%';
Empty set (0.00 sec)
root@labsdb1009[(none)]> pager grep ukwikimedia
PAGER set to 'grep ukwikimedia'
root@labsdb1009[(none)]> SHOW GRANTS FOR labsdbuser;
| GRANT SELECT, SHOW VIEW ON `ukwikimedia\_p`.* TO 'labsdbuser'            |
842 rows in set (0.00 sec)

Looks like I misinterpreted the set role function :)