
Drop DB tables for now-deleted fixcopyrightwiki from production
Open, Medium, Public

Description

fixcopyrightwiki (on s3) has been dropped from appserver config and can now have its tables deleted. No back-up needed.

Event Timeline

Restricted Application added a subscriber: Aklapper. Feb 25 2020, 1:05 AM
Marostegui triaged this task as Medium priority. Feb 25 2020, 5:58 AM
Marostegui moved this task from Triage to Backlog on the DBA board.
Marostegui added a subscriber: Marostegui.

Let's truncate them, same as T227717#5806662

As this wiki has been removed everywhere, the check_private data alert fired to let us know there's "private" data on sanitarium hosts (and labs hosts), because this wiki doesn't show up on the dblists anymore.
We need to either exclude it from the check, or go ahead and drop it from the sanitarium master for s3 (db1112) with replication enabled.

Mentioned in SAL (#wikimedia-operations) [2020-03-04T13:14:09Z] <marostegui> Drop fixcopyrightwiki from sanitarium hosts (db1112, db2074) to avoid getting the data alert - T246055

To keep this alert from firing, I have dropped this database on db1112 and db2074 (sanitarium masters) with replication enabled, so it has also been dropped from the sanitarium and labs hosts.
I took a backup (1.6M) of its tables just in case, which is stored temporarily at:

root@cumin1001:/home/marostegui/T246055# ls -lh
total 3.1M
-rw-r--r-- 1 root root 1.6M Mar  4 13:17 codfw_fixcopyrightwiki.sql
-rw-r--r-- 1 root root 1.6M Mar  4 13:15 eqiad_fixcopyrightwiki.sql
Marostegui added a project: Data-Services. Edited Mar 9 2020, 8:10 AM
Marostegui added subscribers: JHedden, Bstorm, bd808.

@Bstorm @JHedden @bd808 can you please remove the views for this database? The database itself has already been deleted, but the views are still there and are therefore triggering the private data check.
I guess I can just issue a drop database fixcopyrightwiki_p directly, but I am wondering whether this would be better done via maintain-views?
Thanks!

bd808 moved this task from Backlog to Wiki replicas on the Data-Services board.
bd808 moved this task from Inbox to Clinic Duty on the cloud-services-team (Kanban) board.

@Marostegui maintain-views can clean up views individually, but it won't drop the DB in any case. May as well just drop it by hand.

Thanks Brooke, I will drop them manually then.

Mentioned in SAL (#wikimedia-operations) [2020-03-11T07:38:23Z] <marostegui> fixcopyrightwiki_p views from labs hosts T246055

Done from the wikireplicas

root@cumin1001:~# for i in labsdb1009 labsdb1010 labsdb1011 labsdb1012; do echo $i; mysql.py -h$i -e "show databases like 'fixcopyright%'";done
labsdb1009
labsdb1010
labsdb1011
labsdb1012

We should not get any more private data alerts.

Is there anything that still needs to be done on this task?

Everything :)
We have only removed the tables from the labs infra.
Dropping wikis isn't trivial.
We could truncate them, though.

@Jdforrester-WMF like we've done with some other wikis in the past, can we just truncate the tables and consider this done?

That's fine by me.
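
For reference, the truncation discussed above could be prepared roughly as follows. This is only a minimal sketch, not the procedure used in T227717: the helper name gen_truncate is hypothetical, and the table names in the example are just illustrative. It only prints TRUNCATE statements for review and does not connect to any database.

```shell
# Hypothetical helper (not an existing WMF tool): read table names from
# stdin, one per line, and print one TRUNCATE statement per table for the
# database given as the first argument. Nothing is executed against MySQL.
gen_truncate() {
  db="$1"
  while IFS= read -r tbl; do
    # Skip blank lines so they do not produce broken statements.
    if [ -n "$tbl" ]; then
      printf 'TRUNCATE TABLE `%s`.`%s`;\n' "$db" "$tbl"
    fi
  done
}

# Example call with illustrative table names:
#   printf 'page\nrevision\n' | gen_truncate fixcopyrightwiki
```

The printed statements can be read through before being fed to a MySQL client, which keeps the destructive step separate from building the table list.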

Bugreporter added a subscriber: Bugreporter. Edited Sep 10 2020, 3:14 PM

Compare:
(tasks where the tables were not dropped)
T169928: Evaluate how hard would be to get aa(wikibooks|wiktionary) and howiki databases deleted
T227717: Drop DB tables for now-deleted zerowiki from production
(tasks where the tables are to be dropped, but renamed first)
T260112: Remove muswiki and mhwiktionary from s3

Personally I support the latter: first rename the database of each deleted wiki, then delete it. (Reason: deleted wikis may contain an outdated database schema, which may cause issues if the database is somehow used elsewhere.)

We cannot rename a database; that's not supported by MySQL, unfortunately :-(

So we can rename all tables.
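
As a rough illustration of renaming all tables: MySQL's RENAME TABLE can move a table into another database on the same server, so a whole-database rename can be approximated table by table. The sketch below only prints the statements. The helper name gen_rename and the target database name fixcopyrightwiki_old are hypothetical, and the target database is assumed to exist already.

```shell
# Hypothetical helper (not an existing WMF tool): read table names from
# stdin, one per line, and print a RENAME TABLE statement that moves each
# table from the source database ($1) to the target database ($2).
# The target database is assumed to have been created beforehand.
gen_rename() {
  src="$1"
  dst="$2"
  while IFS= read -r tbl; do
    # Skip blank lines so they do not produce broken statements.
    if [ -n "$tbl" ]; then
      printf 'RENAME TABLE `%s`.`%s` TO `%s`.`%s`;\n' "$src" "$tbl" "$dst" "$tbl"
    fi
  done
}

# Example call with illustrative table names:
#   printf 'page\nuser\n' | gen_rename fixcopyrightwiki fixcopyrightwiki_old
```

Views cannot be moved between databases this way, so this only applies to base tables.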

@Bugreporter what's the benefit of renaming them if they need to be truncated anyway?