Page MenuHomePhabricator

Drop DB tables for now-deleted zerowiki from production
Closed, ResolvedPublic

Description

zerowiki (on s3) has been dropped from appserver config and can now have its tables deleted. (A backup of the tables just in case would be great.)

Related Objects

View Standalone Graph
This task is connected to more than 200 other tasks. Only direct parents and subtasks are shown here. Use View Standalone Graph to show more of the graph.

Event Timeline

jcrespo moved this task from Triage to Backlog on the DBA board.
jcrespo added subscribers: Marostegui, jcrespo.

Needs some research to see if it is safe. Low priority because it shouldn't block any other task.

Privacy review

Database tables:

MariaDB [zerowiki]> show tables;
+--------------------------+
| Tables_in_zerowiki       |
+--------------------------+
| abuse_filter             |
| abuse_filter_action      |
| abuse_filter_history     |
| abuse_filter_log         |
| actor                    |
| archive                  |
| babel                    |
| betafeatures_user_counts |
| bot_passwords            |
| bv2015_edits             |
| bv2017_edits             |
| category                 |
| categorylinks            |
| change_tag               |
| change_tag_def           |
| comment                  |
| content                  |
| content_models           |
| cu_changes               |
| cu_log                   |
| externallinks            |
| filearchive              |
| filejournal              |
| geo_tags                 |
| global_block_whitelist   |
| globalblocks             |
| hidden                   |
| image                    |
| imagelinks               |
| interwiki                |
| ip_changes               |
| ipblocks                 |
| ipblocks_restrictions    |
| iwlinks                  |
| job                      |
| l10n_cache               |
| langlinks                |
| linter                   |
| log_search               |
| logging                  |
| math                     |
| mathoid                  |
| module_deps              |
| oathauth_users           |
| objectcache              |
| oldimage                 |
| page                     |
| page_props               |
| page_restrictions        |
| pagelinks                |
| protected_titles         |
| querycache               |
| querycache_info          |
| querycachetwo            |
| recentchanges            |
| redirect                 |
| revision                 |
| revision_actor_temp      |
| revision_comment_temp    |
| searchindex              |
| securepoll_cookie_match  |
| securepoll_elections     |
| securepoll_entity        |
| securepoll_lists         |
| securepoll_msgs          |
| securepoll_options       |
| securepoll_properties    |
| securepoll_questions     |
| securepoll_strike        |
| securepoll_voters        |
| securepoll_votes         |
| site_identifiers         |
| site_stats               |
| sites                    |
| slot_roles               |
| slots                    |
| spoofuser                |
| templatelinks            |
| text                     |
| transcache               |
| transcode                |
| updatelog                |
| uploadstash              |
| user                     |
| user_former_groups       |
| user_groups              |
| user_newtalk             |
| user_properties          |
| watchlist                |
+--------------------------+
89 rows in set (0.00 sec)

The main places where data we can't keep for more than 90 days is stored are cu_*, recentchanges and abuse_*. All these tables are already empty. Data in ip_changes is not private but only has 2 edits by localhost made during installation anyway:

MariaDB [zerowiki]> select * from ip_changes;
+------------+-------------------+----------+
| ipc_rev_id | ipc_rev_timestamp | ipc_hex  |
+------------+-------------------+----------+
|          1 | 20140401184017    | 7F000001 |
|          2 | 20140401184018    | 7F000001 |
+------------+-------------------+----------+
2 rows in set (0.00 sec)

Unless someone knows something else, I'd say it's safe to store a backup of this database for a long time.

I'd say it's safe to store a backup

You probably weren't answering me, but safe on my previous comment was meant not to the backup itself, but to the drop. Inter-wiki dependencies (commons, wikidata, interwikis, etc. are known to create outages on other wikis in the past, and there is no existing procedure in place for that. The drop & archive is trivial. Doing it without affecting other wikis may be more complicated- e.g. random extension that setups a job or a cron, or cross-dc queries, or authentication strings related to centralauth that breaks on all wikis because one is "missing"; that is why it needs some careful procedure.

In fact, https://wikitech.wikimedia.org/wiki/Delete_a_wiki says:

When we delete a wiki, the steps outlined below need to be performed, which makes MediaWiki claim that the wiki does not exist. Actually deleting the data from all of the database servers would be quite a bit more complicated, and is usually not worth doing.

But there is always a first, if someone volunteers to help! :-D

Perhaps a compromise would be to "backup and truncate" instead of "backup and drop". That would mean any attempted query would still superficially pass. One could then also revoke the write permission for that db from the MW user to ensure we get hard failures for any new information attempted to be inserted by some means.

Actually, db grants might also be a good first step from the other direction – instead of truncating, revoke both the read and write grant from the MW user. That step would be easy to reverse, but if without issue for weeks, could then lead to a proper drop without much risk.

Actually, db grants might also be a good first step from the other direction – instead of truncating, revoke both the read and write grant from the MW user. That step would be easy to reverse, but if without issue for weeks, could then lead to a proper drop without much risk.

This might be hard to do, as the way we have grants wouldn't allow us to revoke grants from an specific wiki, as we grant them for 'wik%', so we'd need to basically change them add all the wikis individually and then later, remove them from this specific wiki.

That's way easier then. We should still get a full logical dump before attempting to do so.

@jcrespo can we take a logical backup from this specific wiki so I can truncate its tables?
The wiki is super small (62M on disk)

I have created both a mydumper and a mysqldump exports of zerowiki. It was anyway being backed up regularly. You can see them at:

  • dbprov1002:/srv/backups/dumps/latest/dump.zerowiki.2020-08-05--09-08-17 (will be copied to bacula)
  • cumin1001:/home/jynus/zerowiki.sql.gz (taken directly from primary db)

Thank you!
@Jdforrester-WMF I can proceed with the truncate now. However, I do have a last question, do we have to truncate also the external store (the content) tables too or we are ok with just truncating the metadata?

And same question goes for x1 tables?

And same question goes for x1 tables?

Disregard that, there are no tables for this wiki there.

Thank you!
@Jdforrester-WMF I can proceed with the truncate now. However, I do have a last question, do we have to truncate also the external store (the content) tables too or we are ok with just truncating the metadata?

I think the orphaned ES records will be just ignored, right? That's fine from my end, yes.

Marostegui moved this task from Backlog to In progress on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2020-08-06T06:57:04Z] <marostegui> Truncate tables on zerowiki T227717

All tables truncated:

abuse_filter
+----------+
| count(*) |
+----------+
|        0 |
+----------+
abuse_filter_action
+----------+
| count(*) |
+----------+
|        0 |
+----------+
abuse_filter_history
+----------+
| count(*) |
+----------+
|        0 |
+----------+
abuse_filter_log
+----------+
| count(*) |
+----------+
|        0 |
+----------+
actor
+----------+
| count(*) |
+----------+
|        0 |
+----------+
archive
+----------+
| count(*) |
+----------+
|        0 |
+----------+
babel
+----------+
| count(*) |
+----------+
|        0 |
+----------+
betafeatures_user_counts
+----------+
| count(*) |
+----------+
|        0 |
+----------+
bot_passwords
+----------+
| count(*) |
+----------+
|        0 |
+----------+
bv2015_edits
+----------+
| count(*) |
+----------+
|        0 |
+----------+
bv2017_edits
+----------+
| count(*) |
+----------+
|        0 |
+----------+
category
+----------+
| count(*) |
+----------+
|        0 |
+----------+
categorylinks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
change_tag
+----------+
| count(*) |
+----------+
|        0 |
+----------+
change_tag_def
+----------+
| count(*) |
+----------+
|        0 |
+----------+
comment
+----------+
| count(*) |
+----------+
|        0 |
+----------+
content
+----------+
| count(*) |
+----------+
|        0 |
+----------+
content_models
+----------+
| count(*) |
+----------+
|        0 |
+----------+
cu_changes
+----------+
| count(*) |
+----------+
|        0 |
+----------+
cu_log
+----------+
| count(*) |
+----------+
|        0 |
+----------+
externallinks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
filearchive
+----------+
| count(*) |
+----------+
|        0 |
+----------+
geo_tags
+----------+
| count(*) |
+----------+
|        0 |
+----------+
global_block_whitelist
+----------+
| count(*) |
+----------+
|        0 |
+----------+
hidden
+----------+
| count(*) |
+----------+
|        0 |
+----------+
image
+----------+
| count(*) |
+----------+
|        0 |
+----------+
imagelinks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
interwiki
+----------+
| count(*) |
+----------+
|        0 |
+----------+
ip_changes
+----------+
| count(*) |
+----------+
|        0 |
+----------+
ipblocks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
ipblocks_restrictions
+----------+
| count(*) |
+----------+
|        0 |
+----------+
iwlinks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
job
+----------+
| count(*) |
+----------+
|        0 |
+----------+
l10n_cache
+----------+
| count(*) |
+----------+
|        0 |
+----------+
langlinks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
linter
+----------+
| count(*) |
+----------+
|        0 |
+----------+
log_search
+----------+
| count(*) |
+----------+
|        0 |
+----------+
logging
+----------+
| count(*) |
+----------+
|        0 |
+----------+
mathoid
+----------+
| count(*) |
+----------+
|        0 |
+----------+
module_deps
+----------+
| count(*) |
+----------+
|        0 |
+----------+
oathauth_users
+----------+
| count(*) |
+----------+
|        0 |
+----------+
objectcache
+----------+
| count(*) |
+----------+
|        0 |
+----------+
oldimage
+----------+
| count(*) |
+----------+
|        0 |
+----------+
page
+----------+
| count(*) |
+----------+
|        0 |
+----------+
page_props
+----------+
| count(*) |
+----------+
|        0 |
+----------+
page_restrictions
+----------+
| count(*) |
+----------+
|        0 |
+----------+
pagelinks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
protected_titles
+----------+
| count(*) |
+----------+
|        0 |
+----------+
querycache
+----------+
| count(*) |
+----------+
|        0 |
+----------+
querycache_info
+----------+
| count(*) |
+----------+
|        0 |
+----------+
querycachetwo
+----------+
| count(*) |
+----------+
|        0 |
+----------+
recentchanges
+----------+
| count(*) |
+----------+
|        0 |
+----------+
redirect
+----------+
| count(*) |
+----------+
|        0 |
+----------+
revision
+----------+
| count(*) |
+----------+
|        0 |
+----------+
revision_actor_temp
+----------+
| count(*) |
+----------+
|        0 |
+----------+
revision_comment_temp
+----------+
| count(*) |
+----------+
|        0 |
+----------+
searchindex
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_cookie_match
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_elections
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_entity
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_lists
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_msgs
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_options
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_properties
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_questions
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_strike
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_voters
+----------+
| count(*) |
+----------+
|        0 |
+----------+
securepoll_votes
+----------+
| count(*) |
+----------+
|        0 |
+----------+
site_identifiers
+----------+
| count(*) |
+----------+
|        0 |
+----------+
site_stats
+----------+
| count(*) |
+----------+
|        0 |
+----------+
sites
+----------+
| count(*) |
+----------+
|        0 |
+----------+
slot_roles
+----------+
| count(*) |
+----------+
|        0 |
+----------+
slots
+----------+
| count(*) |
+----------+
|        0 |
+----------+
spoofuser
+----------+
| count(*) |
+----------+
|        0 |
+----------+
templatelinks
+----------+
| count(*) |
+----------+
|        0 |
+----------+
text
+----------+
| count(*) |
+----------+
|        0 |
+----------+
transcache
+----------+
| count(*) |
+----------+
|        0 |
+----------+
transcode
+----------+
| count(*) |
+----------+
|        0 |
+----------+
updatelog
+----------+
| count(*) |
+----------+
|        0 |
+----------+
uploadstash
+----------+
| count(*) |
+----------+
|        0 |
+----------+
user
+----------+
| count(*) |
+----------+
|        0 |
+----------+
user_former_groups
+----------+
| count(*) |
+----------+
|        0 |
+----------+
user_groups
+----------+
| count(*) |
+----------+
|        0 |
+----------+
user_newtalk
+----------+
| count(*) |
+----------+
|        0 |
+----------+
user_properties
+----------+
| count(*) |
+----------+
|        0 |
+----------+
watchlist
+----------+
| count(*) |
+----------+
|        0 |
+----------+