
DB backup plan for ptwikibooks Flow cleanup
Closed, Resolved · Public

Description

We plan to run a maintenance script soon to delete a batch of incorrectly imported topics on ptwikibooks (https://phabricator.wikimedia.org/T119509#2304693).

It has been tested on beta and locally, and I will do a dry run first and review the results.
But you never know...

I expect roughly 25k flow_workflow entries, 75k flow_revision entries, 50k flow_tree_node & flow_tree_revision entries, 190k flow_wiki_ref entries and 40k flow_ext_ref entries to be deleted (all of them in flowdb, on extension1). There should be no alterations in the ptwikibooks db.
The deletions will be batched, with a waitForSlaves call after every 10 topics.
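To illustrate the batching described above, here is a minimal Python sketch. It is not the actual maintenance script: `delete_in_batches`, `delete_topic` and `wait_for_replicas` are hypothetical names, and the two callbacks stand in for whatever the real script does to remove a topic and to wait for replication to catch up.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def delete_in_batches(
    topic_ids: Sequence[str],
    delete_topic: Callable[[str], None],      # hypothetical: removes one topic
    wait_for_replicas: Callable[[], None],    # hypothetical: blocks until replicas catch up
    batch_size: int = 10,
    dry_run: bool = True,
) -> int:
    """Process topics batch by batch, pausing for replication after each batch.

    In dry-run mode nothing is deleted and no wait is issued; the function
    only counts what would have been processed. Returns that count.
    """
    processed = 0
    for batch in batched(topic_ids, batch_size):
        for topic_id in batch:
            if not dry_run:
                delete_topic(topic_id)
            processed += 1
        if not dry_run:
            # One wait per batch keeps replication lag bounded.
            wait_for_replicas()
    return processed


if __name__ == "__main__":
    # Dry run over made-up topic ids: reports the count, touches nothing.
    sample = [f"topic-{i}" for i in range(25)]
    print(delete_in_batches(sample, print, lambda: None, dry_run=True))
```

Defaulting `dry_run` to `True` is a deliberate choice in this sketch, so that a real deletion has to be requested explicitly.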

What would be the best backup plan in case something goes wrong (for both databases, just to be sure)? mysqldump?
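As a rough idea of what the mysqldump option could look like, the sketch below only builds the command line and does not run it. The host name and output path are placeholders, the helper `mysqldump_command` is hypothetical, and the flag selection (`--single-transaction`, `--quick`, `--master-data=2`) is an assumption about what a consistent InnoDB dump with recorded binlog coordinates would need, not a description of how backups are actually taken here.

```python
import shlex
from typing import List, Sequence


def mysqldump_command(
    host: str,
    database: str,
    tables: Sequence[str],
    result_file: str,
) -> List[str]:
    """Build (but do not execute) a mysqldump invocation for selected tables."""
    cmd = [
        "mysqldump",
        f"--host={host}",
        "--single-transaction",  # consistent snapshot without locking InnoDB tables
        "--quick",               # stream rows instead of buffering whole tables
        "--master-data=2",       # record binlog coordinates as a comment in the dump
        f"--result-file={result_file}",
        database,
    ]
    cmd.extend(tables)
    return cmd


if __name__ == "__main__":
    # Table names come from the task description; host and path are placeholders.
    flow_tables = [
        "flow_workflow",
        "flow_revision",
        "flow_tree_node",
        "flow_tree_revision",
        "flow_wiki_ref",
        "flow_ext_ref",
    ]
    cmd = mysqldump_command("x1-host.example", "flowdb", flow_tables, "/tmp/flowdb.sql")
    print(shlex.join(cmd))
```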

Event Timeline

Restricted Application added subscribers: Zppix, Aklapper. · May 24 2016, 9:36 PM
jcrespo closed this task as Resolved. · May 26 2016, 9:02 AM
jcrespo claimed this task.
jcrespo added a subscriber: jcrespo.

I have just made a custom backup of flowdb with all tables on x1. That is in addition to the weekly backups that we do of all of x1. If a recovery is needed later, we can do point-in-time recovery with the binary logs, assuming the process does not take more than 15 days to happen.

We could do it faster, if it was needed, but I think a recovery time of ~1 hour should be enough for something you say likely will not happen.

Note that, in theory, we have 2 delayed slaves for x1, but because x1 has a lower write load, the delay doesn't work very well. We could, however, stop replication on a delayed slave before the maintenance to have a second option.

matthiasmullie added a comment. · Edited · May 27 2016, 12:08 AM

I think stopping replication is overkill (plus, there's more on x1, like Echo).

I'll probably request another custom backup at some point: we can't run the script yet (there are some additional steps to be taken, plus community feedback/approval).

Hey @jcrespo: we're ready to start running these scripts now.

Are we still in good shape to run this? I assume you may want to take another manual backup?
Both the ptwikibooks core db (page, revision, text, logging, etc. tables) and flowdb on ext1 will be affected.

Please add me to the relevant ticket, and I will update it there.