Description
I think prod databases are backed up, so this should be too. Delayed replication and/or dumps?
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | yuvipanda | T105720 Labs team reliability goal for Q1 2015/16
Resolved | | Andrew | T105723 Eliminate SPOFs in Labs infrastructure
Resolved | | coren | T88234 Puppetize & fix tools-db
Duplicate | | coren | T88716 Make sure tools-db is backed up in some form
Event Timeline
(Presumably, back up offsite)
This requires one of two things: either we dump the database to labstore2001 (which also gets rsyncs of labstores) or we add a DB to codfw and slave it.
What is the purpose of the backups? If it is to guard against hardware failures and the like, I assume replication to another server that tools-db could be pointed at would be the best approach (and this task thus a duplicate of T88718).
Hinting that user databases are backed up will probably provoke support requests from someone who wants a table row restored in the form it had 39.437 days ago. In addition, I assume much (most?) of the data in the user databases is derived from replicated data and could thus be regenerated. So for "user database backups" in that sense, I would instead recommend advertising that users who need backups should set up a cron job that calls mysqldump tailored to their requirements.
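For illustration, a minimal sketch of such a self-service dump, assuming the tool's credentials live in the usual ~/replica.my.cnf; the database name, host alias, and output path are placeholders:

```
#!/bin/bash
# Illustrative self-service backup script, e.g. run nightly from the
# tool's crontab. The database name (s51234__mytool), the host alias,
# and the output path are placeholders; credentials come from the
# tool's ~/replica.my.cnf.
mysqldump --defaults-file="$HOME/replica.my.cnf" -h tools-db \
    --single-transaction s51234__mytool \
    | gzip > "$HOME/backups/mytool-$(date +%F).sql.gz"
```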
Once T88718 is finished (most of the work has been done already), backups can be taken from the slave consistently.
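One way that could look (a sketch only; host names and paths are made up) is to pause the slave's SQL thread so the dump reflects a single replication position:

```
# Sketch: consistent dump taken on the slave. Pausing the SQL thread
# freezes replication at one position for the duration of the dump;
# paths are illustrative.
mysql -e "STOP SLAVE SQL_THREAD;"
mysqldump --all-databases --single-transaction --events --routines \
    | gzip > /srv/backups/tools-db-$(date +%F).sql.gz
mysql -e "START SLAVE SQL_THREAD;"
```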
A slave replica will guard against:
- Hardware issues (e.g. secondary storage broken, server fried in general)
- Admin and security issues (data is rm'ed accidentally/by an attacker)
Pointing clients to the slave is trivial, and could even be done automatically.
A slave replica will not guard against:
- Software logic errors (a bad SQL command is executed and data is logically dropped, or MySQL has a bug that causes data loss).
For that, there are two options:
- Regular periodic backups, which allow going back by up to that period
- A delayed slave, where in the event of a bad SQL command there is a window of X hours before the slave executes it (see the sketch after this list)
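A sketch of how the delayed slave could be configured, assuming MySQL 5.6+ or MariaDB 10.2+ on the slave (older versions would need an external tool such as pt-slave-delay); the 24-hour delay is only an example:

```
# Sketch: make the slave apply events 24 hours behind the master.
# MASTER_DELAY is given in seconds and needs MySQL 5.6+ / MariaDB 10.2+.
mysql -e "STOP SLAVE;
          CHANGE MASTER TO MASTER_DELAY = 86400;
          START SLAVE;"
```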
The problem with backups on that host is that there are some largish databases that are derived directly from the MySQL production replicas and are not worth recovering. I would suggest doing a user poll / selecting specific databases to avoid duplicating 500 GB.
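A sketch of such a selective dump; the exclusion pattern is purely illustrative, as the real list of replica-derived databases would come from the poll:

```
# Sketch: dump user databases individually, skipping system schemas and
# (illustratively) anything matching a "derived from the replicas" pattern.
for db in $(mysql -N -e "SHOW DATABASES" \
            | grep -Ev '^(mysql|information_schema|performance_schema)$'); do
    case "$db" in
        *_copy|*_p) continue ;;  # illustrative exclusion pattern
    esac
    mysqldump --single-transaction "$db" \
        | gzip > "/srv/backups/${db}-$(date +%F).sql.gz"
done
```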
And I'd repeat my comment T88716#1181437:
> […]
> Hinting that user databases are backed up will probably provoke support requests from someone who wants a table row restored in the form it had 39.437 days ago. In addition, I assume much (most?) of the data in the user databases is derived from replicated data and could thus be regenerated. So for "user database backups" in that sense, I would instead recommend advertising that users who need backups should set up a cron job that calls mysqldump tailored to their requirements.
I can imagine scenarios where some form of point-in-time recovery is useful from a DBA perspective (imagine a maintenance script that is pointed at the wrong DB host and drops all databases), but IMHO users should be advised to mysqldump tables they think are important at intervals they deem appropriate. So if you think that replication guards against hardware issues and that a DB root accidentally dropping databases is unlikely enough, I would consider this task resolved (or a duplicate of T88718).
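If that kind of point-in-time recovery were ever wanted, it would typically mean restoring the last full dump and then replaying the binary logs up to just before the destructive statement; a sketch with purely illustrative file names and timestamps:

```
# Sketch of point-in-time recovery: restore the last full dump, then
# replay binary logs up to just before the bad statement. File names
# and the cut-off time are illustrative; in practice the replay would
# start from the binlog position recorded alongside the dump.
zcat /srv/backups/tools-db-2015-04-01.sql.gz | mysql
mysqlbinlog --stop-datetime="2015-04-02 11:59:00" \
    /srv/backups/binlog/mysql-bin.000123 /srv/backups/binlog/mysql-bin.000124 | mysql
```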
Just a comment: I would make sure that the lack of any recovery guarantee is announced to the labs list (and I intend to do so).