
Implement a backup system for the data stored for Grant Metrics
Closed, Resolved · Public · 3 Estimated Story Points

Description

Value proposition (why do we need to do this?)

As a manager, I want to ensure that our users' data is not lost or corrupted. As we invest more in this application, we should be more rigorous about its operational practices.

Functionality/software changes
  • Create a cron job to dump the database to disk. It should run once a day, and the files should be stored on NFS for redundancy (a minimal sketch follows this list).
  • Optional: send an email or some other notice after each successful backup (depending on how often the backup is taken). Redirect the cron job's stdout to /dev/null so that only errors produce output.
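A minimal sketch of what this could look like, assuming a shell script on the tool account, a MariaDB database named grantmetrics, and credentials in ~/replica.my.cnf (the database name, host, and paths here are illustrative, not the actual configuration):

    #!/bin/bash
    # ~/backup.sh -- dump the Grant Metrics database to a dated, compressed file.
    set -euo pipefail
    backup_dir="$HOME/backups"
    mkdir -p "$backup_dir"
    mysqldump --defaults-file="$HOME/replica.my.cnf" \
              --host=tools.db.svc.eqiad.wmflabs grantmetrics \
        | gzip > "$backup_dir/grantmetrics-$(date +%Y%m%d).sql.gz"

with a crontab entry along the lines of:

    # Run daily at 02:00; stdout is discarded so only errors surface.
    0 2 * * * /bin/bash "$HOME/backup.sh" > /dev/null

On Toolforge the job would presumably be submitted through jsub rather than run straight from crontab, which changes the crontab line but not the script itself.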
User interface changes

None

QA/Testing
  • After deployment, ensure that the dump files are valid and that the cron job completes regularly. Restore the data to an empty DB to validate the dump (see the sketch below).
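A sketch of that validation step, reusing the illustrative names from the backup sketch above (the scratch database name is made up):

    # Create a scratch database and load the newest dump into it.
    mysql --defaults-file="$HOME/replica.my.cnf" --host=tools.db.svc.eqiad.wmflabs \
        -e "CREATE DATABASE grantmetrics_restore_test;"
    latest=$(ls -t "$HOME"/backups/grantmetrics-*.sql.gz | head -n 1)
    zcat "$latest" | mysql --defaults-file="$HOME/replica.my.cnf" \
        --host=tools.db.svc.eqiad.wmflabs grantmetrics_restore_test
    # Spot-check that the tables and row counts look sane.
    mysql --defaults-file="$HOME/replica.my.cnf" --host=tools.db.svc.eqiad.wmflabs \
        -e "SHOW TABLES;" grantmetrics_restore_test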

Event Timeline

aezell set the point value for this task to 3. Sep 5 2018, 11:08 PM
aezell added a subscriber: Niharika.

@Niharika Can you prioritize this one?

The first part of this (dumping the database daily to disk) is done. Documented here: https://wikitech.wikimedia.org/wiki/Tool:Grant_Metrics#Backups

What happens if the disk gets full? Is that possible? I don't know how storage is done in this environment.

Niharika triaged this task as Medium priority. Sep 7 2018, 3:27 PM

Are we also going to periodically delete old backups?

I was hoping that using logrotate like this would work, because it handles keeping a set number of copies of the dump and deletes old ones when they drop off the end. However:

    jsub: error: argument program: Program 'logrotate' not found.

So I guess that's out (at least until we move to a VPS). :(

As for disk space, we're talking about such small amounts of data that I don't think we'll run into a problem. The backup at the moment is 40KB.

I'll fix up the backup script to not use logrotate.

Okay, it now keeps 30 days of dumps and doesn't use logrotate (there's a sketch of the rotation after the list below). I've updated the docs. Still to do:

  • give developers an easy way to pull the latest backup into their local DB.
  • mail the ~/backup.err file (which is the stderr output of the script) to maintainers (at the moment, all we'll get is "job submitted successfully").
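The rotation without logrotate can be as simple as a find at the end of the backup script; this is only a sketch of the approach, not necessarily what the actual script does:

    # After a successful dump, prune backups older than 30 days.
    find "$HOME/backups" -name 'grantmetrics-*.sql.gz' -mtime +30 -delete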

The backup report email will now be sent every day, listing the existing backups and their sizes.
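Something along these lines would produce that report (the recipient address is a placeholder, and the exact mail mechanism available to the tool account may differ):

    # List the existing dumps with human-readable sizes and mail the listing to the maintainers.
    ls -lh "$HOME"/backups/grantmetrics-*.sql.gz \
        | mail -s "Grant Metrics backup report" maintainers@example.org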

I'm not quite sure how to handle copying the dumps, though. If we don't mind handing these out to developers, then we're assuming they're completely public info. Is that correct? Is there any worry with handing out participant lists? It seems like there might be, especially as things grow and we add other information.

What do you think of the idea of restoring to the staging site? Then it'd have real data in it, and would also verify the backups.

I think participant lists could potentially be seen as non-public but not private. They could potentially be posted on a wiki page, so it's definitely not private. That said, a relational data set that lets someone query "Show me all events AEzell has participated in" collates data in a way that a myriad of postings across wiki pages doesn't.

With that in mind, I think restoring to staging is a good idea, but putting this data on dev machines is something we'd want to consider only if we really need it. And if we do, we might want to scrub or anonymize some of the data.

Okay, it's now restoring to the staging DB (daily). We'll get emailed any stderr output.
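The restore step presumably looks something like the following, continuing with the illustrative names from above (the staging database name is a placeholder); stderr from it would end up in ~/backup.err:

    # Load the newest dump into the staging database.
    latest=$(ls -t "$HOME"/backups/grantmetrics-*.sql.gz | head -n 1)
    zcat "$latest" | mysql --defaults-file="$HOME/replica.my.cnf" \
        --host=tools.db.svc.eqiad.wmflabs grantmetrics_staging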

I think that's all there is to do here. Can we say we've got a backup system?

Niharika moved this task from QA to Q1 2018-19 on the Community-Tech-Sprint board.