Create backups of Wikimedia content in diverse geographic places
Open, Stalled, Low · Public

Description

Per discussions on Wikimedia-l, backups of Wikimedia content in diverse geographic places would reduce risks to Wikimedia content.

At least some of these backups shouldn't be under the control of the Wikimedia Foundation, and at least some of these backups should be offline backups.

Backups should include any software on which the Wikimedia content is dependent for archival, retrieval, or interactivity.

Event Timeline

Pine created this task. · Jan 28 2017, 8:51 AM
Restricted Application added a subscriber: Aklapper. · Jan 28 2017, 8:51 AM
Pine updated the task description. · Jan 28 2017, 8:52 AM
Pine added a subscriber: ArielGlenn.
Jane023 added a subscriber: Jane023.
Pine updated the task description. · Jan 28 2017, 8:57 AM
Pine updated the task description.
Qgil added a subscriber: Qgil.

I could not find a project tag directly related to this request, but I hope any of these groups has a better idea.

Qgil updated the task description. · Jan 28 2017, 9:58 AM
abian added a subscriber: abian.
Framawiki rescinded a token.
Framawiki awarded a token.
Framawiki added a subscriber: Framawiki.
abian added a comment. · Jan 28 2017, 1:05 PM

Personally, I would like Wikimedia chapters and the Wikimedia Foundation to cooperate more. In particular, some Wikimedia chapters have their own servers, which could host these backups. In return, the Wikimedia Foundation could also host some backups of chapters' contents. We are a single movement; we are "the same". Let's behave like that: let's share our resources and reduce our own efforts.

Apart from that, I propose that we collaborate with universities so that they can host backups beyond our control.

For both, we need specific software that has to be very easy to install and configure.

jeblad added a subscriber: jeblad. · Jan 28 2017, 7:34 PM

Keeping an offline copy should be an obvious duty for the larger chapters.

Mvolz added a subscriber: Mvolz. · Jan 29 2017, 3:26 PM
Joe added a subscriber: Joe. · Jan 30 2017, 7:39 AM
debt added a subscriber: debt. · Jan 30 2017, 5:34 PM
whym awarded a token. · Jan 31 2017, 8:42 AM
tstarling added a subscriber: tstarling. · Edited · Feb 3 2017, 2:18 AM

There are copies of the XML dumps on archive.org, but that's not really enough for disaster recovery. If we want to keep our editor community after a disaster then it's important to retain private user account information, especially email address and password.

We have Bacula, which provides geographically diverse backup storage by the usual definition, with copies in Texas and Virginia. But as far as I can tell, there are no storage locations outside the US or outside WMF's control. So that could perhaps be improved.

Setting up a Bacula storage server outside of WMF's control is not quite as crazy as it might sound -- the data is encrypted. WMF could respond to an existential threat by giving the key away.
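
As a rough illustration of the gap described above, the check below takes a list of storage sites and reports which diversity criteria are unmet. It is only a sketch: the `StorageSite` type, the `coverage_gaps` function, and the wording of the criteria are hypothetical and not part of Bacula or any WMF tooling. The two sites listed are the Texas and Virginia copies mentioned in this comment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageSite:
    # Hypothetical record describing one backup storage location.
    name: str
    country: str
    wmf_controlled: bool


def coverage_gaps(sites):
    """Return the diversity criteria that the given sites fail to meet."""
    gaps = []
    # Criterion 1: copies should exist in more than one country.
    if len({site.country for site in sites}) < 2:
        gaps.append("no copy outside a single country")
    # Criterion 2: at least one copy should be outside WMF's control.
    if all(site.wmf_controlled for site in sites):
        gaps.append("no copy outside WMF control")
    return gaps


# The two Bacula storage locations described in the comment above.
current = [
    StorageSite("Texas", "US", wmf_controlled=True),
    StorageSite("Virginia", "US", wmf_controlled=True),
]

for gap in coverage_gaps(current):
    print(gap)
```

Adding a single site in another country that is not WMF-controlled would clear both gaps in this sketch. The offline-backup requirement from the task description is not modelled here.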

faidon added a subscriber: faidon. · Feb 4 2017, 10:06 AM

We are working on our backup policy, but what is requested here (even the diverse geography, let alone the non-WMF-controlled storage) is much wider and more of a legal/policy matter than a technical one, so it's a little more complicated than just JFDI :) The tech is indeed mostly there (although note that Bacula alone wouldn't be enough). We are going to discuss it internally and update this task when we have more on the policy front.

elukey added a subscriber: elukey. · Feb 4 2017, 12:00 PM
Ottomata triaged this task as Low priority. · Mar 6 2017, 7:31 PM

> We are going to discuss it internally and update this task when we have more on the policy front.

How is this going? Does the change to 'low priority' reflect a policy decision? Regards, Richard.

Pine added a subscriber: Ottomata. · Edited · Mar 6 2017, 9:53 PM

Hi @Ottomata, I am not understanding why this would be low priority. Can you please explain?

Pine changed the task status from Open to Stalled. · Dec 28 2017, 12:34 AM

Can we get an update about the status of this project, please? I am marking this as "stalled" pending an update. Thanks.

> I am not understanding why this would be low priority. Can you please explain?

Because nobody works on this and there are no plans to work on this soon.

> Can we get an update about the status of this project, please?

No updates have been posted in this task, hence no updates have happened.

I've heard rumors (just rumors!) about backup improvements becoming a more prioritized project in the coming year, soooo stay tuned...?

Correct, that is the intention. I can confirm that this is the case and not just a rumour, but of course take it with a grain of salt, given that the annual plan hasn't even been drafted yet, let alone finalized.

Pine added a comment. · Jan 23 2018, 2:01 AM

Backup improvements in general would be welcome. Perhaps this should be a subtask of a larger "campaign" regarding backups.