As we prepare Dispatch further for production usage, we should consider a backup strategy. This task should define that strategy and track its implementation.
Description
Status | Subtype | Assigned | Task
---|---|---|---
Open | | None | T308467 implementing an incident response workflow automation tool for SRE
Declined | | None | T313228 Deploy Dispatch for SRE incident workflow automation
Declined | | fgiunchedi | T313229 Production Dispatch Infrastructure
Declined | | None | T326843 Dispatch Data protection (e.g. data exported and backed up)
Event Timeline
Could you remind me which persistence technology Dispatch uses? We should be ready, even if not perfectly, for the most common ones.
Sadly, the current PostgreSQL workflow is not supported by the Data Persistence team. We plan to support it, but we cannot at the moment (we don't have enough resources, and the current workflow is too faulty and likely to fail). That doesn't mean a recovery plan cannot be set up: there are Postgres services being backed up, and Puppet code for it, but it has to be entirely maintained by the individual service owners, as those services were warned that Data Persistence cannot take ownership of that at the moment (including backups). Data Persistence's role will start at the Bacula boundary: we will take any file and store it long term for you, but at the moment we cannot handle consistently exporting, monitoring, and recovering a Postgres database, as there are no tools already deployed at WMF that can do that with the expected reliability.
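To make the "Bacula boundary" concrete, here is a minimal sketch of the service-owner side: producing a dated dump file that a file-level backup could then pick up. It is not the existing Puppet code. The database name `dispatch`, the directory `/srv/postgres-dumps`, and both function names are hypothetical, and it assumes `pg_dump` is on the PATH and that the directory is included in whatever Bacula backs up.

```python
import datetime
import pathlib
import subprocess


def build_dump_command(dbname: str, out_dir: str, day: datetime.date) -> list[str]:
    """Build a pg_dump invocation that writes a custom-format dump file.

    The file name (<dbname>-<YYYY-MM-DD>.dump) is an illustrative convention,
    not something prescribed by Bacula or by WMF.
    """
    target = pathlib.PurePosixPath(out_dir) / f"{dbname}-{day.isoformat()}.dump"
    return [
        "pg_dump",
        "--format=custom",      # compressed archive, restorable with pg_restore
        f"--file={target}",     # write to a file instead of stdout
        f"--dbname={dbname}",
    ]


def run_dump(dbname: str, out_dir: str) -> None:
    """Run the export; raises CalledProcessError if pg_dump exits non-zero."""
    cmd = build_dump_command(dbname, out_dir, datetime.date.today())
    subprocess.run(cmd, check=True)
```

For example, `run_dump("dispatch", "/srv/postgres-dumps")` would be called from a timer owned by the service. Monitoring that the dump ran and testing recovery from it remain the service owner's responsibility and are not covered by this sketch.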
As Dispatch has been rejected as WMF's solution and is slated for decommissioning (T344937), I'm closing this.