Page MenuHomePhabricator

Implement a system to automatically deploy schema changes without needing DBA intervention
Closed, DuplicatePublic

Description

Probably this will require an automatic pooling/depooling system, depending on rearchitecturing (T119626).

Event Timeline

jcrespo raised the priority of this task from to Needs Triage.
jcrespo updated the task description. (Show Details)
jcrespo subscribed.
Marostegui triaged this task as Medium priority.Aug 2 2018, 6:48 AM
Marostegui subscribed.

Now that we are getting closer to have a way to automatic depool/repool slaves, I would like to start a conversation about the future of this task.

Once we have active-active we can no longer deploy schema changes on the passive DC with replication, and instead we have to go host by host no matter what.
However, we do have to do it with replication for the sanitarium masters, specially for those changes that are likely to break replication with ROW based if not done in the correct other (adding/dropping columns, changing PKs etc).

The fact that we are starting to work towards having a database with all the hosts metadata (the new tables that live in tendril DB) makes things a bit easier for this in the future, as we can deploy the changes on a per server basis with a SQL query to select them.
However, by looking at the tables, we might need to tweak them a bit, as right now there is no way to identify what is a sanitarium master, and that'd be the only case in which the script shouldn't use set session sql_log_bin=0.

Also, ideally we should be able to use parameters as (examples of things I can think of):

--depool: if we want the slave to be depooled for the alter table (ie: a change that will generate lag, a change that will likely run into metadata locking issues). If --depool is used we should probably include a downtime by default. The slave should be automatically repooled once the alter is done and there is no lag.
This isn't trivial as if we are depooling an vslow host for instance, we'd need to repool another host in the meanwhile.

--no-master: if we don't want the masters to be altered, because it is an alter that will lock the table for some reason (ie: changing column data type).

--from-master: If there is a schema change that can be done directly on the primary master with replication (ie: such a tiny table or even empty that it can be done directly on the master and let it replicate). We don't have many of these, but worth considering.

Isn't this a duplicate of T288235: Basic automation of schema changes? Or that would be a part before moving this one?

It is essentially the same, I would merge both.

Given that the other had more concrete information and was better scoped. I merged this to the other.