Implement a system to automatically deploy schema changes without needing DBA intervention
Closed, DuplicatePublic
Actions

Assigned To

None

Authored By

	jcrespo
	Dec 18 2015, 11:31 AM

Description

Probably this will require an automatic pooling/depooling system, depending on rearchitecturing (T119626).

Related Objects

Mentioned In: T121335: Create oauth_registered_consumer.oarc_owner_only column on metawiki
Mentioned Here: T288235: Basic automation of schema changes
T119626: Eliminate SPOF at the main database infrastructure

Event Timeline

jcrespo created this task.Dec 18 2015, 11:31 AM

jcrespo raised the priority of this task from to Needs Triage.

jcrespo updated the task description. (Show Details)

jcrespo added projects: Epic, DBA, Release-Engineering-Team.

jcrespo subscribed.

Restricted Application added subscribers: StudiesWorld, Aklapper. · View Herald TranscriptDec 18 2015, 11:31 AM

jcrespo mentioned this in T121335: Create oauth_registered_consumer.oarc_owner_only column on metawiki.Dec 18 2015, 11:32 AM

Krenair subscribed.Dec 18 2015, 12:31 PM

Ricordisamoa subscribed.Dec 18 2015, 3:33 PM

mark subscribed.Dec 28 2015, 12:45 PM

hoo subscribed.Jan 20 2016, 2:43 PM

greg moved this task from INBOX to Epics (ARCHIVED) on the Release-Engineering-Team board.Mar 11 2016, 10:09 PM

jcrespo moved this task from Triage to Meta/Epic on the DBA board.Nov 10 2016, 12:05 PM

Now that we are getting closer to have a way to automatic depool/repool slaves, I would like to start a conversation about the future of this task.

Once we have active-active we can no longer deploy schema changes on the passive DC with replication, and instead we have to go host by host no matter what.
However, we do have to do it with replication for the sanitarium masters, specially for those changes that are likely to break replication with ROW based if not done in the correct other (adding/dropping columns, changing PKs etc).

The fact that we are starting to work towards having a database with all the hosts metadata (the new tables that live in tendril DB) makes things a bit easier for this in the future, as we can deploy the changes on a per server basis with a SQL query to select them.
However, by looking at the tables, we might need to tweak them a bit, as right now there is no way to identify what is a sanitarium master, and that'd be the only case in which the script shouldn't use set session sql_log_bin=0.

Also, ideally we should be able to use parameters as (examples of things I can think of):

--depool: if we want the slave to be depooled for the alter table (ie: a change that will generate lag, a change that will likely run into metadata locking issues). If --depool is used we should probably include a downtime by default. The slave should be automatically repooled once the alter is done and there is no lag.
This isn't trivial as if we are depooling an vslow host for instance, we'd need to repool another host in the meanwhile.

--no-master: if we don't want the masters to be altered, because it is an alter that will lock the table for some reason (ie: changing column data type).

--from-master: If there is a schema change that can be done directly on the primary master with replication (ie: such a tiny table or even empty that it can be done directly on the master and let it replicate). We don't have many of these, but worth considering.

• Phabricator_maintenance edited projects, added Release-Engineering-Team-TODO; removed Release-Engineering-Team.Jun 12 2019, 11:40 PM

• Phabricator_maintenance moved this task from Should be empty (use Release-Engineering-Team) to Epics on the Release-Engineering-Team-TODO board.Jun 12 2019, 11:41 PM

greg added a project: Release-Engineering-Team.Jun 21 2019, 10:35 PM

greg edited projects, added Release-Engineering-Team (Deployment services); removed Release-Engineering-Team.Aug 1 2019, 11:17 PM

jcrespo added a parent task: T246435: Create or improve a tool for monitoring or automating tasks for Wikimedia databases.Feb 28 2020, 10:58 AM

jcrespo removed a parent task: T246435: Create or improve a tool for monitoring or automating tasks for Wikimedia databases.May 6 2020, 7:32 AM

thcipriani removed a project: Release-Engineering-Team (Deployment services).Apr 20 2021, 1:10 AM

thcipriani edited projects, added Release-Engineering-Team (thcipriani-workboard-fiddling); removed Release-Engineering-Team-TODO.Apr 20 2021, 3:42 AM

thcipriani moved this task from thcipriani-workboard-fiddling to Seen (ARCHIVE) on the Release-Engineering-Team board.Apr 20 2021, 4:05 AM

thcipriani edited projects, added Release-Engineering-Team; removed Release-Engineering-Team (thcipriani-workboard-fiddling).

thcipriani edited projects, added Release-Engineering-Team (Seen); removed Release-Engineering-Team.Apr 20 2021, 3:23 PM