
Database post-switchover tasks March 2024
Closed, Resolved (Public)

Description

Stop and disconnect replication codfw -> eqiad
To be done: TBD

  • s1
  • s2
  • s3
  • s4
  • s5
  • s6
  • s7
  • s8
  • x1
  • es4
  • es5

* pc and x2 hosts are excluded because we keep them in multi-master replication at all times.
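
The per-section step above can be sketched as a small plan generator. This is only an illustration, not the procedure actually used on this task: it assumes MariaDB syntax (`STOP SLAVE`, `RESET SLAVE ALL`) and assumes the statements are run on the host that receives the codfw -> eqiad stream. The names `SECTIONS`, `disconnect_statements` and `build_plan` are hypothetical.

```python
# Sections listed in the task description.
# pc and x2 are deliberately absent: they stay multi-master at all times.
SECTIONS = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "x1", "es4", "es5"]


def disconnect_statements():
    """Return SQL that stops replication and forgets the upstream.

    Assumption: MariaDB syntax. RESET SLAVE ALL also discards the stored
    connection parameters, so the link cannot be restarted by accident.
    """
    return ["STOP SLAVE;", "RESET SLAVE ALL;"]


def build_plan(sections):
    """Map each section name to the statements to run for that section."""
    return {section: disconnect_statements() for section in sections}


if __name__ == "__main__":
    for section, statements in build_plan(SECTIONS).items():
        print(f"-- {section}")
        for statement in statements:
            print(statement)
```

Generating the plan from one list keeps the checklist and the statements in sync, and makes the pc/x2 exclusion explicit in a single place.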

Enable GTID on codfw masters

  • s1
  • s2
  • s3
  • s4
  • s5
  • s6
  • s7
  • s8
  • x1
  • es4
  • es5
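
As a rough sketch of what enabling GTID on a replicating master could look like, the snippet below builds the statements and checks the result. It assumes MariaDB, where `CHANGE MASTER TO MASTER_USE_GTID=...` switches the replication mode and `SHOW SLAVE STATUS` reports it in the `Using_Gtid` field. The task does not say which GTID mode is used, so `slave_pos` is an assumption, and the function names are hypothetical.

```python
def enable_gtid_statements(mode="slave_pos"):
    """Return SQL that switches an existing replication link to GTID.

    Assumption: MariaDB syntax; the mode (slave_pos) is not stated in the task.
    Replication must be stopped while the setting is changed.
    """
    return [
        "STOP SLAVE;",
        f"CHANGE MASTER TO MASTER_USE_GTID={mode};",
        "START SLAVE;",
    ]


def gtid_enabled(slave_status):
    """Check one SHOW SLAVE STATUS row, given as a dict of column -> value.

    MariaDB reports Using_Gtid as 'No', 'Slave_Pos' or 'Current_Pos'.
    A missing column is treated as GTID being disabled.
    """
    return slave_status.get("Using_Gtid", "No") != "No"


if __name__ == "__main__":
    for statement in enable_gtid_statements():
        print(statement)
```

Checking `Using_Gtid` after the change on each section is a cheap way to confirm that none of the listed masters was skipped.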

Event Timeline

Marostegui changed the task status from Open to Stalled. Feb 22 2024, 10:49 AM
Marostegui triaged this task as Medium priority.
Marostegui moved this task from Triage to Blocked on the DBA board.

Blocked until the DC switch is done

Marostegui changed the task status from Stalled to Open. Wed, Mar 20, 3:07 PM
Marostegui claimed this task.
Marostegui moved this task from Blocked to In progress on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:10:45Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on 16 hosts with reason: Remove circular replication in x1 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:11:10Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 16 hosts with reason: Remove circular replication in x1 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:49:17Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Remove circular replication in es5 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:49:23Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Remove circular replication in es5 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:50:48Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Remove circular replication in es4 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:51:05Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Remove circular replication in es4 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:53:11Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on 27 hosts with reason: Remove circular replication in s6 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:53:34Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 27 hosts with reason: Remove circular replication in s6 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:54:29Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on 29 hosts with reason: Remove circular replication in s2 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:54:54Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 29 hosts with reason: Remove circular replication in s2 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:56:05Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on 26 hosts with reason: Remove circular replication in s3 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:56:28Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 26 hosts with reason: Remove circular replication in s3 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:58:52Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 on 31 hosts with reason: Remove circular replication in s7 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:59:20Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 31 hosts with reason: Remove circular replication in s7 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:01:21Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:10:00 on 27 hosts with reason: Remove circular replication in s5 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:01:46Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 27 hosts with reason: Remove circular replication in s5 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:03:40Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:10:00 on 34 hosts with reason: Remove circular replication in s8 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:04:06Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 34 hosts with reason: Remove circular replication in s8 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:05:02Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:10:00 on 36 hosts with reason: Remove circular replication in s4 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:05:33Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 36 hosts with reason: Remove circular replication in s4 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:07:06Z] <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:10:00 on 37 hosts with reason: Remove circular replication in s1 T358200

Mentioned in SAL (#wikimedia-operations) [2024-03-20T16:07:39Z] <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on 37 hosts with reason: Remove circular replication in s1 T358200
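
The SAL entries above follow a regular shape, so coverage can be cross-checked against the section list mechanically. The following is a minimal sketch, assuming the entry format stays exactly as shown in this timeline; the regular expression and the names `SAL_ENTRY`, `SAMPLE` and `sections_passed` are illustrative and not part of any existing tool.

```python
import re

# Matches the downtime cookbook entries as they appear in this timeline.
SAL_ENTRY = re.compile(
    r"(?P<phase>START|END \((?P<result>\w+)\)) - Cookbook sre\.hosts\.downtime"
    r".*? on (?P<hosts>\d+) hosts with reason: "
    r"Remove circular replication in (?P<section>\w+)"
)

# Two entries copied from the timeline (x1).
SAMPLE = [
    "Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:10:45Z] "
    "<marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 0:30:00 "
    "on 16 hosts with reason: Remove circular replication in x1 T358200",
    "Mentioned in SAL (#wikimedia-operations) [2024-03-20T15:11:10Z] "
    "<marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime "
    "(exit_code=0) for 0:30:00 on 16 hosts with reason: "
    "Remove circular replication in x1 T358200",
]


def sections_passed(lines):
    """Return sections that have both a START and an END (PASS) entry."""
    started, passed = set(), set()
    for line in lines:
        match = SAL_ENTRY.search(line)
        if match is None:
            continue  # not a downtime entry for this task
        section = match.group("section")
        if match.group("phase") == "START":
            started.add(section)
        elif match.group("result") == "PASS":
            passed.add(section)
    return started & passed


if __name__ == "__main__":
    print(sorted(sections_passed(SAMPLE)))
```

Comparing the returned set with the sections in the task description shows at a glance whether any section is missing a completed downtime.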

Marostegui updated the task description.

This is all done