T266663 Disconnect codfw -> eqiad replication

Marostegui created this task. · Oct 28 2020, 1:00 PM

Restricted Application added a subscriber: Aklapper. · Oct 28 2020, 1:00 PM

Not before Thursday 29th Oct 2020

Mentioned in SAL (#wikimedia-operations) [2020-10-29T05:58:27Z] <marostegui> Disconnect replication codfw -> eqiad on pc1, pc2 and pc3 T266663

Marostegui updated the task description. · Oct 29 2020, 6:00 AM

Mentioned in SAL (#wikimedia-operations) [2020-10-29T06:07:24Z] <marostegui> Disconnect replication codfw -> eqiad on x1 T266663

Marostegui updated the task description. · Oct 29 2020, 6:07 AM

Mentioned in SAL (#wikimedia-operations) [2020-10-29T06:10:53Z] <marostegui> Disconnect replication codfw -> eqiad on es4 and es5 T266663

Marostegui updated the task description. · Oct 29 2020, 6:11 AM

Marostegui claimed this task. · Oct 29 2020, 6:18 AM

Marostegui moved this task from Blocked to In progress on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2020-10-29T06:23:25Z] <marostegui> Disconnect replication codfw -> eqiad on s5 T266663

Marostegui updated the task description. · Oct 29 2020, 6:23 AM

Marostegui updated the task description.

Mentioned in SAL (#wikimedia-operations) [2020-10-29T06:36:08Z] <marostegui> Disconnect replication codfw -> eqiad on s6 T266663

Marostegui updated the task description. · Oct 29 2020, 6:36 AM

Mentioned in SAL (#wikimedia-operations) [2020-10-29T06:38:17Z] <marostegui> Disconnect replication codfw -> eqiad on s7 T266663

Marostegui updated the task description. · Oct 29 2020, 6:38 AM

Marostegui updated the task description. · Oct 29 2020, 6:45 AM

Marostegui updated the task description. · Oct 29 2020, 6:52 AM

Mentioned in SAL (#wikimedia-operations) [2020-10-29T06:52:34Z] <marostegui> Disconnect replication codfw -> eqiad on s2 T266663

Mentioned in SAL (#wikimedia-operations) [2020-10-29T07:46:11Z] <marostegui> Disconnect replication codfw -> eqiad on s3 T266663

Marostegui updated the task description. · Oct 29 2020, 7:46 AM

Marostegui updated the task description. · Oct 29 2020, 7:49 AM

Mentioned in SAL (#wikimedia-operations) [2020-10-29T07:54:39Z] <marostegui> Disconnect replication codfw -> eqiad on s4 T266663

Marostegui updated the task description. · Oct 29 2020, 7:54 AM

Mentioned in SAL (#wikimedia-operations) [2020-10-29T08:02:51Z] <marostegui> Disconnect replication codfw -> eqiad on s1 T266663

Marostegui updated the task description. · Oct 29 2020, 8:02 AM

Marostegui updated the task description. · Oct 29 2020, 8:06 AM

GTID is enabled on all codfw masters:

sudo cumin "P{P:mariadb::mysql_role%role = master and *.codfw.wmnet}" 'mysql -e "show slave status\G" | grep Using'
15 hosts will be targeted:
db[2079,2090,2096,2105,2107,2112,2118,2123,2129].codfw.wmnet,es[2021,2023].codfw.wmnet,pc[2007-2010].codfw.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====
(7) db2096.codfw.wmnet,es[2021,2023].codfw.wmnet,pc[2007-2010].codfw.wmnet
----- OUTPUT of 'mysql -e "show s...\G" | grep Using' -----
                    Using_Gtid: Slave_Pos
===== NODE GROUP =====
(8) db[2079,2090,2105,2107,2112,2118,2123,2129].codfw.wmnet
----- OUTPUT of 'mysql -e "show s...\G" | grep Using' -----
                   Using_Gtid: Slave_Pos
================
PASS |████████████████████| 100% (15/15) [00:01<00:00, 10.00hosts/s]
FAIL |                    |   0% (0/15) [00:01<?, ?hosts/s]
100.0% (15/15) success ratio (>= 100.0% threshold) for command: 'mysql -e "show s...\G" | grep Using'.
100.0% (15/15) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
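
The `grep Using` check above can also be reduced to a single value per host, which is easier to compare in a script. The following is only a sketch: `gtid_mode` is a hypothetical helper name, not part of cumin or MariaDB, and it assumes the `show slave status\G` output format shown above.

```shell
# Hypothetical helper (not part of cumin or MariaDB): reads
# `show slave status\G` output on stdin and prints the Using_Gtid value,
# or "none" when the field is absent (for example, empty output).
gtid_mode() {
    awk -F': *' '
        /^ *Using_Gtid:/ { print $2; found = 1 }
        END { if (!found) print "none" }
    '
}

# Example with one line copied from the output above.
printf '%s\n' '                    Using_Gtid: Slave_Pos' | gtid_mode   # prints Slave_Pos
```

On a real host this would be fed from `mysql -e "show slave status\G" | gtid_mode`.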

Replication is disabled on all eqiad masters. pc1010 can be ignored because its master is pc1007.eqiad.wmnet, so it replicates within eqiad rather than from codfw:

sudo cumin "P{P:mariadb::mysql_role%role = master and *.eqiad.wmnet}" 'mysql -e "show slave status\G"'
20 hosts will be targeted:
db[1080-1081,1083,1086,1100,1103-1104,1107,1115,1122-1123,1128,1131-1132].eqiad.wmnet,es[1021,1024].eqiad.wmnet,pc[1007-1010].eqiad.wmnet
Confirm to continue [y/n]? y
===== NODE GROUP =====
(1) pc1010.eqiad.wmnet
----- OUTPUT of 'mysql -e "show slave status\G"' -----
*************************** 1. row ***************************
                Slave_IO_State: Waiting for master to send event
                   Master_Host: pc1007.eqiad.wmnet
                   Master_User: repl
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File: pc1007-bin.137067
           Read_Master_Log_Pos: 131496335
                Relay_Log_File: pc1010-relay-bin.037718
                 Relay_Log_Pos: 131496635
         Relay_Master_Log_File: pc1007-bin.137067
              Slave_IO_Running: Yes
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 131496335
               Relay_Log_Space: 131497586
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: Yes
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: 0
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 0
                 Last_IO_Error:
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 171966644
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 0-171966644-48816503458
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: conservative
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
              Slave_DDL_Groups: 2
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 1862570640
================
PASS |████████████████████| 100% (20/20) [00:00<00:00, 31.77hosts/s]
FAIL |                    |   0% (0/20) [00:00<?, ?hosts/s]
100.0% (20/20) success ratio (>= 100.0% threshold) for command: 'mysql -e "show slave status\G"'.
100.0% (20/20) success ratio (>= 100.0% threshold) of nodes successfully executed all commands.
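
The manual reading of the output above (empty output means no replication configured, and pc1010 is fine because its master is in eqiad) could be scripted roughly as follows. This is a minimal sketch under assumptions: `repl_source` and `is_detached_from_codfw` are hypothetical helper names, and the check relies only on the `Master_Host` field and the `*.codfw.wmnet` hostname pattern seen in this task. A host whose replication from codfw is merely stopped, but still configured, would be reported as not detached.

```shell
# Hypothetical helper: reads `show slave status\G` output on stdin and prints
# the configured Master_Host, or "disconnected" when no replication is
# configured (empty output).
repl_source() {
    awk -F': *' '
        /^ *Master_Host:/ { print $2; found = 1 }
        END { if (!found) print "disconnected" }
    '
}

# Hypothetical check: succeeds when the host has no replication source or
# its source is not a codfw host.
is_detached_from_codfw() {
    case "$(repl_source)" in
        *.codfw.wmnet) return 1 ;;
        *) return 0 ;;
    esac
}

# Example with the Master_Host line reported by pc1010 above.
if printf '%s\n' '                   Master_Host: pc1007.eqiad.wmnet' | is_detached_from_codfw; then
    echo "detached from codfw"
fi
```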

Disconnect codfw -> eqiad replication
Closed, Resolved · Public