Page MenuHomePhabricator
Paste P13330

(An Untitled Masterwork)
ActivePublic

Authored by Marostegui on Wed, Nov 18, 1:55 PM.
root@mariadb104-test1:~# orchestrator -c graceful-master-takeover -i zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Hostname unresolved yet: zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Connected to orchestrator backend: sqlite on /var/lib/orchestrator/orchestrator.db
2020-11-18 13:54:38 DEBUG Initializing orchestrator
2020-11-18 13:54:38 INFO Connecting to backend :3306: maxConnections: 128, maxIdleConns: 32
2020-11-18 13:54:38 DEBUG Cache hostname resolve zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud as zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 INFO verifyCert requested, client certificates will be verified
2020-11-18 13:54:38 INFO Read in CA file: /etc/ssl/certs/Puppet_Internal_CA.pem
2020-11-18 13:54:38 INFO GracefulMasterTakeover: Will demote zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306 and promote zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306 instead
2020-11-18 13:54:38 ERROR Error 1142: SELECT command denied to user 'orchestrator'@'172.16.1.250' for table 'slave_master_info'
2020-11-18 13:54:38 DEBUG Hostname unresolved yet: slave2.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Cache hostname resolve slave2.mariadb104-test.eqiad1.wikimedia.cloud as slave2.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Hostname unresolved yet: slave1.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Cache hostname resolve slave1.mariadb104-test.eqiad1.wikimedia.cloud as slave1.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 INFO topology_recovery: No PreGracefulTakeoverProcesses hooks to run
2020-11-18 13:54:38 INFO GracefulMasterTakeover: Will set zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306 as read_only
2020-11-18 13:54:38 INFO instance zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306 read_only: true
2020-11-18 13:54:38 INFO GracefulMasterTakeover: Will wait for zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306 to reach master coordinates zarcillo1-bin.000013:959
2020-11-18 13:54:38 DEBUG Hostname unresolved yet: zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Cache hostname resolve zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud as zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Hostname unresolved yet: zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 DEBUG Cache hostname resolve zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud as zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud
2020-11-18 13:54:38 INFO GracefulMasterTakeover: attempting recovery
2020-11-18 13:54:38 INFO executeCheckAndRecoverFunction: proceeding with DeadMaster detection on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306; isActionable?: true; skipProcesses: false
2020-11-18 13:54:38 INFO topology_recovery: detected DeadMaster failure on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:38 INFO topology_recovery: No OnFailureDetectionProcesses hooks to run
2020-11-18 13:54:38 INFO executeCheckAndRecoverFunction: proceeding with DeadMaster recovery on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306; isRecoverable?: true; skipProcesses: false
2020-11-18 13:54:38 INFO topology_recovery: will handle DeadMaster event on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:38 INFO topology_recovery: No PreFailoverProcesses hooks to run
2020-11-18 13:54:38 INFO topology_recovery: RecoverDeadMaster: will recover zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:38 INFO topology_recovery: RecoverDeadMaster: masterRecoveryType=MasterRecoveryGTID
2020-11-18 13:54:38 INFO topology_recovery: RecoverDeadMaster: regrouping replicas via GTID
2020-11-18 13:54:38 DEBUG GetCandidateReplica: candidate: zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306, ahead: 0, equal: 0, late: 0, break: 0
2020-11-18 13:54:38 INFO topology_recovery: RecoverDeadMaster: promotedReplicaIsIdeal(zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306)
2020-11-18 13:54:38 DEBUG RegroupReplicasGTID: working on 0 replicas
2020-11-18 13:54:38 INFO Started replication on zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:38 DEBUG RegroupReplicasGTID: done
2020-11-18 13:54:38 INFO topology_recovery: RecoverDeadMaster: 1 postponed functions
2020-11-18 13:54:38 INFO topology_recovery: checking if should replace promoted replica with a better candidate
2020-11-18 13:54:38 INFO topology_recovery: + sanity check: found our very own server to promote; doing nothing
2020-11-18 13:54:38 INFO topology_recovery: replace-promoted-replica-with-candidate: promoted instance zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306 requires no further action
2020-11-18 13:54:38 INFO topology_recovery: promoted replica: zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:38 INFO topology_recovery: RecoverDeadMaster: successfully promoted zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:38 INFO topology_recovery: - RecoverDeadMaster: promoted server coordinates: zarcillo0-bin.000022:577772
2020-11-18 13:54:38 INFO topology_recovery: - RecoverDeadMaster: will apply MySQL changes to promoted master
2020-11-18 13:54:38 INFO Will reset replica on zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:38 INFO Stopped replication on zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306, Self:zarcillo0-bin.000022:577772, Exec:zarcillo1-bin.000013:959
2020-11-18 13:54:39 INFO Reset replication zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:39 INFO topology_recovery: - RecoverDeadMaster: applying RESET SLAVE ALL on promoted master: success=true
2020-11-18 13:54:39 INFO instance zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306 read_only: false
2020-11-18 13:54:39 INFO topology_recovery: - RecoverDeadMaster: applying read-only=0 on promoted master: success=true
2020-11-18 13:54:39 INFO topology_recovery: Writing KV [mysql/master/db_inventory:zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306 mysql/master/db_inventory/hostname:zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud mysql/master/db_inventory/port:3306 mysql/master/db_inventory/ipv4:172.16.2.16 mysql/master/db_inventory/ipv6:]
2020-11-18 13:54:39 INFO topology_recovery: Distributing KV [mysql/master/db_inventory:zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306 mysql/master/db_inventory/hostname:zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud mysql/master/db_inventory/port:3306 mysql/master/db_inventory/ipv4:172.16.2.16 mysql/master/db_inventory/ipv6:]
2020-11-18 13:54:39 INFO topology_recovery: - RecoverDeadMaster: updating cluster_alias: zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306 -> zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:39 INFO topology_recovery: - RecoverDeadMaster: detaching master host on promoted master
2020-11-18 13:54:39 INFO topology_recovery: No PostMasterFailoverProcesses hooks to run
2020-11-18 13:54:39 INFO instance zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306 read_only: true
2020-11-18 13:54:39 INFO topology_recovery: - RecoverDeadMaster: applying read-only=1 on demoted master: success=true
2020-11-18 13:54:39 INFO Topology recovery: {"Id":1,"UID":"1605707678636816844:9d4a6d646208f1fa0077ff2063a1dc2efd0d817158d253481b2dc3aa174728ff","AnalysisEntry":{"AnalyzedInstanceKey":{"Hostname":"zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud","Port":3306},"AnalyzedInstanceMasterKey":{"Hostname":"","Port":0},"ClusterDetails":{"ClusterName":"zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306","ClusterAlias":"db_inventory","ClusterDomain":"","CountInstances":4,"HeuristicLag":0,"HasAutomatedMasterRecovery":false,"HasAutomatedIntermediateMasterRecovery":false},"AnalyzedInstanceDataCenter":"eqiad","AnalyzedInstanceRegion":"","AnalyzedInstancePhysicalEnvironment":"","AnalyzedInstanceBinlogCoordinates":{"LogFile":"zarcillo1-bin.000013","LogPos":959,"Type":0},"IsMaster":true,"IsCoMaster":false,"LastCheckValid":true,"LastCheckPartialSuccess":true,"CountReplicas":1,"CountValidReplicas":1,"CountValidReplicatingReplicas":1,"CountReplicasFailingToConnectToMaster":0,"CountDowntimedReplicas":0,"ReplicationDepth":0,"Replicas":[{"Hostname":"zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud","Port":3306}],"SlaveHosts":[{"Hostname":"zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud","Port":3306}],"IsFailingToConnectToMaster":false,"Analysis":"DeadMaster","Description":"","StructureAnalysis":null,"IsDowntimed":false,"IsReplicasDowntimed":false,"DowntimeEndTimestamp":"","DowntimeRemainingSeconds":0,"IsBinlogServer":false,"PseudoGTIDImmediateTopology":false,"OracleGTIDImmediateTopology":false,"MariaDBGTIDImmediateTopology":true,"BinlogServerImmediateTopology":false,"SemiSyncMasterEnabled":false,"SemiSyncMasterStatus":false,"SemiSyncMasterWaitForReplicaCount":0,"SemiSyncMasterClients":0,"CountSemiSyncReplicasEnabled":0,"CountLoggingReplicas":1,"CountStatementBasedLoggingReplicas":1,"CountMixedBasedLoggingReplicas":0,"CountRowBasedLoggingReplicas":0,"CountDistinctMajorVersionsLoggingReplicas":1,"CountDelayedReplicas":0,"CountLaggingReplicas":0,"IsActionableRecovery":true,"ProcessingNodeHostname":"mariadb104-test1","ProcessingNodeToken":"57a49736d90021dbe29daae60329da37078d492405848368d5d7e735933a60f1","CountAdditionalAgreeingNodes":0,"StartActivePeriod":"","SkippableDueToDowntime":false,"GTIDMode":"","MinReplicaGTIDMode":"","MaxReplicaGTIDMode":"","MaxReplicaGTIDErrant":"","CommandHint":"graceful-master-takeover","IsReadOnly":false},"SuccessorKey":{"Hostname":"zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud","Port":3306},"SuccessorAlias":"","IsActive":false,"IsSuccessful":true,"LostReplicas":[],"ParticipatingInstanceKeys":[],"AllErrors":[],"RecoveryStartTimestamp":"","RecoveryEndTimestamp":"","ProcessingNodeHostname":"","ProcessingNodeToken":"","Acknowledged":false,"AcknowledgedAt":"","AcknowledgedBy":"","AcknowledgedComment":"","LastDetectionId":0,"RelatedRecoveryId":0,"Type":"MasterRecovery","RecoveryType":"MasterRecoveryGTID"}
2020-11-18 13:54:39 INFO topology_recovery: No PostFailoverProcesses hooks to run
2020-11-18 13:54:39 INFO topology_recovery: Waiting for 2 postponed functions
2020-11-18 13:54:39 DEBUG PostponedFunctionsContainer: waiting on 2 postponed functions
2020-11-18 13:54:39 DEBUG PostponedFunctionsContainer: done waiting
2020-11-18 13:54:39 INFO topology_recovery: Executed 2 postponed functions
2020-11-18 13:54:39 INFO topology_recovery: Executed postponed functions: regroup-replicas-gtid zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306, RecoverDeadMaster, detaching promoted master host zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:39 DEBUG ChangeMasterTo: will attempt changing master on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306 to zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306, zarcillo0-bin.000022:577772
2020-11-18 13:54:39 INFO ChangeMasterTo: Changed master on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306 to: zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306, zarcillo0-bin.000022:577772. GTID: true
2020-11-18 13:54:39 DEBUG EnableMasterSSL: Will attempt enabling SSL replication on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:39 INFO EnableMasterSSL: Enabled SSL replication on zarcillo1.mariadb104-test.eqiad1.wikimedia.cloud:3306
2020-11-18 13:54:39 INFO topology_recovery: No PostGracefulTakeoverProcesses hooks to run
zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306
zarcillo0-bin.000022:577772
2020-11-18 13:54:39 DEBUG Promoted zarcillo0.mariadb104-test.eqiad1.wikimedia.cloud:3306 as new master. Binlog coordinates at time of promotion: zarcillo0-bin.000022:577772

Event Timeline

Marostegui created this paste.Wed, Nov 18, 1:55 PM