
Reboot es1024 (es5 master)
Closed, Resolved · Public

Description

This will pick up the kernel fix for T261389.

As we're not changing the es5 master, just rebooting it, we can use a simplified version of the normal maintenance procedure.

Time: Wed, 2020-11-25, 0800 UTC (0900 CET)

Steps:

  • Downtime all es5 hosts: sudo -H cookbook sre.hosts.downtime --minutes 30 -r "Reboot es1024 for kernel upgrade T268469" 'A:db-section-es5'
  • Merge the mediawiki-config CR that disables writes to es5: https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/643030
  • Deploy MW change from deploy1001: cd /srv/mediawiki-staging/; git status; git fetch; git rebase; scap sync-file wmf-config/db-eqiad.php "Disable writes to es5 T268469"
  • Check that es5 is read-only (mysqlbinlog on the master should show only heartbeat update statements)
  • Stop mariadb on es1024
  • Check Kibana to confirm that MW is coping with the master being down.
  • Reboot es1024
  • Start mariadb: systemctl start mariadb
  • Disable read_only: mysql -e "set global read_only = off"
  • Restart prom exporter: systemctl restart prometheus-mysqld-exporter
  • Check that replication is working correctly: sudo -H db-replication-tree es1024
  • Revert the MW change that disabled writes to es5
  • Deploy MW revert from deploy1001: cd /srv/mediawiki-staging/; git status; git fetch; git rebase; scap sync-file wmf-config/db-eqiad.php "Re-enable writes to es5 T268469"
  • Check that icinga is all green
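
The read-only check in the steps above can be sketched as a small text filter. This is a minimal sketch under stated assumptions, not the procedure actually used for this task: it assumes the binlog has already been decoded to text by mysqlbinlog, and that the heartbeat table's name contains "heartbeat". The function name non_heartbeat_writes and the HEARTBEAT_TABLE variable are hypothetical.

```shell
#!/bin/sh
# Assumed pattern identifying the heartbeat table; override if it differs.
HEARTBEAT_TABLE="${HEARTBEAT_TABLE:-heartbeat}"

# Read decoded binlog text on stdin and print every write statement that
# does NOT mention the heartbeat table. Empty output means only heartbeat
# writes were seen.
non_heartbeat_writes() {
    # Match DML keywords at the start of a line, optionally preceded by
    # the "### " marker that mysqlbinlog --verbose puts on decoded row
    # events. Only the first line of each statement is inspected.
    grep -E -i '^(### )?(INSERT|UPDATE|DELETE|REPLACE)[[:space:]]' \
        | grep -v -i -- "$HEARTBEAT_TABLE" \
        || true   # grep exits non-zero when nothing is printed; ignore that
}
```

For example, piping the output of mysqlbinlog --base64-output=decode-rows --verbose for the most recent binlog file into non_heartbeat_writes should print nothing once writes to es5 are disabled. Which binlog file to inspect is left unspecified here.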

Related Objects

Status: Resolved
Assigned: Kormat

Event Timeline

Kormat created this task. · Nov 23 2020, 1:03 PM
Restricted Application added a subscriber: Aklapper. · Nov 23 2020, 1:03 PM
Kormat updated the task description. · Nov 23 2020, 1:03 PM
Kormat updated the task description.

Change 643030 had a related patch set uploaded (by Kormat; owner: Kormat):
[operations/mediawiki-config@master] db-eqiad.php: Depool cluster27 (es5) from writes.

https://gerrit.wikimedia.org/r/643030

Kormat updated the task description. · Nov 23 2020, 1:20 PM
Kormat updated the task description.
Kormat updated the task description. · Nov 23 2020, 3:47 PM
Marostegui triaged this task as Medium priority. · Nov 24 2020, 9:01 AM
Marostegui moved this task from Triage to In progress on the DBA board.

Change 643030 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool cluster27 (es5) from writes.

https://gerrit.wikimedia.org/r/643030

Kormat updated the task description. · Nov 25 2020, 8:04 AM

Mentioned in SAL (#wikimedia-operations) [2020-11-25T08:04:56Z] <kormat@deploy1001> Synchronized wmf-config/db-eqiad.php: Disable writes to es5 T268469 (duration: 00m 58s)

Mentioned in SAL (#wikimedia-operations) [2020-11-25T08:07:03Z] <kormat> stopping mariadb on es1024 T268469

Kormat updated the task description. · Nov 25 2020, 8:13 AM

Mentioned in SAL (#wikimedia-operations) [2020-11-25T08:14:44Z] <kormat> rebooting es1024 T268469

Kormat updated the task description. · Nov 25 2020, 8:33 AM
Kormat updated the task description. · Nov 25 2020, 8:40 AM

Mentioned in SAL (#wikimedia-operations) [2020-11-25T08:43:16Z] <kormat@deploy1001> Synchronized wmf-config/db-eqiad.php: Re-enable writes to es5 T268469 (duration: 00m 59s)

Kormat updated the task description. · Nov 25 2020, 8:43 AM
Kormat closed this task as Resolved. · Nov 25 2020, 9:05 AM
Kormat updated the task description.

Completed.