Enable report_host for mariadb
Closed, Resolved (Public)

Description

By default, MariaDB does not report a hostname when it starts replicating from another server. This means the output of show slave hosts contains only very opaque information:

root@pc2007.codfw.wmnet[(none)]> show slave hosts;
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
| 171966644 |      | 3306 | 180355176 |
| 180367374 |      | 3306 | 180355176 |
+-----------+------+------+-----------+
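To make the problem concrete, here is a minimal Python sketch that parses the table above and lists the replicas whose Host column is empty. The function names (parse_slave_hosts, replicas_without_host) are hypothetical and only for illustration; the sketch works on the client's text output rather than querying a server.

```python
def parse_slave_hosts(table_text):
    """Parse the ASCII table printed by the mysql client into a list of dicts."""
    rows = []
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        # Skip border lines such as +------+ and anything that is not a row.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def replicas_without_host(rows):
    """Return the Server_id of every replica whose Host column is empty."""
    return [row["Server_id"] for row in rows if not row["Host"]]


SAMPLE = """\
+-----------+------+------+-----------+
| Server_id | Host | Port | Master_id |
+-----------+------+------+-----------+
| 171966644 |      | 3306 | 180355176 |
| 180367374 |      | 3306 | 180355176 |
+-----------+------+------+-----------+
"""

if __name__ == "__main__":
    print(replicas_without_host(parse_slave_hosts(SAMPLE)))
```

With the sample above, both replicas are reported, since neither one announces a hostname.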

If we use Puppet to set report_host to the FQDN, the hostname will show up in the show slave hosts table and will also be available for Orchestrator to query.
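As a rough illustration of the resulting setting, the following Python sketch renders a [mysqld] config fragment with report_host set to the machine's FQDN. The actual Puppet template is not shown in this task, so the helper name render_report_host and the fragment layout are assumptions; only the report_host variable itself comes from the MariaDB documentation linked below.

```python
import socket


def render_report_host(fqdn=None):
    """Return a [mysqld] fragment that sets report_host (hypothetical helper)."""
    if fqdn is None:
        # Fall back to the local machine's fully qualified domain name.
        fqdn = socket.getfqdn()
    return "[mysqld]\nreport_host = {}\n".format(fqdn)


if __name__ == "__main__":
    print(render_report_host("db1147.eqiad.wmnet"))
```

The fragment only takes effect once the daemon has been restarted with it.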

NOTE: This requires the MySQL daemon to be restarted.
NOTE: No need to enable this on labsdb* hosts; they are not stable and are ignored by the discovery: https://gerrit.wikimedia.org/r/c/operations/puppet/+/660839/

https://mariadb.com/kb/en/replication-and-binary-log-system-variables/#report_host

Scripts used for rolling this out:

Restarting hosts progress

Event Timeline

There are a very large number of changes, so older changes are hidden.

s4 progress:

labsdb1012.eqiad.wmnet:3306
labsdb1011.eqiad.wmnet:3306
labsdb1010.eqiad.wmnet:3306
labsdb1009.eqiad.wmnet:3306

  • dbstore1004.eqiad.wmnet:3314
  • db1155.eqiad.wmnet:3314
  • db1150.eqiad.wmnet:3314
  • db1149.eqiad.wmnet:3306
  • db1148.eqiad.wmnet:3306
  • db1147.eqiad.wmnet:3306
  • db1146.eqiad.wmnet:3314
  • db1145.eqiad.wmnet:3314
  • db1144.eqiad.wmnet:3314
  • db1143.eqiad.wmnet:3306
  • db1142.eqiad.wmnet:3306
  • db1141.eqiad.wmnet:3306
  • db1138.eqiad.wmnet:3306
  • db1125.eqiad.wmnet:3314
  • db1121.eqiad.wmnet:3306
  • db1081.eqiad.wmnet:3306
  • clouddb1019.eqiad.wmnet:3314
  • clouddb1015.eqiad.wmnet:3314

Change 659216 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/mediawiki-config@master] db-eqiad.php: Depool es5 from writes

https://gerrit.wikimedia.org/r/659216

Change 659216 merged by jenkins-bot:
[operations/mediawiki-config@master] db-eqiad.php: Depool es5 from writes

https://gerrit.wikimedia.org/r/659216

Mentioned in SAL (#wikimedia-operations) [2021-01-28T10:46:16Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool es5 from writes T266483 (duration: 01m 09s)

Mentioned in SAL (#wikimedia-operations) [2021-01-28T10:52:14Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Repool es5 on writes T266483 (duration: 01m 05s)

Mentioned in SAL (#wikimedia-operations) [2021-01-28T11:03:54Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1025 T266483', diff saved to https://phabricator.wikimedia.org/P14018 and previous config saved to /var/cache/conftool/dbconfig/20210128-110353-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-02-01T09:45:49Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Depool es4 from writes T266483 (duration: 01m 04s)

Mentioned in SAL (#wikimedia-operations) [2021-02-01T09:46:04Z] <marostegui> Restart mysql on es1021 T266483

Mentioned in SAL (#wikimedia-operations) [2021-02-01T09:52:52Z] <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Repool es4 into writes T266483 (duration: 00m 56s)

Change 660839 had a related patch set uploaded (by Marostegui; owner: Marostegui):
[operations/puppet@production] orchestrator.conf: Do not discover labsdb* hosts

https://gerrit.wikimedia.org/r/660839

Change 660839 merged by Marostegui:
[operations/puppet@production] orchestrator.conf: Do not discover labsdb* hosts

https://gerrit.wikimedia.org/r/660839

Mentioned in SAL (#wikimedia-operations) [2021-02-01T14:39:25Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1147 T266483', diff saved to https://phabricator.wikimedia.org/P14104 and previous config saved to /var/cache/conftool/dbconfig/20210201-143925-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-02-01T14:40:03Z] <marostegui> Restart mysql on db1147 T266483

Mentioned in SAL (#wikimedia-operations) [2021-02-02T06:23:04Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1022 T266483', diff saved to https://phabricator.wikimedia.org/P14113 and previous config saved to /var/cache/conftool/dbconfig/20210202-062303-marostegui.json

x1:

  • dbstore1005
  • db1137
  • db1120
  • db1103
  • db1102

Mentioned in SAL (#wikimedia-operations) [2021-02-03T13:29:38Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1120 T266483', diff saved to https://phabricator.wikimedia.org/P14164 and previous config saved to /var/cache/conftool/dbconfig/20210203-132938-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-02-03T13:30:26Z] <marostegui> Stop mysql on db1120 to enable report_host T266483

Mentioned in SAL (#wikimedia-operations) [2021-02-04T06:41:58Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1137 T266483', diff saved to https://phabricator.wikimedia.org/P14181 and previous config saved to /var/cache/conftool/dbconfig/20210204-064157-marostegui.json

s7 eqiad

  • labsdb1012 - not needed
  • labsdb1011 - not needed
  • labsdb1010 - not needed
  • labsdb1009 - not needed
  • dbstore1003
  • db1174
  • db1170
  • db1155
  • db1136
  • db1127
  • db1125
  • db1116
  • db1101
  • db1098
  • db1090
  • db1086
  • db1079
  • clouddb1018
  • clouddb1014

Mentioned in SAL (#wikimedia-operations) [2021-02-10T08:05:12Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1127 T266483', diff saved to https://phabricator.wikimedia.org/P14283 and previous config saved to /var/cache/conftool/dbconfig/20210210-080512-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-03-23T06:58:36Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1101:3317 to enable report_host T266483', diff saved to https://phabricator.wikimedia.org/P15002 and previous config saved to /var/cache/conftool/dbconfig/20210323-065836-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-03-23T06:59:48Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1101:3318 to enable report_host T266483', diff saved to https://phabricator.wikimedia.org/P15003 and previous config saved to /var/cache/conftool/dbconfig/20210323-065947-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2021-03-23T07:52:53Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1098:3317 to enable report_host T266483', diff saved to https://phabricator.wikimedia.org/P15016 and previous config saved to /var/cache/conftool/dbconfig/20210323-075253-marostegui.json

With the crash that happened on labsdb1009 a couple of weeks ago, report_host is now enabled there. It doesn't really matter as we've excluded labsdb* hosts from orchestrator.

  • labsdb1011 not needed
  • labsdb1010 not needed
  • labsdb1009 not needed
  • dbstore1003
  • db1184
  • db1169
  • db1164
  • db1163
  • db1154
  • db1140
  • db1139
  • db1135
  • db1134
  • db1133
  • db1119
  • db1118
  • db1106
  • db1105
  • db1099
  • db1083
  • clouddb1021
  • clouddb1017
  • clouddb1013

s1 is fully done, only pending the master (T278214)

done

s2 is fully done apart from the master (db1122)

s5 is fully done apart from the master (db1100)

s3 is fully done apart from the master (db1123)

Mentioned in SAL (#wikimedia-operations) [2021-04-30T05:15:59Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1114 to enable report_host T266483', diff saved to https://phabricator.wikimedia.org/P15663 and previous config saved to /var/cache/conftool/dbconfig/20210430-051558-marostegui.json

s8 eqiad:

  • labsdb1011 (not needed)
  • labsdb1010 (not needed)
  • labsdb1009 (not needed)
  • dbstore1005
  • db1177
  • db1172
  • db1167
  • db1154
  • db1126
  • db1116
  • db1114
  • db1111
  • db1109
  • db1104
  • db1101
  • db1099
  • db1087
  • clouddb1021
  • clouddb1020
  • clouddb1016

s8 is fully done apart from the master (db1104)
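The per-section pattern used throughout this task (skip labsdb* hosts, restart the replicas, leave the master for last) can be sketched in Python as follows. The scripts actually used for the rollout are not reproduced here, so restart_plan is a hypothetical helper, and the short host list is a subset of the s8 list above chosen for illustration.

```python
from fnmatch import fnmatchcase


def restart_plan(hosts, master, skip_pattern="labsdb*"):
    """Order hosts for a rolling restart: replicas first, master last.

    Hosts matching skip_pattern are dropped because they do not need
    report_host. This is an illustrative sketch, not the real rollout script.
    """
    plan = [
        host
        for host in hosts
        if host != master and not fnmatchcase(host, skip_pattern)
    ]
    if master in hosts:
        # The master goes last; in practice it needs separate handling.
        plan.append(master)
    return plan


if __name__ == "__main__":
    s8_subset = ["labsdb1011", "dbstore1005", "db1177", "db1104", "clouddb1021"]
    print(restart_plan(s8_subset, master="db1104"))
```

Each replica in the plan would still be depooled before the restart and repooled afterwards, as the dbctl entries in the timeline show.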

Mentioned in SAL (#wikimedia-operations) [2021-05-17T06:01:32Z] <kormat> restarting mariadb on db1131 to pick up report_host T266483

Pending eqiad hosts:
[x] db1129 (slave)
[x] db1122
[x] db1104

Marostegui claimed this task.

This is all done, and all hosts are now in Orchestrator.