
DB: perform rolling restart of mariadb daemons to pick up CA changes
Open, Medium, Public

Description

This ticket tracks the progress of the MariaDB restarts.
The current certificate expires Jun 29 19:36:29 2020 GMT.
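
To get a feel for how much time the expiry date above leaves for the rolling restart, here is a minimal Python sketch. Only the expiry string comes from this task; the `days_until_expiry` helper and the reference date are hypothetical and purely illustrative. It uses the standard library's `ssl.cert_time_to_seconds`, which parses timestamps written in this certificate date format.

```python
import ssl
from datetime import datetime, timezone

# Expiry string copied from the task description.
NOT_AFTER = "Jun 29 19:36:29 2020 GMT"


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Return the whole days between `now` and the certificate expiry.

    A negative result means the certificate has already expired.
    """
    expiry_epoch = ssl.cert_time_to_seconds(not_after)
    expiry = datetime.fromtimestamp(expiry_epoch, tz=timezone.utc)
    return (expiry - now).days


# Hypothetical reference date, chosen only for illustration.
reference = datetime(2020, 1, 7, tzinfo=timezone.utc)
print(days_until_expiry(NOT_AFTER, reference))
```

Passing `datetime.now(timezone.utc)` instead of the fixed reference date would give the remaining time at the moment the script runs.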

  • es1012: es1
  • es1016: es1
  • es1018: es1
  • es2011: es1
  • es2012: es1
  • es2013: es1
  • es1011: es2
  • es1013: es2
  • es1015: es2
  • es2014: es2
  • es2015: es2
  • es2016: es2
  • es1014: es3
  • es1017: es3
  • es1019: es3
  • es2017: es3
  • es2018: es3
  • es2019: es3
  • db1117:3321: m1
  • db1135: m1
  • db2078:3321: m1
  • db2132: m1
  • db1117:3322: m2
  • db1132: m2
  • db2078:3322: m2
  • db2133: m2
  • db1117:3323: m3
  • db1128: m3
  • db2078:3323: m3
  • db2134: m3
  • db1107: test-s1
  • db1108: m4
  • db1117:3325: m5
  • db1133: m5
  • db2078:3325: m5
  • db2135: m5
  • pc1007: pc1
  • pc1010: pc1
  • pc2007: pc1
  • pc2010: pc1
  • pc1008: pc2
  • pc2008: pc2
  • pc1009: pc3
  • pc2009: pc3
  • db1080: s1
  • db1083: s1
  • db1089: s1
  • db1099:3311: s1
  • db1105:3311: s1
  • db1106: s1
  • db1118: s1
  • db1119: s1
  • db1107: s1
  • db1114: s1
  • db1124:3311: s1
  • db1134: s1
  • db1139:3311: s1
  • db2071: s1
  • db2072: s1
  • db2085:3311: s1
  • db2088:3311: s1
  • db2092: s1
  • db2094:3311: s1
  • db2097:3311: s1
  • db2103: s1
  • db2112: s1
  • db2116: s1
  • db2130: s1
  • dbstore1003:3311: s1
  • labsdb1009: s1
  • labsdb1010: s1
  • labsdb1011: s1
  • labsdb1012: s1
  • db1074: s2
  • db1076: s2
  • db1090:3312: s2
  • db1095:3312: s2
  • db1103:3312: s2
  • db1105:3312: s2
  • db1122: s2
  • db1125:3312: s2
  • db1129: s2
  • db2088:3312: s2
  • db2091:3312: s2
  • db2095:3312: s2
  • db2098:3312: s2
  • db2104: s2
  • db2107: s2
  • db2108: s2
  • db2125: s2
  • db2126: s2
  • dbstore1004:3312: s2
  • labsdb1009: s2
  • labsdb1010: s2
  • labsdb1011: s2
  • labsdb1012: s2
  • db1075: s3
  • db1078: s3
  • db1095:3313: s3
  • db1112: s3
  • db1123: s3
  • db1124:3313: s3
  • db2074: s3
  • db2094:3313: s3
  • db2098:3313: s3
  • db2105: s3
  • db2109: s3
  • db2127: s3
  • dbstore1004:3313: s3
  • labsdb1009: s3
  • labsdb1010: s3
  • labsdb1011: s3
  • labsdb1012: s3
  • db1081: s4
  • db1084: s4
  • db1091: s4
  • db1097:3314: s4
  • db1102:3314: s4
  • db1103:3314: s4
  • db1121: s4
  • db1125:3314: s4
  • db1138: s4
  • db2073: s4
  • db2084:3314: s4
  • db2090: s4
  • db2091:3314: s4
  • db2095:3314: s4
  • db2099:3314: s4
  • db2106: s4
  • db2110: s4
  • db2119: s4
  • dbstore1004:3314: s4
  • labsdb1009: s4
  • labsdb1010: s4
  • labsdb1011: s4
  • labsdb1012: s4
  • db1082: s5
  • db1096:3315: s5
  • db1097:3315: s5
  • db1100: s5
  • db1102:3315: s5
  • db1110: s5
  • db1113:3315: s5
  • db1124:3315: s5
  • db1130: s5
  • db2075: s5
  • db2084:3315: s5
  • db2089:3315: s5
  • db2094:3315: s5
  • db2099:3315: s5
  • db2111: s5
  • db2113: s5
  • db2123: s5
  • db2128: s5
  • dbstore1003:3315: s5
  • labsdb1009: s5
  • labsdb1010: s5
  • labsdb1011: s5
  • labsdb1012: s5
  • db1085: s6
  • db1088: s6
  • db1093: s6
  • db1096:3316: s6
  • db1098:3316: s6
  • db1113:3316: s6
  • db1125:3316: s6
  • db1131: s6
  • db1139:3316: s6
  • db2076: s6
  • db2087:3316: s6
  • db2089:3316: s6
  • db2095:3316: s6
  • db2097:3316: s6
  • db2114: s6
  • db2117: s6
  • db2124: s6
  • db2129: s6
  • dbstore1005:3316: s6
  • labsdb1009: s6
  • labsdb1010: s6
  • labsdb1011: s6
  • labsdb1012: s6
  • db1079: s7
  • db1086: s7
  • db1090:3317: s7
  • db1094: s7
  • db1098:3317: s7
  • db1101:3317: s7
  • db1116:3317: s7
  • db1125:3317: s7
  • db1136: s7
  • db2077: s7
  • db2086:3317: s7
  • db2087:3317: s7
  • db2095:3317: s7
  • db2100:3317: s7
  • db2118: s7
  • db2120: s7
  • db2121: s7
  • db2122: s7
  • dbstore1003:3317: s7
  • labsdb1009: s7
  • labsdb1010: s7
  • labsdb1011: s7
  • labsdb1012: s7
  • db1087: s8
  • db1092: s8
  • db1099:3318: s8
  • db1101:3318: s8
  • db1104: s8
  • db1109: s8
  • db1116:3318: s8
  • db1124:3318: s8
  • db1126: s8
  • db2079: s8
  • db2080: s8
  • db2081: s8
  • db2082: s8
  • db2083: s8
  • db2085:3318: s8
  • db2086:3318: s8
  • db2094:3318: s8
  • db2100:3318: s8
  • dbstore1005:3318: s8
  • labsdb1009: s8
  • labsdb1010: s8
  • labsdb1011: s8
  • labsdb1012: s8
  • dbstore1005:3350: staging
  • db1115: tendril
  • db2093: tendril
  • db1114: test-s1
  • db2102: test-s1
  • db1077: test-s4
  • db1111: test-s4
  • db1120: x1
  • db1127: x1
  • db1137: x1
  • db1140:3320: x1
  • db2096: x1
  • db2101:3320: x1
  • db2115: x1
  • db2131: x1
  • dbstore1005:3320: x1

Query: mysql.py -hdb1115 zarcillo -e "select instances.name, section_instances.section from instances join section_instances on instances.name = section_instances.instance order by section_instances.section;"
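
The checklist above pairs each instance (a host, optionally with a port) with its section. As a sketch of how such a list could be grouped per section when planning the restarts, the following Python snippet parses entries in the same `instance: section` shape. The `group_by_section` helper is hypothetical and not part of any tooling mentioned in this task; the sample entries are copied from the list.

```python
from collections import defaultdict


def group_by_section(lines):
    """Map each section name to the instances listed for it."""
    sections = defaultdict(list)
    for raw in lines:
        entry = raw.strip().lstrip("•").strip()
        if ":" not in entry:
            continue  # not an "instance: section" entry
        # The section follows the last colon; everything before it is the
        # instance, which may itself contain a port (host:port).
        instance, _, section = entry.rpartition(":")
        sections[section.strip()].append(instance.strip())
    return dict(sections)


sample = [
    "  • es1012: es1",
    "  • db1117:3321: m1",
    "  • db1135: m1",
    "  • dbstore1005:3320: x1",
]
for section, instances in group_by_section(sample).items():
    print(section, instances)
```

Splitting on the last colon rather than the first keeps multi-instance hosts such as `db1117:3321` intact.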

Event Timeline

Marostegui updated the task description. Jan 7 2020, 7:31 AM
Marostegui updated the task description. Jan 7 2020, 8:23 AM
Marostegui updated the task description. Jan 7 2020, 9:35 AM
Marostegui updated the task description. Jan 7 2020, 1:56 PM
Marostegui updated the task description. Jan 7 2020, 3:07 PM
Marostegui updated the task description. Jan 8 2020, 6:58 AM
Marostegui updated the task description. Jan 8 2020, 8:12 AM
Marostegui updated the task description. Jan 9 2020, 7:39 AM
Marostegui updated the task description. Jan 9 2020, 9:49 AM
Marostegui updated the task description. Jan 9 2020, 3:08 PM

@Andrew @Bstorm who in WMCS would be responsible for restarting mysql on these hosts?

labservices1001
labservices1002
labtestservices2001

Thank you!

Those boxes don't exist anymore, at least by that name. Instead we have: cloudservices1003, cloudservices1004, cloudservices2002-dev, clouddb2001-dev. I've run 'service mariadb restart' on all of the above.

But I'm concerned that those old hosts showed up in whatever query you ran to generate the list above; do we need to update some records someplace?

Mentioned in SAL (#wikimedia-operations) [2020-01-13T02:02:08Z] <andrewbogott> restarted mariadb on cloudservices1003, cloudservices1004, cloudservices2001-dev, clouddb2001-dev for T239791

Thanks Andrew!
Don't worry, those are our database lists, which at the moment are maintained manually. We have some checklists to make sure we add or remove hosts as we install or decommission them, but not for the WMCS hosts. Populating zarcillo/tendril (our source of truth) automatically is still pending.
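
The mismatch discussed above (old host names still recorded, new hosts not yet recorded) amounts to a set comparison between the recorded list and the hosts that actually exist. A minimal Python sketch, assuming both lists are available as plain host names; the `compare_inventory` helper is hypothetical, and the host names are the ones mentioned in the comments:

```python
def compare_inventory(recorded, live):
    """Return (stale, missing) host names, each sorted.

    stale:   recorded hosts that no longer exist
    missing: existing hosts that are not recorded yet
    """
    recorded_set, live_set = set(recorded), set(live)
    stale = sorted(recorded_set - live_set)
    missing = sorted(live_set - recorded_set)
    return stale, missing


# Host names taken from the discussion in this task.
recorded = ["labservices1001", "labservices1002", "labtestservices2001"]
live = [
    "cloudservices1003",
    "cloudservices1004",
    "cloudservices2002-dev",
    "clouddb2001-dev",
]

stale, missing = compare_inventory(recorded, live)
print("stale:", stale)
print("missing:", missing)
```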

Marostegui updated the task description. Jan 13 2020, 5:49 AM
Marostegui updated the task description. Jan 13 2020, 6:51 AM
Marostegui updated the task description. Jan 14 2020, 2:00 PM
Marostegui updated the task description. Jan 15 2020, 6:18 AM
Marostegui updated the task description. Jan 15 2020, 6:28 AM
Marostegui updated the task description. Jan 16 2020, 7:24 AM
Marostegui updated the task description. Jan 16 2020, 1:40 PM
Marostegui updated the task description. Jan 17 2020, 7:52 AM
Marostegui updated the task description. Jan 20 2020, 8:49 AM
Marostegui updated the task description. Jan 20 2020, 8:59 AM
Marostegui updated the task description. Jan 20 2020, 9:18 AM
jbond moved this task from Unsorted 💣 to Watching 👀 on the User-jbond board. Jan 20 2020, 1:24 PM

Mentioned in SAL (#wikimedia-operations) [2020-02-04T09:07:45Z] <marostegui> Upgrade s3 codfw master db2105 - T239791

Marostegui updated the task description. Feb 4 2020, 9:35 AM
Marostegui renamed this task from DB: perform rolling restart of mariadb deamons to pick up CA changes to DB: perform rolling restart of mariadb daemons to pick up CA changes. Feb 7 2020, 8:57 AM
Marostegui updated the task description.

Mentioned in SAL (#wikimedia-operations) [2020-02-11T07:43:59Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1013 - T239791', diff saved to https://phabricator.wikimedia.org/P10374 and previous config saved to /var/cache/conftool/dbconfig/20200211-074358-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-02-11T07:47:40Z] <marostegui> Upgrade es1013 - T239791

Marostegui updated the task description. Feb 11 2020, 7:51 AM
Marostegui updated the task description. Feb 11 2020, 8:35 AM
Marostegui updated the task description. Feb 13 2020, 6:43 AM

Mentioned in SAL (#wikimedia-operations) [2020-03-02T09:19:47Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1119 T239791', diff saved to https://phabricator.wikimedia.org/P10569 and previous config saved to /var/cache/conftool/dbconfig/20200302-091947-marostegui.json

Marostegui updated the task description. Mon, Mar 2, 9:26 AM

Mentioned in SAL (#wikimedia-operations) [2020-03-02T09:27:44Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1119 after upgrade T239791', diff saved to https://phabricator.wikimedia.org/P10570 and previous config saved to /var/cache/conftool/dbconfig/20200302-092743-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-02T09:38:49Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1119 after upgrade T239791', diff saved to https://phabricator.wikimedia.org/P10572 and previous config saved to /var/cache/conftool/dbconfig/20200302-093848-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-02T09:46:34Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1119 after upgrade T239791', diff saved to https://phabricator.wikimedia.org/P10573 and previous config saved to /var/cache/conftool/dbconfig/20200302-094633-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-02T09:58:43Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1119 after upgrade T239791', diff saved to https://phabricator.wikimedia.org/P10574 and previous config saved to /var/cache/conftool/dbconfig/20200302-095841-marostegui.json
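
The SAL entries above show one full cycle for db1119: depool, several gradual repool commits, and a final full repool. As a sketch of how such log lines could be parsed to measure how long a host stayed out of full rotation, the Python snippet below extracts the timestamp, author, and message from two of them. The regular expression and the `parse_sal` helper are assumptions made for illustration, and the sample lines are shortened copies of the entries above (the "diff saved to" part is omitted).

```python
import re
from datetime import datetime

# Matches "[<UTC timestamp>] <author> message" inside a SAL mention.
SAL_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\] "
    r"<(?P<who>[^>]+)> (?P<msg>.*)"
)


def parse_sal(line):
    """Return (timestamp, author, message), or None if the line does not match."""
    match = SAL_RE.search(line)
    if match is None:
        return None
    ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%SZ")
    return ts, match.group("who"), match.group("msg")


# Shortened copies of two entries from this task.
depool = (
    "Mentioned in SAL (#wikimedia-operations) [2020-03-02T09:19:47Z] "
    "<marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1119 T239791'"
)
repool = (
    "Mentioned in SAL (#wikimedia-operations) [2020-03-02T09:58:43Z] "
    "<marostegui@cumin1001> dbctl commit (dc=all): "
    "'Fully repool db1119 after upgrade T239791'"
)

start, _, _ = parse_sal(depool)
end, _, _ = parse_sal(repool)
print("db1119 depool to full repool:", end - start)
```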

Mentioned in SAL (#wikimedia-operations) [2020-03-06T08:44:47Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1113:3315, db1113:3316 for upgrade - T239791', diff saved to https://phabricator.wikimedia.org/P10640 and previous config saved to /var/cache/conftool/dbconfig/20200306-084439-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-06T08:47:51Z] <marostegui> Stop mysql for db1113:3315, db1113:3316 for upgrade T239791

Marostegui updated the task description. Fri, Mar 6, 8:52 AM

Mentioned in SAL (#wikimedia-operations) [2020-03-06T08:53:33Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1113:3315, db1113:3316 after upgrade - T239791', diff saved to https://phabricator.wikimedia.org/P10641 and previous config saved to /var/cache/conftool/dbconfig/20200306-085332-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-06T08:56:37Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1074 for upgrade T239791', diff saved to https://phabricator.wikimedia.org/P10642 and previous config saved to /var/cache/conftool/dbconfig/20200306-085435-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-06T08:56:41Z] <marostegui> Stop MySQL on db1074 for upgrade T239791

Marostegui updated the task description. Fri, Mar 6, 8:57 AM

Mentioned in SAL (#wikimedia-operations) [2020-03-09T14:47:52Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1121 T239791', diff saved to https://phabricator.wikimedia.org/P10662 and previous config saved to /var/cache/conftool/dbconfig/20200309-144752-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-09T14:48:24Z] <marostegui> Restart and upgrade mysql on db1121 T239791

Marostegui updated the task description. Mon, Mar 9, 2:51 PM

Mentioned in SAL (#wikimedia-operations) [2020-03-09T14:52:32Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1121 T239791', diff saved to https://phabricator.wikimedia.org/P10663 and previous config saved to /var/cache/conftool/dbconfig/20200309-145232-marostegui.json

Marostegui updated the task description. Mon, Mar 9, 3:04 PM

Mentioned in SAL (#wikimedia-operations) [2020-03-09T15:13:11Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Promote es1016 to es1 master, this is a NOOP T239791', diff saved to https://phabricator.wikimedia.org/P10664 and previous config saved to /var/cache/conftool/dbconfig/20200309-151310-marostegui.json

Marostegui updated the task description. Mon, Mar 9, 3:17 PM

Mentioned in SAL (#wikimedia-operations) [2020-03-09T15:17:51Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1012 T239791', diff saved to https://phabricator.wikimedia.org/P10665 and previous config saved to /var/cache/conftool/dbconfig/20200309-151751-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-09T15:29:23Z] <marostegui> Upgrade mysql on es1012 T239791

Marostegui updated the task description. Mon, Mar 9, 3:31 PM

Mentioned in SAL (#wikimedia-operations) [2020-03-10T08:25:53Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Promote es1012 back to es1 master, this is a NOOP T239791', diff saved to https://phabricator.wikimedia.org/P10671 and previous config saved to /var/cache/conftool/dbconfig/20200310-082552-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-16T09:30:49Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Promote es1011 to es2 master, this is a NOOP T239791', diff saved to https://phabricator.wikimedia.org/P10700 and previous config saved to /var/cache/conftool/dbconfig/20200316-093048-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-16T09:32:29Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1015 for upgrade and restart T239791', diff saved to https://phabricator.wikimedia.org/P10701 and previous config saved to /var/cache/conftool/dbconfig/20200316-093228-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-16T10:10:57Z] <marostegui> Stop mysql for upgrade on es1015 T239791

Marostegui updated the task description. Mon, Mar 16, 10:12 AM

Mentioned in SAL (#wikimedia-operations) [2020-03-20T07:09:23Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Promote es1014 to es3 master, this is a NOOP T239791', diff saved to https://phabricator.wikimedia.org/P10734 and previous config saved to /var/cache/conftool/dbconfig/20200320-070922-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-20T07:09:46Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1017 for update T239791', diff saved to https://phabricator.wikimedia.org/P10735 and previous config saved to /var/cache/conftool/dbconfig/20200320-070945-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2020-03-20T07:26:06Z] <marostegui> Restart mysql on es1017 for upgrade - T239791

Marostegui updated the task description. Fri, Mar 20, 7:30 AM

Only the s1-s8 and x1 masters are still pending.