This host has failed to reboot a few times, so it is probably another case of T216240: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092
@Papaul can you upgrade db2084 to the latest firmware and BIOS?
Thanks
Status | Subtype | Assigned | Task
---|---|---|---
Declined | | None | T216240 Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092
Resolved | | Papaul | T241103 Upgrade BIOS and firmware on db2084
Mentioned in SAL (#wikimedia-operations) [2019-12-19T09:41:36Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2084:3314, db2084:3315 T241103', diff saved to https://phabricator.wikimedia.org/P9966 and previous config saved to /var/cache/conftool/dbconfig/20191219-094135-marostegui.json
I have depooled this host.
So before acting on it, we just need to set downtime and stop MySQL.
Before
BIOS Version: 2.4.3
Firmware Version: 2.40.40.40
After
BIOS Version: 2.11.0
Firmware Version: 2.70.70.70
Upgrade complete
I tried to start MySQL with "systemctl start mariadb" but it is not starting:
journalctl -u mariadb
-- Logs begin at Fri 2020-01-03 16:49:53 UTC, end at Fri 2020-01-03 17:10:44 U
Jan 03 17:06:56 db2084 systemd[1]: [/lib/systemd/system/mariadb.service:71] Un
Jan 03 17:07:07 db2084 systemd[1]: [/lib/systemd/system/mariadb.service:71] Un
Jan 03 17:07:07 db2084 systemd[1]: Starting mariadb database server...
Jan 03 17:07:07 db2084 systemd[1]: mariadb.service: Control process exited, co
Jan 03 17:07:07 db2084 systemd[1]: Failed to start mariadb database server.
Jan 03 17:07:07 db2084 systemd[1]: mariadb.service: Unit entered failed state.
Jan 03 17:07:07 db2084 systemd[1]: mariadb.service: Failed with result 'exit-
I am leaving it to you. Thanks
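For anyone retracing this: the journal lines quoted above are cut off at the terminal width, so the actual error text is missing. A minimal sketch of how the full messages could be collected, assuming the unit is named "mariadb" as in the command above. The helper name `show_unit_failure` is hypothetical, and the `list-units` step is only there in case the host runs its instances under differently named units, which this task does not confirm.

```shell
# Hypothetical helper; the unit name defaults to "mariadb" as used in this task.
show_unit_failure() {
    unit="${1:-mariadb}"
    # List every unit whose name starts with $unit, including inactive and failed ones.
    systemctl list-units --all --no-pager "${unit}*"
    # Print journal entries for the unit from the current boot; with --no-pager
    # the lines are not truncated to the terminal width.
    journalctl -u "$unit" -b --no-pager
}
```

Calling `show_unit_failure` with no argument checks `mariadb`; a different unit name can be passed as the first argument.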
Mentioned in SAL (#wikimedia-operations) [2020-01-03T17:44:48Z] <jynus@cumin1001> dbctl commit (dc=all): 'Repool db2084 instances T241103', diff saved to https://phabricator.wikimedia.org/P10035 and previous config saved to /var/cache/conftool/dbconfig/20200103-174447-jynus.json
I have started the MySQL instances and replication again, since load on codfw is low.
I saw two minor issues @Marostegui, not important in this case, but FYI, especially for new hosts.
Assigning to you for follow up/closing/etc.
Thanks for catching this. I am not sure what could have happened. I have not compiled any 10.1 packages, and I do not recall any conversation about changes to the unit, so maybe it was a one-time thing on db2084, or something was manually enabled there?
Probably not worth investigating/changing 10.1 packages anymore, as the idea would just be to keep compiling the next 10.1 but with the goal of migrating to 10.3/10.4.
Thanks guys for handling this!