
Upgrade BIOS and firmware on db2084
Closed, ResolvedPublic

Description

This host has failed to reboot a few times, so it is probably another case of T216240: Reboot, upgrade firmware and kernel of db1096-db1106, db2071-db2092
@Papaul can you upgrade db2084 to the latest firmware and BIOS?

Thanks

Event Timeline

Restricted Application added a subscriber: Aklapper. Thu, Dec 19, 6:26 AM
Marostegui triaged this task as Medium priority. Thu, Dec 19, 6:27 AM
Marostegui moved this task from Triage to Blocked external/Not db team on the DBA board.

Mentioned in SAL (#wikimedia-operations) [2019-12-19T09:41:36Z] <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2084:3314, db2084:3315 T241103', diff saved to https://phabricator.wikimedia.org/P9966 and previous config saved to /var/cache/conftool/dbconfig/20191219-094135-marostegui.json

I have depooled this host.
So before acting on it we just need to stop downtime + stop MySQL

Papaul reassigned this task from Papaul to jcrespo. Fri, Jan 3, 5:18 PM

Before:
  BIOS Version: 2.4.3
  Firmware Version: 2.40.40.40
After:
  BIOS Version: 2.11.0
  Firmware Version: 2.70.70.70

Upgrade complete
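
As a quick sanity check on the versions reported above, here is a minimal sketch that compares dotted version strings numerically. The helper names `parse_version` and `is_newer` are made up for illustration and are not part of any tooling mentioned in this task. The point is that a plain string comparison would rank 2.11.0 below 2.4.3, so the parts need to be compared as integers.

```python
def parse_version(text):
    """Turn a dotted version such as '2.11.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_newer(before, after):
    """Return True if `after` is a strictly higher version than `before`."""
    return parse_version(after) > parse_version(before)


# Values reported in this task (before, after).
reported = {
    "BIOS": ("2.4.3", "2.11.0"),
    "Firmware": ("2.40.40.40", "2.70.70.70"),
}

for name, (before, after) in reported.items():
    print(name, before, "->", after, "newer:", is_newer(before, after))
```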

I tried to start MySQL with "systemctl start mariadb" but it does not start.

journalctl -u mariadb
-- Logs begin at Fri 2020-01-03 16:49:53 UTC, end at Fri 2020-01-03 17:10:44 U
Jan 03 17:06:56 db2084 systemd[1]: [/lib/systemd/system/mariadb.service:71] Un
Jan 03 17:07:07 db2084 systemd[1]: [/lib/systemd/system/mariadb.service:71] Un
Jan 03 17:07:07 db2084 systemd[1]: Starting mariadb database server...
Jan 03 17:07:07 db2084 systemd[1]: mariadb.service: Control process exited, co
Jan 03 17:07:07 db2084 systemd[1]: Failed to start mariadb database server.
Jan 03 17:07:07 db2084 systemd[1]: mariadb.service: Unit entered failed state.
Jan 03 17:07:07 db2084 systemd[1]: mariadb.service: Failed with result 'exit-

I am leaving it to you. Thanks
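
The journal excerpt above is cut off at the terminal width, so the actual error text is not visible. A minimal sketch of one way to capture the full lines, assuming Python 3.7+ and a host with systemd: the function names `journal_command` and `read_unit_journal` are hypothetical, while `--no-pager`, `-u` and `--since` are standard `journalctl` options.

```python
import shutil
import subprocess


def journal_command(unit, since=None):
    """Build a journalctl command line that prints untruncated lines for a unit."""
    cmd = ["journalctl", "--no-pager", "-u", unit]
    if since is not None:
        cmd += ["--since", since]
    return cmd


def read_unit_journal(unit, since=None):
    """Return the journal text for a unit, or None if journalctl is not installed."""
    if shutil.which("journalctl") is None:
        return None
    result = subprocess.run(
        journal_command(unit, since),
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit should not raise here
    )
    return result.stdout
```

For example, `read_unit_journal("mariadb", since="2020-01-03 17:00")` would return the complete log lines for the failed start attempts, rather than lines clipped to the screen width.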

Mentioned in SAL (#wikimedia-operations) [2020-01-03T17:44:48Z] <jynus@cumin1001> dbctl commit (dc=all): 'Repool db2084 instances T241103', diff saved to https://phabricator.wikimedia.org/P10035 and previous config saved to /var/cache/conftool/dbconfig/20200103-174447-jynus.json

jcrespo reassigned this task from jcrespo to Marostegui. Fri, Jan 3, 5:54 PM
jcrespo added a subscriber: jcrespo.

I have started the MySQL instances and replication again, since load on codfw is low.

I saw two minor issues, @Marostegui. They are not important in this case, but FYI, especially for new hosts:

  • mariadb.service was enabled; I disabled it and reloaded the failed units. Not sure if the package changed or if it is something we should do on new multi-instance hosts (or fix in the package).
  • The new package was compiled with the buster-only create temp dir option, which fails non-fatally on restart. That option was only intended for buster packages and could cause issues on reimages of non-buster hosts (but I would not change it for existing hosts).

Assigning to you for follow up/closing/etc.
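
The first point above could be checked on other multi-instance hosts with something like the following sketch, assuming Python 3.7+ and systemd. The function names `unit_state` and `needs_disabling` are hypothetical; `systemctl is-enabled` is a standard systemd command that prints states such as "enabled", "disabled", "static" or "masked".

```python
import shutil
import subprocess


def unit_state(unit):
    """Return the `systemctl is-enabled` state of a unit, or None if unavailable."""
    if shutil.which("systemctl") is None:
        return None
    result = subprocess.run(
        ["systemctl", "is-enabled", unit],
        capture_output=True,
        text=True,
        check=False,  # is-enabled exits non-zero for disabled units
    )
    return result.stdout.strip() or None


def needs_disabling(state):
    """True if the generic unit would be started by systemd on its own."""
    return state in {"enabled", "enabled-runtime"}


state = unit_state("mariadb.service")
print("mariadb.service:", state, "needs disabling:", needs_disabling(state))
```

If the unit turns out to be enabled, `systemctl disable mariadb.service` followed by `systemctl reset-failed` is one way to clean up; whether that matches exactly what was run on db2084 is an assumption.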

Marostegui closed this task as Resolved. Tue, Jan 7, 6:56 AM
Marostegui reassigned this task from Marostegui to Papaul.

Thanks for catching this. I am not sure what could have happened. I have not compiled 10.1 packages, and I do not recall any conversation about changing the unit, so maybe it was a one-time thing on db2084, or something was enabled manually there?
It is probably not worth investigating or changing the 10.1 packages anymore, as the plan is just to keep compiling the next 10.1 releases while aiming to migrate to 10.3/10.4.

Thanks guys for handling this!