determine/process/document bios firmware tracking/updating policies
Open, Medium, Public

Description

This task was partially generated due to install/firmware issues during the setup of T139171.

The firmware versions of our fleet are not tracked in any central location. We also only update firmware as problems arise. After some brief discussion in IRC, this task was created to better track discussion/ideas/etc.

Points to review:

  • How often should we update the firmware of bios/controllers/ilom?
  • Should we always update the firmware at the time of receiving the servers?
    • Both HP and Dell are guilty of sending us hardware with firmware that is multiple revisions out of date.
  • How can we poll the systems for firmware revisions and where to best record/track/display this information?

Event Timeline

Should we always update the firmware at the time of receiving the servers?

Personally, I probably would. If there are other pre-existing servers of the same model, they can act as a test box.

How can we poll the systems for firmware revisions and where to best record/track/display this information?

It's been a few years since I've played with it, but Dell OpenManage can handle that, iirc (despite its name, it's not open source). There appear to be a couple of Icinga plugins to go with it as well (but I haven't checked which features they cover).
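As a vendor-neutral starting point for the polling question, here is a minimal sketch in Python. It assumes each host can run `dmidecode` (which normally needs root) and that the per-host results are collected somewhere central. The function names and the hostnames in the usage note are hypothetical, and this is not how OpenManage or any Icinga plugin works. It only reports models that run more than one BIOS version and does not try to decide which version is newer.

```python
import subprocess
from collections import defaultdict

# Keywords accepted by `dmidecode -s`; running it usually requires root.
DMI_KEYWORDS = (
    "system-manufacturer",
    "system-product-name",
    "bios-version",
    "bios-release-date",
)


def read_local_firmware(run=subprocess.check_output):
    """Return a dict of DMI keyword -> value for the local host.

    `run` is injectable so the function can be exercised without dmidecode.
    """
    info = {}
    for key in DMI_KEYWORDS:
        out = run(["dmidecode", "-s", key], text=True)
        info[key] = out.strip()
    return info


def group_versions_by_model(reports):
    """Group hosts by model, then by BIOS version.

    `reports` maps hostname -> dict as returned by read_local_firmware().
    Returns {model: {bios_version: [hostnames sorted alphabetically]}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for host, info in sorted(reports.items()):
        grouped[info["system-product-name"]][info["bios-version"]].append(host)
    return {model: dict(versions) for model, versions in grouped.items()}


def mixed_models(grouped):
    """Return the models that run more than one BIOS version in the fleet."""
    return sorted(model for model, versions in grouped.items() if len(versions) > 1)
```

For example, feeding `group_versions_by_model` reports from two hypothetical hosts of the same model with different `bios-version` values would make `mixed_models` list that model, which is the kind of drift a central dashboard or alert could surface.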

RobH closed subtask Restricted Task as Invalid.

I'll take this as an action item to discuss during our next staff meeting. I gave our Dell account rep a call today to ask when the latest firmware/bios upgrades get flashed before new hardware is shipped out to us. He'll follow up with one of their sys admins and get back to me this week.

Thanks,
Willy

Update - per Dell, there's up to a 30-day delay between the time bios/firmware upgrades are posted on the web and the time they are approved for factory installation. So some of the recent bios upgrades we had to perform (like on the cp servers) most likely fell into this bucket.

Chatted with @Volans yesterday for a little bit on the best way to approach firmware upgrades going forward. My preference is that service owners have ownership and the ability to do it remotely, since that would eliminate a lot of the back-and-forth coordination that would have to happen if dc-ops owned it. Because reboots are not required for Dells (they are for HPs), Riccardo had a good suggestion that we could potentially combine firmware upgrades with the kernel upgrades. Dell has a tool that we could potentially use, which may have improved in recent years, so I'll set something up with them to provide us a demo in the meantime. Thanks, Willy

Demo for Dell's System Management Tool is set up for next Monday, June 8, to evaluate whether it's something we want to use going forward or something the Infra Foundations can use as a blueprint internally for firmware/bios upgrades.

Please include me in this, as the last time we evaluated it, it didn't meet our open source OS requirements (it used to require a Windows server at some point in the network). =]

Sure, no problem @RobH. I just asked Paul to add you to the invite.

Marostegui closed subtask Restricted Task as Resolved. May 25 2021, 9:41 AM
Aklapper added a subscriber: wiki_willy.

Removing task assignee due to inactivity as this open task has been assigned for more than two years. See the email sent to the task assignee on August 22nd, 2022.
Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome!
If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".
Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips how to best manage your individual work in Phabricator. Thanks!