While investigating odd CPU0 load on lvs1014, it turned out that not all of the Broadcom NICs in use on our LVSen run the same firmware version; see P9437. Firmware issues have caused problems before: see T203194#4880083 and onwards.
It would be nice to track which versions are in use where, ideally with some history over time. Two options:
- As a custom Puppet fact, likely by extending the net_driver custom fact we already export. It does not seem too hard to have that Ruby also invoke ethtool -i and parse the output, probably just copying the firmware-version: line into a firmware_version key.
- As a Prometheus metric, likely exported via a textfile exporter invoked by a systemd timer. The metric value would always be 1 and the labels would carry all the data, in the same style as is recommended for exporting software versions. It would look something like nic_firmware_version{instance="lvs1001:9xxx",interface="enp4s0f0",driver="bnx2x",firmware_version="FFV14.10.07 bc 7.14.11"} 1.
Possibly we'd want to do both?
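
For the Prometheus option, here is a rough sketch of what the textfile script's core could look like. It is only an illustration, not a proposed implementation: the function names are made up, the sample text is a hand-written stand-in for `ethtool -i` output (only the firmware string comes from the example above), and the `instance` label is left out on the assumption that Prometheus attaches it at scrape time.

```python
import subprocess

# Hand-written stand-in for `ethtool -i enp4s0f0` output; the bus-info value
# is invented for illustration.
SAMPLE_ETHTOOL_OUTPUT = """\
driver: bnx2x
firmware-version: FFV14.10.07 bc 7.14.11
bus-info: 0000:04:00.0
"""


def parse_ethtool_info(output):
    """Turn `key: value` lines into a dict, mapping '-' in keys to '_'."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip().replace("-", "_")] = value.strip()
    return info


def escape_label(value):
    """Escape backslash, double quote and newline for a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(interface, info):
    """Render one nic_firmware_version sample with a constant value of 1."""
    labels = {
        "interface": interface,
        "driver": info.get("driver", ""),
        "firmware_version": info.get("firmware_version", ""),
    }
    body = ",".join(f'{k}="{escape_label(v)}"' for k, v in labels.items())
    return f"nic_firmware_version{{{body}}} 1"


def read_interface(interface):
    """Run `ethtool -i` for one interface (not called in this sketch)."""
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_info(result.stdout)
```

A real script would loop over the physical interfaces, call `read_interface` for each, and write the rendered lines to a file in the textfile collector directory, ideally via a temporary file and rename so a scrape never sees a partial write. The same `firmware_version` key naming would line up with what the Puppet fact option proposes.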