
redfish: minimum version support
Open, Medium, Public

Description

Currently we have a lot of different iDRAC versions installed, which has meant adding some hacks to the redfish module in spicerack to handle the different Redfish versions. As such, it would be nice to agree upon a minimum iDRAC version and work towards upgrading everything to it.

I have done a bit of research and there seem to be two versions worth highlighting as milestones in the upgrade path:

  • 3.30.30.30: adds HttpPushUri; anything below this requires a manual upgrade to this version before progressing
    • This is the first version supported by the firmware upgrade cookbook
    • You must upgrade to this version before progressing to anything newer (i.e. you can't go straight to 6.*)
    • Once on this version, it appears that you can upgrade to the most recent version using the current cookbook
    • Unfortunately the Supermicro update service does not support HttpPushUri
  • 4.40.00.00: introduces MultipartHttpPushUri, which is preferred
    • This allows for a nicer interface for uploading firmware files
    • This method is supported by Supermicro as well
  • 6.10.0.0: the current iDRAC version
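
As a rough illustration of the preference described above, the sketch below picks an upload method from a Redfish UpdateService payload, preferring MultipartHttpPushUri and falling back to HttpPushUri. The function name `preferred_push_method` and the sample payloads (including their URI values) are hypothetical and trimmed for illustration; this is not the code used in spicerack or the cookbook.

```python
from typing import Optional


def preferred_push_method(update_service: dict) -> Optional[str]:
    """Return the preferred firmware upload property advertised by the BMC.

    `update_service` is assumed to be the decoded JSON of the Redfish
    UpdateService resource. Returns None when neither property is present,
    which here corresponds to hosts that need a manual upgrade first.
    """
    # Prefer the multipart endpoint, then fall back to the plain push URI.
    for prop in ("MultipartHttpPushUri", "HttpPushUri"):
        if update_service.get(prop):
            return prop
    return None


# Hypothetical, trimmed payloads; the URI values are placeholders.
older_than_3_30 = {}
at_3_30 = {"HttpPushUri": "/example/push"}
at_4_40 = {
    "HttpPushUri": "/example/push",
    "MultipartHttpPushUri": "/example/multipart-push",
}

for payload in (older_than_3_30, at_3_30, at_4_40):
    print(preferred_push_method(payload))
```

Checking the advertised properties rather than the version number would, in principle, cover the Supermicro case as well, since those hosts only advertise the multipart property.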

Please note that the above numbers relate to generation 14 (iDRAC9); I'll need to look separately at the gen 13 (iDRAC8) cards.

Ideally I think it would be nice to get everything up to the current version (6.*). However, as a priority I think we should upgrade everything with 3.30.30.30 <= version < 4.40.0.0. This would allow us to, at the very least, migrate the current cookbooks to all use the MultipartHttpPushUri API.

Anything below 3.30.30.30 would need a manual upgrade to 3.30.30.30, so I think it's reasonable to say that those machines need two manual updates before redfish is supported, i.e. $current -> 3.30.30.30, then 3.30.30.30 -> $latest
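
The two-step path can be sketched as a small helper that returns the sequence of target versions for a host. This is only an illustration: `upgrade_path`, `parse_version` and the constant names are hypothetical, and the default for `latest` is the 6.10.0.0 version named above.

```python
MANUAL_BASELINE = "3.30.30.30"  # first version the description treats as upgradable from
LATEST = "6.10.0.0"             # current iDRAC version per the description


def parse_version(version: str) -> tuple:
    """Turn a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def upgrade_path(current: str, latest: str = LATEST) -> list:
    """Return the ordered list of versions a host has to be upgraded to."""
    at = parse_version(current)
    hops = []
    # Hosts below the baseline must stop at 3.30.30.30 first.
    if at < parse_version(MANUAL_BASELINE):
        hops.append(MANUAL_BASELINE)
        at = parse_version(MANUAL_BASELINE)
    # From the baseline onwards a single hop to the latest version remains.
    if at < parse_version(latest):
        hops.append(latest)
    return hops


for version in ("2.50.50.50", "3.30.30.30", "4.20.20.20", "6.10.0.0"):
    print(version, "->", upgrade_path(version))
```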

The following lists indicate which hosts still need a manual update and which hosts can be upgraded automatically but are currently unsupported:

Hosts that require Manual upgrade (260):
2.20.20.20 (1)

stat1004

2.30.30.30 (5)

labstore1004, puppetmaster1001, puppetmaster1002, thumbor1001, thumbor1002

2.40.40.40 (45)

analytics1058, analytics1059, analytics1060, analytics1061, analytics1062, analytics1063,
analytics1064, analytics1065, analytics1066, analytics1067, analytics1069, druid1004, druid1005,
druid1006, dumpsdata1001, dumpsdata1002, flerovium, furud, kafka-jumbo1001, kafka-jumbo1002,
kafka-jumbo1003, kafka-jumbo1004, kafka-jumbo1005, krb2001, ores1001, ores1002, ores1003, ores1004,
ores1005, ores1006, ores1007, ores1008, ores1009, ores2001, ores2002, ores2003, ores2004, ores2006,
ores2007, ores2008, ores2009, restbase1017, restbase1018, restbase2012, wdqs1005

2.43.43.43 (1)

stat1005

2.50.50.50 (94)

an-coord1002, an-launcher1002, an-presto1001, an-presto1002, an-presto1003, an-presto1005,
an-worker1078, an-worker1079, an-worker1080, an-worker1081, an-worker1082, an-worker1083,
an-worker1084, an-worker1085, an-worker1086, an-worker1087, an-worker1088, an-worker1089,
an-worker1090, an-worker1091, an-worker1092, an-worker1093, an-worker1094, an-worker1095,
analytics1068, analytics1070, analytics1071, analytics1072, analytics1073, analytics1074,
analytics1075, analytics1076, analytics1077, cloudelastic1001, cloudelastic1002, cloudelastic1004,
db1108, db1110, db1111, db1112, db1113, db1114, db1115, db1117, db1118, db1119, db1121, db1122,
db1123, lvs1014, ms-be1040, ms-be1041, ms-be1042, ms-be1043, ms-be2041, ms-be2042, ms-be2043,
mw2259, mw2260, mw2261, mw2262, mw2263, mw2265, mw2266, mw2267, mw2268, mw2269, mw2270, mw2271,
mw2272, mw2273, mw2274, mw2275, mw2276, mw2277, mw2278, mw2279, mw2281, mw2282, mw2283, mw2284,
mw2285, mw2286, mw2287, mw2288, mw2289, mw2290, wdqs1004, wdqs1006, wdqs1007, wdqs1008, wdqs1010,
wdqs2004, wdqs2006

2.52.52.52 (1)

labstore1005

2.61.60.60 (1)

restbase1016

2.63.60.61 (2)

cloudelastic1003, puppetmaster2002

2.75.75.75 (2)

an-presto1004, kafka-jumbo1006

2.80.80.80 (2)

db1106, mw2264

2.81.81.81 (2)

contint2001, ores2005

2.82.82.82 (1)

wdqs2005

2.83.83.83 (7)

lvs1013, lvs1015, lvs1016, ms-be2040, puppetmaster2001, thumbor2003, thumbor2004

3.15.17.15 (22)

an-coord1001, backup1001, bast2002, cloudservices1004, cloudvirt1023, cumin1001, db1124, db1125,
dbproxy1012, dbproxy1013, dbproxy1014, dbproxy1015, dbproxy1016, dbproxy1017, mwmaint1002, pki1001,
puppetmaster1003, rdb1009, rdb1010, scandium, snapshot1008, snapshot1009

3.21.21.21 (70)

an-master1001, an-master1002, cloudcontrol2001-dev, cloudvirt1026, cloudvirt1027, cloudvirt1028,
cloudvirt1029, cloudvirt1030, db1126, db1127, db1128, db1129, db1130, db1132, db1134, db1135,
db1136, db1137, db1138, db1183, db2096, db2114, dbstore1003, dbstore1005, elastic2039, elastic2040,
elastic2041, elastic2042, elastic2044, elastic2045, elastic2046, elastic2047, elastic2048,
elastic2051, elastic2052, elastic2054, logstash2001, logstash2002, ms-be1044, ms-be1045, ms-be1046,
ms-be1047, ms-be1048, ms-be1049, ms-be1050, ms-be2044, ms-be2045, ms-be2046, ms-be2047, ms-be2048,
ms-be2049, restbase1019, restbase1020, restbase1021, restbase1022, restbase1023, restbase1024,
restbase1025, restbase1026, restbase1027, restbase2013, restbase2015, restbase2016, restbase2018,
restbase2019, restbase2020, sessionstore2001, sessionstore2002, sessionstore2003, stat1007

3.21.26.22 (4)

elastic2038, logstash1010, logstash1011, logstash1012

Unsupported hosts which can be upgraded automatically (518):
3.30.30.30 (52)

an-conf1002, an-conf1003, cloudcephmon1001, cloudcephmon1002, cloudcephmon1003, db1133, db2103,
db2104, db2105, db2106, db2109, db2110, db2111, db2113, db2115, db2116, db2117, db2118, db2119,
db2120, dbprov1001, dbprov1002, dbprov2001, dbprov2002, dbproxy1018, dbproxy1019, dbproxy1020,
dbproxy1021, dbproxy2001, dbproxy2002, dbproxy2003, dbproxy2004, ganeti1009, ganeti1010, ganeti1011,
ganeti1012, ganeti1013, ganeti1014, ganeti1015, ganeti1016, ganeti1017, ganeti1018, ganeti1019,
ganeti1020, ganeti1021, ganeti1022, ganeti2018, gerrit1001, kafka-main1001, kafka-main1002,
kafka-main1003, krb1001

3.32.32.32 (2)

cloudbackup2001, cloudbackup2002

3.34.34.34 (50)

an-conf1001, db2121, db2122, db2123, db2124, db2126, db2128, db2129, db2130, db2131, dumpsdata1003,
elastic1054, elastic1057, elastic1067, elastic2050, mw1349, mw1350, mw1351, mw1352, mw1353, mw1354,
mw1355, mw1356, mw1357, mw1358, mw1359, mw1361, mw1362, mw1363, mw1364, mw1365, mw1366, mw1367,
mw1368, mw1369, mw1370, mw1371, mw1372, mw1373, mw1374, mw1375, mw1376, mw1377, mw1378, mw1379,
mw1380, mw1381, mw1382, mw1383, mw1384

3.36.36.36 (21)

cloudvirt-wdqs1002, cloudvirt-wdqs1003, db1131, db2132, db2133, db2134, db2135, es1020, es1021,
es1022, es1023, es1024, es1025, es2020, es2022, es2023, es2024, es2025, kafka-jumbo1007,
kafka-jumbo1008, kafka-jumbo1009

4.0.0.0 (172)

an-druid1001, an-druid1002, cloudelastic1005, druid1007, druid1008, elastic2055, elastic2056,
elastic2057, elastic2058, elastic2059, elastic2060, ganeti2019, ganeti2020, ganeti2021, ganeti2022,
ganeti2023, ganeti2024, htmldumper1001, kubernetes1007, kubernetes1008, kubernetes1009,
kubernetes1010, kubernetes1011, kubernetes1012, kubernetes1013, kubernetes1014, kubernetes2007,
kubernetes2008, kubernetes2009, kubernetes2010, kubernetes2011, kubernetes2012, kubernetes2013,
kubernetes2014, kubestage2001, kubestage2002, mw1385, mw1386, mw1387, mw1388, mw1389, mw1390,
mw1391, mw1392, mw1393, mw1394, mw1395, mw1396, mw1397, mw1398, mw1399, mw1400, mw1401, mw1402,
mw1403, mw1404, mw1405, mw1406, mw1407, mw1408, mw1409, mw1410, mw1411, mw1412, mw1413, mw2291,
mw2292, mw2293, mw2294, mw2295, mw2296, mw2297, mw2298, mw2299, mw2300, mw2301, mw2302, mw2303,
mw2304, mw2305, mw2306, mw2307, mw2308, mw2309, mw2310, mw2311, mw2312, mw2313, mw2314, mw2315,
mw2316, mw2317, mw2318, mw2319, mw2320, mw2321, mw2322, mw2323, mw2324, mw2325, mw2326, mw2327,
mw2328, mw2329, mw2330, mw2331, mw2332, mw2333, mw2334, mw2335, mw2337, mw2338, mw2339, mw2350,
mw2351, mw2352, mw2353, mw2354, mw2355, mw2356, mw2357, mw2358, mw2359, mw2360, mw2361, mw2362,
mw2363, mw2364, mw2365, mw2366, mw2367, mw2368, mw2369, mw2370, mw2371, mw2372, mw2373, mw2374,
mw2375, mw2376, parse2001, parse2002, parse2003, parse2004, parse2005, parse2006, parse2007,
parse2008, parse2009, parse2010, parse2011, parse2012, parse2013, parse2014, parse2015, parse2016,
parse2017, parse2018, parse2019, parse2020, restbase1028, restbase1029, restbase1030, restbase2021,
restbase2022, restbase2023, snapshot1010, stat1008, wdqs1011, wdqs1012, wdqs1013, wdqs2008

4.10.10.10 (83)

alert1001, alert2001, an-test-coord1001, an-test-master1001, an-test-master1002, an-test-worker1001,
an-test-worker1002, an-test-worker1003, an-worker1096, an-worker1097, an-worker1098, an-worker1099,
an-worker1100, an-worker1101, an-worker1102, an-worker1103, an-worker1104, an-worker1105,
an-worker1106, an-worker1107, an-worker1108, an-worker1109, an-worker1110, an-worker1111,
an-worker1112, an-worker1113, an-worker1114, an-worker1115, an-worker1116, an-worker1117,
backup1002, backup2002, cloudcephosd1006, cloudcephosd1007, cloudcephosd1008, cloudcephosd1009,
cloudcephosd1011, cloudcephosd1012, cloudcephosd1013, cloudcephosd1014, cloudcephosd2001-dev,
cloudcephosd2002-dev, cloudcephosd2003-dev, cloudcontrol1005, cloudcontrol2004-dev, cloudvirt1031,
cloudvirt1032, cloudvirt1033, cloudvirt1034, cloudvirt1035, cloudvirt1036, cloudvirt1037,
cloudvirt1039, cp2042, db1141, db1142, db1143, db1144, db1145, db1146, db1147, db1148, db1149,
db2137, db2138, db2139, rdb2007, rdb2008, restbase2014, thanos-be1001, thanos-be1002, thanos-be1003,
thanos-be1004, thanos-be2001, thanos-be2002, thanos-be2003, thanos-be2004, thanos-fe1001,
thanos-fe1002, thanos-fe1003, thanos-fe2001, thanos-fe2002, thanos-fe2003

4.20.20.20 (68)

clouddb1013, clouddb1014, clouddb1015, clouddb1016, clouddb1017, clouddb1018, clouddb1019,
clouddb1020, db1151, db1152, db1153, db1154, db1155, db2141, db2142, db2143, db2144, dbprov1003,
dbprov2003, deploy2002, es1026, es1027, es1028, es1029, es1030, es1031, es1032, es1033, es1034,
es2027, es2028, es2029, es2030, es2032, es2033, es2034, kubernetes1017, kubernetes2017,
logstash2033, logstash2034, logstash2035, maps1005, maps1006, maps1007, maps1008, maps1009,
maps1010, maps2005, maps2006, maps2007, maps2008, maps2009, maps2010, ml-serve2001, ml-serve2002,
ml-serve2003, ml-serve2004, ms-be1060, ms-be1061, ms-be1062, ms-be1063, ms-be2057, ms-be2059,
ms-be2060, ms-be2061, mwlog2002, rdb2009, rdb2010

4.22.0.0 (2)

backup2001, es2026

4.22.0.53 (8)

an-tool1010, cloudcephosd1015, db2125, db2127, deploy1002, elastic2037, mw1360, wdqs2007

4.32.10.0 (60)

backup2003, cloudcephmon2004-dev, cloudgw2002-dev, cloudvirt1038, conf2004, conf2005, conf2006,
cumin2002, db1162, db2145, db2146, db2147, db2148, db2149, db2150, db2151, db2152,
kafka-logging2001, kafka-logging2002, kafka-logging2003, moss-fe2001, moss-fe2002, ms-backup2001,
ms-backup2002, mw1451, mw2377, mw2378, mw2379, mw2380, mw2381, mw2382, mw2384, mw2385, mw2386,
mw2387, mw2388, mw2389, mw2390, mw2391, mw2392, mw2393, mw2394, mw2395, mw2396, mw2397, mw2398,
mw2399, mw2400, mw2401, mw2402, mw2403, mw2404, mw2405, mw2406, mw2407, mw2408, mw2409, mw2410,
mw2411, mwmaint2002

The above info was generated with the following script (also available on puppetdb1002:/home/jbond/pql/idrac_upgrade.py)

#!/usr/bin/env python3
from collections import defaultdict
from textwrap import fill

from packaging.version import Version
from pypuppetdb import connect

def main():
    idrac_upgrades = {
        'manual': defaultdict(list),
        'unsupported': defaultdict(list),
    }
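    # Versions at or above good_lower support MultipartHttpPushUri and need no action
    good_lower = Version('4.40.0.0')
    # Versions below unsupported_lower need a manual upgrade first
    # (the description names 3.30.30.30 as the milestone)
    unsupported_lower = Version('3.30.0.0')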
    db = connect()
    pql = """inventory[certname, facts] { facts.firmware_idrac ~ '^[1234].+' }"""
    nodes = db.pql(pql)
    for node in nodes:
        version = Version(node['facts']['firmware_idrac'])
        if version < unsupported_lower:
            idrac_upgrades['manual'][str(version)].append(node['certname'].split('.')[0])
            continue
        if version < good_lower:
            idrac_upgrades['unsupported'][str(version)].append(node['certname'].split('.')[0])
            continue

    manual_count = len([h for hosts in idrac_upgrades['manual'].values() for h in hosts])
    unsupported_count = len([h for hosts in idrac_upgrades['unsupported'].values() for h in hosts])
    print(f'===== Hosts that require Manual upgrade ({manual_count}):')
    for version, hosts in dict(sorted(idrac_upgrades['manual'].items())).items():
        print(f'====== {version} ({len(hosts)})')
        print(fill(', '.join(sorted(hosts)), width=100, break_on_hyphens=False))
        print()

    print(f'\n===== Unsupported hosts which can be upgraded automatically ({unsupported_count}):')
    for version, hosts in dict(sorted(idrac_upgrades['unsupported'].items())).items():
        print(f'====== {version} ({len(hosts)})')
        print(fill(', '.join(sorted(hosts)), width=100, break_on_hyphens=False))
        print()


if __name__ == '__main__':
    raise SystemExit(main())

Event Timeline

jbond triaged this task as Medium priority. Feb 1 2023, 7:05 PM
jbond created this task.
Restricted Application added a subscriber: Aklapper.

Change 885864 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/cookbooks@master] sre.hardware.upgrade-firmware: move version check to earlier

https://gerrit.wikimedia.org/r/885864

Change 885864 merged by jenkins-bot:

[operations/cookbooks@master] sre.hardware.upgrade-firmware: move version check to earlier

https://gerrit.wikimedia.org/r/885864

Change 890828 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/cookbooks@master] sre.hardware.upgrade-firmware: use upload_file if supported

https://gerrit.wikimedia.org/r/890828

Change 890827 had a related patch set uploaded (by Jbond; author: John Bond):

[operations/cookbooks@master] sre.hardware.upgrade-firmware: switch to using _upload_session

https://gerrit.wikimedia.org/r/890827

Change 890827 merged by jenkins-bot:

[operations/cookbooks@master] sre.hardware.upgrade-firmware: switch to using _upload_session

https://gerrit.wikimedia.org/r/890827

Change 890828 merged by jenkins-bot:

[operations/cookbooks@master] sre.hardware.upgrade-firmware: use upload_file if supported

https://gerrit.wikimedia.org/r/890828

@jbond as of 10 March 2023 the latest iDRAC version for the PowerEdge R430 is 2.84, but to run the firmware cookbook Redfish needs the iDRAC to be at least 3.30. So I think we will have to take all PE R430 hosts off the list. We have a total of 93 PE R430s; most of those servers hit the 5-year mark in 2023 and some are older than 5 years, so I think those servers will soon be decommissioned.

What do you think?

The first 51 servers on the list are R430s; since we cannot do anything for those, we are left with 209 servers out of 260.
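
As a quick sanity check of that arithmetic, the snippet below recomputes the numbers from the per-version counts in the manual-upgrade list in the description. It assumes that "the first 51 servers" are the hosts in the three oldest version groups (2.20.20.20, 2.30.30.30 and 2.40.40.40), which matches the counts but is not stated explicitly.

```python
# Per-version host counts copied from the manual-upgrade list in the description.
manual_counts = {
    "2.20.20.20": 1, "2.30.30.30": 5, "2.40.40.40": 45, "2.43.43.43": 1,
    "2.50.50.50": 94, "2.52.52.52": 1, "2.61.60.60": 1, "2.63.60.61": 2,
    "2.75.75.75": 2, "2.80.80.80": 2, "2.81.81.81": 2, "2.82.82.82": 1,
    "2.83.83.83": 7, "3.15.17.15": 22, "3.21.21.21": 70, "3.21.26.22": 4,
}


def version_key(version: str) -> tuple:
    """Sort key that compares dotted versions numerically."""
    return tuple(int(part) for part in version.split("."))


total = sum(manual_counts.values())
# Assumption: the R430 hosts are exactly the three oldest version groups.
oldest_three = sorted(manual_counts, key=version_key)[:3]
r430_hosts = sum(manual_counts[version] for version in oldest_three)
remaining = total - r430_hosts

print(total, r430_hosts, remaining)
```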