@fgiunchedi 12 new systems arrived. Please let me know how you would like these racked, and whether there are any particular racks you prefer.
Description
Details
Status | Subtype | Assigned | Task
---|---|---|---
Resolved | | fgiunchedi | T160640 Rack and Setup ms-be1028-ms-1039
Resolved | | fgiunchedi | T163673 Some swift disks wrongly mounted on 5 ms-be hosts
Event Timeline
racking will be 3x systems per row, 10G where possible or 1G where we can't do 10G (see also T148647)
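A minimal sketch of the "3x systems per row" spread, assuming four rows named A to D (the row names are hypothetical; the thread only gives 12 systems at 3 per row). The helper `spread_across_rows` is illustrative and does not reflect the actual rack assignments, which are not recorded here.

```python
from itertools import cycle


def spread_across_rows(hosts, rows, per_row):
    """Assign hosts to rows round-robin, so no row exceeds per_row hosts."""
    if len(hosts) > len(rows) * per_row:
        raise ValueError("not enough row capacity for all hosts")
    placement = {row: [] for row in rows}
    # zip stops once the finite host list is exhausted
    for host, row in zip(hosts, cycle(rows)):
        placement[row].append(host)
    return placement


# ms-be1028 through ms-be1039: the 12 new systems
hosts = [f"ms-be{n}" for n in range(1028, 1040)]
rows = ["A", "B", "C", "D"]  # assumed row names, not taken from the task

placement = spread_across_rows(hosts, rows, per_row=3)
for row, members in placement.items():
    print(row, members)
```

Round-robin assignment keeps the rows balanced even if the host count is not an exact multiple of the row count.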
reporting from IRC
17:51 <godog> cmjohnson1: it is easier to think about it if they are more spread equally among rows, 3x per row should do it, it matters less 10G vs 1G in this case
Change 343660 had a related patch set uploaded (by Cmjohnson):
[operations/dns] Adding dns entries for new ms-be servers T160640
Change 343660 merged by Cmjohnson:
[operations/dns] Adding dns entries for new ms-be servers T160640
@fgiunchedi All the servers are racked and cabled, and for the most part the ILO is set up. On-site work still needed: the last few ILO configs and the RAID setup. I still need to update the switch, dhcpd, netboot.cfg and racktables. This should all be completed tomorrow (Thursday).
Change 344646 had a related patch set uploaded (by Cmjohnson):
[operations/puppet@production] T160640 Adding dhcpd entries and netboot.cfg for new swift servers ms-be1028-39
Change 344646 merged by Cmjohnson:
[operations/puppet@production] T160640 Adding dhcpd entries and netboot.cfg for new swift servers ms-be1028-39
Change 344663 had a related patch set uploaded (by Cmjohnson):
[operations/dns@master] T160640 Adding dns entries for production new swift servers ms-be1028-1036
Change 344663 merged by Cmjohnson:
[operations/dns@master] T160640 Adding dns entries for production new swift servers ms-be1028-1036
@Cmjohnson thanks!
I've fixed the RAID (cf. https://wikitech.wikimedia.org/wiki/Platform-specific_documentation/HP_DL3N0_Gen9#ms-be_RAID0_config) on the machines that were able to get to debian-installer via PXE, and these are now reinstalling:
ms-be1028, ms-be1029, ms-be1030, ms-be1035, ms-be1037, ms-be1038
These don't seem to successfully PXE boot into debian-installer, could you take a look at why that is?
ms-be1031, ms-be1032, ms-be1033, ms-be1034
For these I don't seem to be able to reach the console:
ms-be1036, ms-be1039
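As a quick sanity check on the three host lists above, the sketch below confirms that they cover all of ms-be1028 through ms-be1039 with no host listed twice. The group labels and the `expand` helper are illustrative names, not taken from any tooling mentioned in this task.

```python
def expand(first, last, prefix="ms-be"):
    """Expand an inclusive numeric range into host names."""
    return [f"{prefix}{n}" for n in range(first, last + 1)]


all_hosts = set(expand(1028, 1039))

# Group labels are hypothetical; membership is copied from the comment above.
groups = {
    "reinstalling": {"ms-be1028", "ms-be1029", "ms-be1030",
                     "ms-be1035", "ms-be1037", "ms-be1038"},
    "pxe_failing": set(expand(1031, 1034)),
    "no_console": {"ms-be1036", "ms-be1039"},
}

seen = set()
duplicates = set()
for members in groups.values():
    duplicates |= seen & members  # hosts that appear in more than one group
    seen |= members

missing = all_hosts - seen
print("missing:", sorted(missing))
print("duplicates:", sorted(duplicates))
```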
Change 345290 had a related patch set uploaded (by Filippo Giunchedi):
[operations/puppet@production] swift: add ms-be1028 -> ms-be1039
Change 345290 merged by Filippo Giunchedi:
[operations/puppet@production] swift: add ms-be1028 -> ms-be1039
Updated the MAC addresses for 1031-34. The console issue with 1036 was a cable that was not in the correct port. For 1039, I fat-fingered the mgmt IP address during setup.
1031's port was also a member of the labs-instances vlan. I removed the port from there and disabled/enabled it, and now 1031 can PXE boot.
1031 / 1032 / 1033 still had their 10G interfaces enabled and thus the 1G interfaces would show up starting from eth2. I've disabled the 10G interfaces on all three and the machines are now installing.
Mentioned in SAL (#wikimedia-operations) [2017-03-30T14:38:29Z] <godog> run stress test (w/ bonnie) on new swift hw - T160640
Mentioned in SAL (#wikimedia-operations) [2017-03-30T18:24:28Z] <godog> swift eqiad-prod add ms-be1028 -> ms-be1039 - T160640
Change 345816 had a related patch set uploaded (by Filippo Giunchedi):
[operations/puppet@production] swift: increase max_connections for object server rsync
Change 345816 merged by Filippo Giunchedi:
[operations/puppet@production] swift: increase max_connections for object server rsync
Just powercycled ms-be1016, which was stuck (pingable but no ssh available). Its console showed:
[11674384.225319] BUG: soft lockup - CPU#12 stuck for 22s! [migration/12:149]
When I powercycled it I saw:
error: diskfilter writes are not supported. Press any key to continue...
@elukey yes it's a known problem. I have the new part but @fgiunchedi is out this week. We'll take care of it next week. https://phabricator.wikimedia.org/T150206 <<task for ms-be1016
Mentioned in SAL (#wikimedia-operations) [2017-04-24T13:49:53Z] <godog> swift eqiad-prod: more weight on ms-be1028 -> ms-be1039 - T160640
Mentioned in SAL (#wikimedia-operations) [2017-05-08T08:25:11Z] <godog> swift eqiad-prod: ms-be1028/ms-be1039 container/account full weight - T160640
Mentioned in SAL (#wikimedia-operations) [2017-05-08T09:30:24Z] <godog> swift eqiad-prod: ms-be1028/ms-be1039 object weight 2000 - T160640
Mentioned in SAL (#wikimedia-operations) [2017-05-15T08:29:46Z] <godog> swift eqiad-prod: ms-be1028/ms-be1039 object weight 3000 - T160640
Mentioned in SAL (#wikimedia-operations) [2017-05-23T09:49:24Z] <godog> swift eqiad-prod: ms-be1028/ms-be1039 object weight 3500 - T160640
All hosts are at weight 4000 and in service; the decom task for the corresponding old hw is T166489.
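The object weight ramp recorded in the SAL entries above can be summarised as a fraction of the final weight. This is only a sketch over the values logged in this task: the date of the final step to 4000 is not recorded, so it is left as `None`, and the `ramp_fractions` helper is a hypothetical name.

```python
FULL_WEIGHT = 4000  # final object weight stated in the closing comment

# (date, object weight) as logged in SAL; the last date is not recorded
steps = [
    ("2017-05-08", 2000),
    ("2017-05-15", 3000),
    ("2017-05-23", 3500),
    (None, 4000),
]


def ramp_fractions(steps, full_weight):
    """Return (date, weight, fraction of full weight), checking the ramp only increases."""
    weights = [w for _, w in steps]
    if weights != sorted(weights):
        raise ValueError("weights must be non-decreasing")
    return [(date, w, w / full_weight) for date, w in steps]


for date, weight, fraction in ramp_fractions(steps, FULL_WEIGHT):
    print(date or "date not recorded", weight, f"{fraction:.1%}")
```

Raising the weight in stages like this limits how much data has to move onto the new hosts at any one time.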