relocate/reimage cloudvirt1030 with 10G interfaces
Closed, Resolved (Public)

Description

This host is now drained and ready for re-networking, rebuild, etc.

cloudvirt1030:

  • put system offline in all checks for maint window
  • relocate to 10G rack and update netbox
  • update switch configuration for new primary 10G
  • enable PXE for 10G primary interface
  • attach/cable secondary 10G port for instance traffic, update switch config
  • remove old switch config for 1G ports
  • (update firmware?)
  • PXE boot and reimage system
  • reintroduce system into service cluster

Event Timeline

This server does not have a 10G NIC.

@Andrew @Bstorm Do you want me to put this back where it was?

The quote at T201352 lists the combined QLogic 57800 NIC, which should have both 10Gb and 1Gb ports. I do not know if that matches reality, but that is what it says.

From online photos, I'd expect the NIC to have 4 ports, and 2 of those would be 10Gb, if that's the NIC we actually have in the server.

@Bstorm you are correct, that is the NIC in the server, but the 10G capability would require 10G SFP transceivers. I believe we only have 1G transceivers on-site. @wiki_willy @Andrew @Bstorm do we want to order these for cloudvirt1025-1030?

We will also need Cat6 or Cat6a cables for these, 2 each at 7 m please.

Yes please! Sorry, I didn't realize the cards in those boxes were weird :(

++ @RobH - can you create a related procurement task and look into getting a quote for what the WMCS team needs? Much appreciated in advance. Thanks, Willy

These were ordered with: QLogic 57800 2x10Gb BT + 2x1Gb BT Network Daughter Card

When I log in to the system's DRAC, it shows the following NIC: BRCM 10G/GbE 2+2P 57800-t rNDC

What exactly is being asked for, 10G NIC or just SFP+ & fibers? (We should have on site spares for 10G SFP+ transceivers, just pull one from that?) If we don't have spares, that is an entirely different issue (we should ALWAYS have spare fibers and sfp+ transceivers.)

@Cmjohnson: Everything I see shows 10G nic installed, can you please take a photo of the back of the server and its network connections? I'm not sure where the disconnect is, since you state no 10G but all ordering info and system hardware polling shows 10G NIC.

Pending questions I need addressed to place any orders:

  • confirm cloudvirt1030 has standard 10G sfp+ capable dual ports on its 4 port NIC.
  • detail how many SFP+ transceivers are needed, and how long the single mode fiber optic cable needs to be
    • Is this for within 1 rack? If so, why aren't we using a DAC cable?
    • Is this for a cross-rack 10G connection? If so, where to where and how long?

IRC Update:

  • These are copper-based 1G/10G NICs, and the hosts were in 1G racks, so it wasn't an issue until now.
  • We'll need to swap these out entirely with new network cards, I'll paste in a parent task for this shortly (once it is created)
    • we'll need to get 6 of them for all the affected cloudvirts.
RobH mentioned this in Unknown Object (Task). Oct 28 2020, 8:06 PM
RobH added a parent task: Unknown Object (Task). Oct 28 2020, 8:16 PM
RobH removed RobH as the assignee of this task. Oct 30 2020, 11:59 PM
RobH removed a subscriber: RobH.

Change 643345 had a related patch set uploaded (by Cmjohnson; owner: Cmjohnson):
[operations/puppet@production] update dhcp file with new mac's on cloudvirt102[5-9]|30

https://gerrit.wikimedia.org/r/643345

Change 643345 merged by Cmjohnson:
[operations/puppet@production] update dhcp file with new mac's on cloudvirt102[5-9]|30

https://gerrit.wikimedia.org/r/643345
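
For reference, the shorthand `cloudvirt102[5-9]|30` in the commit title appears to mean the six hosts cloudvirt1025 through cloudvirt1030. A minimal Python sketch of that expansion follows; the helper name `expand_hosts` and the stricter regex are illustrative assumptions, not part of the puppet change.

```python
import re

# Stricter form of the commit-title shorthand (assumed meaning: 1025..1030).
HOST_RE = re.compile(r"cloudvirt10(2[5-9]|30)")


def expand_hosts(prefix: str, first: int, last: int) -> list[str]:
    """Return hostnames prefix+first .. prefix+last, inclusive."""
    return [f"{prefix}{n}" for n in range(first, last + 1)]


hosts = expand_hosts("cloudvirt", 1025, 1030)

# Every expanded name should match the pattern; neighbours should not.
assert all(HOST_RE.fullmatch(h) for h in hosts)
assert HOST_RE.fullmatch("cloudvirt1031") is None

for host in hosts:
    print(host)
```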

RobH claimed this task.

Please note I was requested to take a look at the batch of 10G moves for cloudvirt10[25-30]. All of the MAC address entries in puppet were off by two digits, likely due to the wrong interface being polled for the MAC address (easy mistake to make). I've updated the puppet repo with all of the corrected addresses, and I'm checking off the above checkboxes for the items I've confirmed. Each reimage will be done by cloud-services-team via parent task T216195. I've updated the checklist on T216195 for each host.
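
A rough way to spot this kind of mismatch is to compare the MAC recorded in puppet against the one the host actually presents when it PXE boots and look at the numeric difference. The Python sketch below does that. The function names and the addresses (taken from the documentation range 00:00:5e:00:53:xx) are hypothetical, and it only assumes that a small constant offset suggests a sibling port on the same card was recorded.

```python
def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' (or '-' separated) to an integer."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return int(hex_digits, 16)


def int_to_mac(value: int) -> str:
    """Format an integer as a lowercase, colon-separated MAC address."""
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def mac_offset(recorded: str, observed: str) -> int:
    """Signed difference observed - recorded; 0 means the entry is correct."""
    return mac_to_int(observed) - mac_to_int(recorded)


# Hypothetical example values, not the real cloudvirt addresses.
recorded = "00:00:5e:00:53:12"   # what the DHCP file contained
observed = "00:00:5e:00:53:10"   # what the 10G primary port presents

offset = mac_offset(recorded, observed)
print(f"offset: {offset:+d}")
if offset != 0:
    print(f"DHCP entry should probably be {int_to_mac(mac_to_int(observed))}")
```

A non-zero offset only flags the entry for review; the address to record is still the one read from the interface that will PXE boot.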