10G ports seem not to work on new HP hardware
Closed, Resolved · Public

Description

We have what seems like a common thread across new HP hardware. To my knowledge there are 4 examples where we ordered machines with 10G NICs and the 10G ports do not work:

labvirt1019
labvirt1020
labnet1003
labnet1004

So far we have either fallen back to 1G to make some progress (https://phabricator.wikimedia.org/T193196#4267196) or 1G is legitimately fine. For labnet100[34] at least, we need 10G to function. For labvirt1019 and labvirt1020, 1G would actually be fine, as that has been the labvirt standard to this point. But the mystery remains.

T193196: labnet1003 and labnet1004 moving and enabling 10G NICs
T194964: Connect or troubleshoot eth1 on labvirt1019 and labvirt1020

Event Timeline

chasemp created this task.

@Cmjohnson could you describe a bit what you've tried to get the 10G ports to work?

ping @aborrero who indicated he had seen a similar issue in the past

Are these all the same type of machine? How spread out are the MAC addresses? I saw something like this a number of years ago, and it turned out to be a bad run of Broadcom chips that had then been installed into Dell servers. We were able to identify the bad ones from the nearly-sequential MACs.
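
A minimal sketch of how the MAC spread could be checked, assuming the MACs of the affected NICs have been collected by hand. The function names and the host names and addresses in the usage example are hypothetical placeholders, not values from these machines.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC address like '02:00:00:00:00:10' to an integer."""
    hex_digits = mac.replace(":", "").replace("-", "").lower()
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return int(hex_digits, 16)


def mac_gaps(macs: dict[str, str]) -> list[tuple[str, str, int]]:
    """Sort hosts by MAC and return (host_a, host_b, gap) for each neighbouring pair.

    Small gaps suggest the cards came from the same production run.
    """
    ordered = sorted(macs.items(), key=lambda item: mac_to_int(item[1]))
    return [
        (a_host, b_host, mac_to_int(b_mac) - mac_to_int(a_mac))
        for (a_host, a_mac), (b_host, b_mac) in zip(ordered, ordered[1:])
    ]


if __name__ == "__main__":
    # Placeholder data only: locally administered addresses, made up for illustration.
    example = {
        "host-a": "02:00:00:00:00:10",
        "host-b": "02:00:00:00:00:12",
        "host-c": "02:00:00:00:01:00",
    }
    for a, b, gap in mac_gaps(example):
        print(f"{a} -> {b}: gap {gap}")
```

What counts as "nearly sequential" is a judgment call; the sketch only reports the gaps and leaves the threshold to the reader.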

What are the symptoms?

Furthermore, I noticed PXE being mentioned in one of the tasks. Is the issue just with PXE (e.g. the card not showing up in the boot order) or does it happen under Linux as well?
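
For the Linux half of that question, a rough sketch of a check that reads the kernel's sysfs entries for an interface. It assumes a Linux host exposing /sys/class/net/<iface>/operstate and /sys/class/net/<iface>/speed; the function name link_status is hypothetical, and eth1 is used only because it is the interface named in T194964.

```python
from pathlib import Path


def link_status(iface: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Report whether an interface exists, its operstate, and its link speed in Mb/s."""
    base = Path(sysfs_root) / iface
    if not base.is_dir():
        # The kernel does not see the interface at all (no driver bound, or wrong name).
        return {"present": False, "operstate": None, "speed_mbps": None}

    try:
        operstate = (base / "operstate").read_text().strip()
    except OSError:
        operstate = None

    try:
        # Reading 'speed' can fail or return -1 when the link is down.
        speed = int((base / "speed").read_text().strip())
        if speed < 0:
            speed = None
    except (OSError, ValueError):
        speed = None

    return {"present": True, "operstate": operstate, "speed_mbps": speed}


if __name__ == "__main__":
    print(link_status("eth1"))
```

If the interface is present and reports a link under Linux, the problem is more likely confined to PXE or firmware settings than to the card or driver.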

So for at least labvirt1019 it was indeed about PXE not working (the card worked under Linux) and that was due to a BIOS misconfiguration (the "network boot" option for the card set to disabled). T194964#4283034 has more details and troubleshooting steps.

I haven't verified whether that's the case for labnet1003/1004 or even labvirt1020. I guess we should do that first, before resolving this task?

ping @aborrero who indicated he had seen a similar issue in the past

In the past I had to use non-free drivers to get 10G fiber ports working on HP servers, but I don't remember the specific hardware vendor of the PCI card.

Do you have the output of lspci -v?

Vvjjkkii renamed this task from 10G ports seem not to work on new HP hardware to e2aaaaaaaa. Jul 1 2018, 1:04 AM
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: Aklapper, gerritbot.
elukey renamed this task from e2aaaaaaaa to 10G ports seem not to work on new HP hardware. Jul 2 2018, 6:11 AM
elukey lowered the priority of this task from High to Medium.
elukey updated the task description.
chasemp claimed this task.

So for at least labvirt1019 it was indeed about PXE not working (the card worked under Linux) and that was due to a BIOS misconfiguration (the "network boot" option for the card set to disabled). T194964#4283034 has more details and troubleshooting steps.

I haven't verified whether that's the case for labnet1003/1004 or even labvirt1020. I guess we should do that first, before resolving this task?

It seems as if things are resolved on labnet1003/4 as well. Closing this until we know otherwise :)