hw troubleshooting: Link hard down (probably cable) for cloudcephosd2002-dev.codfw.wmnet
Closed, Resolved · Public · Request

Description

  • - Provide FQDN of system.

cloudcephosd2002-dev.codfw.wmnet

  • - If other than a hard drive issue, please depool the machine (and confirm that it’s been depooled) for us to work on it. If not, please provide time frame for us to take the machine down.

You can take it down anytime (it's still powered on, but feel free to turn it off).

  • - Put system into a failed state in Netbox.
  • - Provide urgency of request, along with justification (redundancy, dependencies, etc)

It's not a very urgent request, as it only affects our testing setup. In the medium term, though, it blocks any changes we want to make to production (e.g. 316544).

The link shows hard down on both sides (host and switch):

root@cloudcephosd2002-dev:~# ip a show dev eno2
3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq state DOWN group default qlen 1000
    link/ether 2c:ea:7f:3f:ed:d4 brd ff:ff:ff:ff:ff:ff
    inet 192.168.4.2/24 scope global eno2
       valid_lft forever preferred_lft forever
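
For anyone who wants to check this in a script rather than by eye, here is a minimal sketch that reads the flag list from the first line of the `ip a show dev <interface>` output and reports whether `NO-CARRIER` is set. The function names `link_flags` and `has_carrier` are hypothetical helpers made up for this illustration, and the sample string is the first line of the output pasted above.

```python
import re


def link_flags(ip_output: str) -> set[str]:
    """Return the flags found between '<' and '>' in `ip a show` output."""
    match = re.search(r"<([^>]*)>", ip_output)
    return set(match.group(1).split(",")) if match else set()


def has_carrier(ip_output: str) -> bool:
    """True unless the NO-CARRIER flag is present.

    NO-CARRIER appears when the interface is administratively up
    but no link is detected on the port.
    """
    return "NO-CARRIER" not in link_flags(ip_output)


# First line of the output pasted in this task.
sample = (
    "3: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 9000 "
    "qdisc mq state DOWN group default qlen 1000"
)

print(has_carrier(sample))  # False
```

This only inspects text that was already captured; it does not query the interface itself.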
  • - Assign correct project tag and appropriate owner (based on above). Also, please ensure the service owners of the host(s) are added as subscribers to provide any additional input.

Event Timeline

@Jhancock.wm can you please comment on what troubleshooting you did today for this link?

Thanks

I found the port down with no link indicator lights. I reseated both ends of the cable, but it did not help. After that I tested with a new patch cable, and it also did not show activity. I then put the original cable back.

@Jhancock.wm if you are on site before me, you could plug one end of a patch cable into ge-1/0/1 and the other end into sretest2001 and see if you get link. If not, we will try another port on the switch.

Thanks

@dcaro it looks like we have a bad interface on asw-b1-codfw (ge-1/0/1), so I switched the server connection to ge-1/0/12. You should be all good now. Let me know if you have any questions.

Interface       Admin Link Description
ge-1/0/12       up    up   cloudcephosd2002-dev

papaul@asw-b-codfw# run show interfaces ge-1/0/1 descriptions
Interface       Admin Link Description
ge-1/0/1        down  down DISABLED

Thanks a lot! The cluster is already working on restoring itself :)