
Remove second network connection for cloudcephosd hosts with single uplink
Closed, Resolved (Public)

Description

As per title, we need to remove extra interfaces from netbox for the following hosts and run homer:

cloudcephosd1035
cloudcephosd1036
cloudcephosd1037
cloudcephosd1038
cloudcephosd1039
cloudcephosd1040
cloudcephosd1041
cloudcephosd1042
cloudcephosd1043
cloudcephosd1044
cloudcephosd1045
cloudcephosd1046
cloudcephosd1047
cloudcephosd1048
cloudcephosd1049
cloudcephosd1050
cloudcephosd1051
cloudcephosd1052
cloudcephosd2004-dev.codfw.wmnet
cloudcephosd2005-dev.codfw.wmnet
cloudcephosd2006-dev.codfw.wmnet
cloudcephosd2007-dev.codfw.wmnet

Event Timeline

Restricted Application added a subscriber: Aklapper.

@fgiunchedi have all the cables been removed on site?

Typically I would ask DC-Ops to remove in Netbox when they remove on site, to minimize the length of time there is any discrepancy between Netbox and reality.

cmooney claimed this task.

Regarding "and run homer": if the second port being in an "up" state on the switch is a problem, what we can do is _disable_ the interface in Netbox, but leave the cable connected there until we have it removed on site.

I see, thank you! I was not aware of the procedure, and it makes sense.

To answer your question: no cables have been touched yet, and I'm good with whichever SOP is in place for this type of operation. What do you recommend happens next?

Yeah the main thing I'm concerned about is that the cabling in Netbox reflects whatever is on site as much as possible.

Tbh we are sort-of outside normal process here, as so few hosts have second ports.

Thinking it through, this is probably best:

  1. We disable the switch interfaces terminating these second ports now
  2. We re-run the PuppetDB import script for these hosts to update their Netbox interface based on current status
  3. We ask DC-Ops to remove all the cables on site, deleting each one in Netbox as they go

I can take care of 1 and 2 I think, then we can ask dc-ops to do the rest.

cmooney renamed this task from Remove extra netbox interfaces for cloudcephosd hosts with single uplink to Remove second network connection for cloudcephosd hosts with single uplink. Nov 25 2025, 1:18 PM

Sounds great -- thank you very much !

Mentioned in SAL (#wikimedia-operations) [2025-12-03T16:15:09Z] <topranks> disabling unused former cloudcephosd hosts on cloud switches T410989

cmooney edited projects, added SRE, ops-eqiad, ops-codfw; removed Infrastructure-Foundations, netops.

OK, I've now disabled all the unused ports on the cloud switches. The one exception is cloudcephosd1052. I'm not sure what is up with this one, but it seems to have the vlan interface added while still having the physical link configured and in use, so I didn't want to touch it:

cmooney@cloudcephosd1052:~$ ip -4 -br addr show | grep -v DOWN
lo               UNKNOWN        127.0.0.1/8 
ens1f0np0        UP             10.64.148.31/24 
ens1f1np1        UP             192.168.5.14/24 
vlan1121@ens1f0np0 UP             192.168.5.14/24
cmooney@cloudcephosd1052:~$ ip route get fibmatch 192.168.5.1 
192.168.5.0/24 dev ens1f1np1 proto kernel scope link src 192.168.5.14
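
The anomaly in the output above is that 192.168.5.14/24 is configured on both ens1f1np1 and vlan1121@ens1f0np0. A minimal sketch of how the same check could be scripted across hosts, assuming the brief `ip -4 -br addr show` output format shown here; the function name `duplicate_addresses` is made up for illustration and is not part of any existing tooling:

```python
from collections import defaultdict


def duplicate_addresses(ip_br_output: str) -> dict[str, list[str]]:
    """Return IPv4 addresses that appear on more than one non-DOWN interface.

    Expects text in the `ip -4 -br addr show` layout:
    <interface> <state> <address> [<address> ...]
    """
    by_addr = defaultdict(list)
    for line in ip_br_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank line or interface without an address
        ifname, state, *addrs = fields
        if state == "DOWN":
            continue  # mirrors the `grep -v DOWN` filter used above
        for addr in addrs:
            by_addr[addr].append(ifname)
    return {addr: ifs for addr, ifs in by_addr.items() if len(ifs) > 1}


# Output copied from cloudcephosd1052 as pasted in this task.
SAMPLE = """\
lo               UNKNOWN        127.0.0.1/8
ens1f0np0        UP             10.64.148.31/24
ens1f1np1        UP             192.168.5.14/24
vlan1121@ens1f0np0 UP             192.168.5.14/24
"""

print(duplicate_addresses(SAMPLE))
# {'192.168.5.14/24': ['ens1f1np1', 'vlan1121@ens1f0np0']}
```

An empty result would mean no address is shared between interfaces, which is what one would expect once a host has fully moved to the single-uplink setup.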

DC-Ops folks, we can now remove these superfluous cables from the racks and, once they are removed, delete each cable in Netbox too.

This is a list of all of them:

eqiad

cloudcephosd1035: https://netbox.wikimedia.org/dcim/cables/8315/
cloudcephosd1042: https://netbox.wikimedia.org/dcim/cables/10076/
cloudcephosd1043: https://netbox.wikimedia.org/dcim/cables/10078/

cloudcephosd1036: https://netbox.wikimedia.org/dcim/cables/8316/
cloudcephosd1044: https://netbox.wikimedia.org/dcim/cables/10109/
cloudcephosd1045: https://netbox.wikimedia.org/dcim/cables/10114/
cloudcephosd1046: https://netbox.wikimedia.org/dcim/cables/10116/

cloudcephosd1037: https://netbox.wikimedia.org/dcim/cables/8314/
cloudcephosd1038: https://netbox.wikimedia.org/dcim/cables/8312/
cloudcephosd1047: https://netbox.wikimedia.org/dcim/cables/10118/
cloudcephosd1050: https://netbox.wikimedia.org/dcim/cables/9968/
cloudcephosd1051: https://netbox.wikimedia.org/dcim/cables/9967/

cloudcephosd1039: https://netbox.wikimedia.org/dcim/cables/8178/
cloudcephosd1040: https://netbox.wikimedia.org/dcim/cables/8179/
cloudcephosd1041: https://netbox.wikimedia.org/dcim/cables/8180/
cloudcephosd1048: https://netbox.wikimedia.org/dcim/cables/10177/
cloudcephosd1049: https://netbox.wikimedia.org/dcim/cables/10119/

codfw

cloudcephosd2004-dev: https://netbox.wikimedia.org/dcim/cables/9933/
cloudcephosd2005-dev: https://netbox.wikimedia.org/dcim/cables/9785/
cloudcephosd2006-dev: https://netbox.wikimedia.org/dcim/cables/9809/
cloudcephosd2007-dev: https://netbox.wikimedia.org/dcim/cables/9801/

The four servers in codfw have had their cables physically removed and deleted in Netbox.

Super, thanks @Jhancock.wm !!

These cables at eqiad have been physically removed and deleted in netbox.

cmooney reopened this task as Open. Dec 4 2025, 11:57 AM
cmooney removed a project: ops-eqiad.

Thanks @VRiley-WMF. I'm gonna re-open this as we still have to deal with cloudcephosd1052.

Also can you double-check all the cables are deleted? Clicking on a few random ones I still see some there, for instance https://netbox.wikimedia.org/dcim/cables/10118/
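
Rather than clicking through cables one by one, the double-check could be scripted. A rough sketch only: it pulls the cable IDs out of a listing like the one posted earlier and reports which ones a lookup still finds. The `lookup` callable is an assumption; it must return `None` for a cable that no longer exists. I believe pynetbox's `nb.dcim.cables.get(<id>)` behaves that way for a missing ID, but that is untested here, so a stand-in dictionary is used instead and does not reflect the real Netbox state.

```python
import re
from typing import Callable, Iterable, Optional

# Matches the numeric ID in URLs such as .../dcim/cables/10118/
CABLE_URL = re.compile(r"/dcim/cables/(\d+)/")


def cable_ids(text: str) -> list[int]:
    """Extract cable IDs, in order, from text containing Netbox cable URLs."""
    return [int(match) for match in CABLE_URL.findall(text)]


def still_present(ids: Iterable[int],
                  lookup: Callable[[int], Optional[object]]) -> list[int]:
    """Return the IDs for which `lookup` still finds a cable."""
    return [cable_id for cable_id in ids if lookup(cable_id) is not None]


# Two lines copied from the eqiad list in this task.
LISTING = """\
cloudcephosd1047: https://netbox.wikimedia.org/dcim/cables/10118/
cloudcephosd1050: https://netbox.wikimedia.org/dcim/cables/9968/
"""

# Illustrative stand-in for Netbox: pretend only cable 10118 still exists.
fake_netbox = {10118: "cable object"}

print(still_present(cable_ids(LISTING), fake_netbox.get))
# [10118]
```

Keeping the lookup injectable means the parsing and filtering can be tried offline before pointing it at the real API.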

JFYI we can now proceed with cloudcephosd1052 too

Adding back ops-eqiad for visibility

Thanks for this. I have unplugged the secondary cable for cloudcephosd1052. I have also gone through the cables and deleted them from Netbox.