
rearrange networking for cloudceph200[1-3]-dev and rename
Closed, ResolvedPublic

Description

Event Timeline

Andrew added a subscriber: Papaul.

@Papaul, are these things you can help with or do we need Arzhel? (I can do most of the host renaming of course).

Change 642408 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Rename cloudceph200x-dev to cloudcephosd200x-dev

https://gerrit.wikimedia.org/r/642408

Andrew updated the task description.

@Papaul, if you want to make the netbox/network changes, I can do the actual re-imaging. There's nothing happening on these boxes currently so you can break them whenever :)

Andrew renamed this task from rearrange networking for cloudceph200[1-3]-dev to rearrange networking for cloudceph200[1-3]-dev and rename. Nov 23 2020, 5:32 PM
Andrew triaged this task as Medium priority.

eth0 was in public1-b-codfw
eth1 was in private1-b-codfw

As for the Netbox side of things, I think this procedure should get us to the desired state (experimental, never tried):

  1. From https://netbox.wikimedia.org/dcim/devices/2637/ delete the IPv4 and IPv6 currently assigned
  2. From https://netbox.wikimedia.org/ipam/prefixes/147/ip-addresses/ add an IP Address and set the DNS Name
  3. From https://netbox.wikimedia.org/ipam/prefixes/226/ip-addresses/ add an IP Address with the mapped version of the IPv4 created in step (2) and set the DNS Name (unless the service doesn't support v6)
  4. From https://netbox.wikimedia.org/dcim/interfaces/9637/ rename eno1 to ##PRIMARY##
  5. From https://netbox.wikimedia.org/dcim/devices/2637/ click on the green plus button on the ##PRIMARY## line and attach the IPs created in steps (2) and (3)
  6. From https://netbox.wikimedia.org/dcim/devices/2637/ mark the two IPs as primary
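
Step (3) asks for the "mapped version" of the IPv4 address. A minimal sketch of one way to compute it, assuming the convention is to reuse the four decimal IPv4 octets as the last four groups of an address inside the /64 prefix. That convention, the function name `mapped_ipv6`, and the documentation-range addresses in the example are all assumptions for illustration, not values taken from this task.

```python
import ipaddress


def mapped_ipv6(ipv4: str, prefix64: str) -> ipaddress.IPv6Address:
    """Embed the decimal octets of an IPv4 address in the host part of a /64.

    Assumed convention: 192.0.2.10 inside 2001:db8:0:1::/64 becomes
    2001:db8:0:1:192:0:2:10 (each octet is written as the digits of a group).
    """
    v4 = ipaddress.IPv4Address(ipv4)
    net = ipaddress.IPv6Network(prefix64)
    if net.prefixlen != 64:
        raise ValueError("expected a /64 prefix")

    host = 0
    for octet in v4.packed:
        # Reinterpret the decimal digits as hex digits, e.g. 192 -> 0x0192.
        host = (host << 16) | int(str(octet), 16)

    return ipaddress.IPv6Address(int(net.network_address) | host)
```

For example, `mapped_ipv6("192.0.2.10", "2001:db8:0:1::/64")` yields the address whose last four groups spell out the IPv4 octets.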

As for the secondary IP, it is on a VLAN whose prefixes are not managed by Netbox, so it should not appear there at all.

  1. edit the switch ports to match the new access vlans (eg. https://netbox.wikimedia.org/dcim/interfaces/16057/edit/ -> Untagged VLAN -> cloud-hosts-b-codfw) then https://netbox.wikimedia.org/dcim/interfaces/17740/ cloud-storage

If you do step 2 before step 5 it will not work; you get the error "interface exist already". So you don't have to do step 2. The final steps are:

  1. From https://netbox.wikimedia.org/dcim/devices/2637/ delete the IPv4 and IPv6 currently assigned
  2. From https://netbox.wikimedia.org/dcim/interfaces/9637/ rename eno1 to PRIMARY
  3. From https://netbox.wikimedia.org/dcim/devices/2637/ click on the green plus button on the PRIMARY line and attach the IPs
  4. From https://netbox.wikimedia.org/dcim/devices/2637/ mark the two IPs as primary
  5. Edit the switch ports to match the new access vlans (eg. https://netbox.wikimedia.org/dcim/interfaces/16057/edit/ -> Untagged VLAN -> cloud-hosts-b-codfw) then https://netbox.wikimedia.org/dcim/interfaces/17740/ cloud-storage

@ayounsi there is no interface range for VLAN cloud-storage-b-codfw in row b. Do you want me to create it, or just assign the second interface of those servers directly to the VLAN?

As soon as you update Netbox with the proper vlan on the switch interface ("https://netbox.wikimedia.org/dcim/interfaces/17740/ cloud-storage"), homer will configure it automatically.

I realize now that we should have decomm'ed them because of the rename, also to clear puppetdb/debmonitor/icinga. But if I do that now it will clear the already-assigned IPs, so I will instead manually remove them from puppetdb/debmonitor, which will in turn remove them from icinga too.

[edit interfaces]
    interface-range vlan-cloud-hosts1-b-codfw { ... }
+   interface-range vlan-cloud-storage1-b-codfw {
+       member ge-1/0/5;
+       mtu 9192;
+       unit 0 {
+           family ethernet-switching {
+               interface-mode access;
+               vlan {
+                   members cloud-storage1-b-codfw;
+               }
+           }
+       }
+   }
    interface-range vlan-private1-a-codfw { ... }
[edit interfaces interface-range vlan-private1-b-codfw]
-    member ge-1/0/5;
[edit interfaces interface-range vlan-cloud-storage1-b-codfw]
     member ge-1/0/5 { ... }
+    member ge-5/0/10;
+    member ge-8/0/11;
[edit interfaces interface-range vlan-private1-b-codfw]
-    member ge-5/0/10;
-    member ge-8/0/11;
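
The diff above moves three switch ports out of one interface-range and into another. A minimal sketch of how such a diff could be summarized per port, assuming the text has the same shape as the output pasted here. The function name `port_moves` is hypothetical and the parser only understands `[edit interfaces interface-range NAME]` headers, `interface-range NAME {` openers and `member PORT;` lines.

```python
import re


def port_moves(diff_text: str) -> dict:
    """Map each added port to (old_range, new_range); old_range may be None."""
    removed, added = {}, {}
    current = None  # interface-range the following member lines belong to

    for raw in diff_text.splitlines():
        line = raw.strip()
        header = re.match(r"\[edit interfaces interface-range (\S+)\]", line)
        if header:
            current = header.group(1)
            continue
        if line.startswith("[edit"):
            current = None  # some other hierarchy level
            continue

        body = line.lstrip("+- ").strip()
        opened = re.match(r"interface-range (\S+) \{$", body)
        if opened:
            current = opened.group(1)  # range created inside this hunk
            continue

        member = re.match(r"member (\S+);$", body)
        if member and current and line[0] in "+-":
            target = added if line[0] == "+" else removed
            target[member.group(1)] = current

    return {port: (removed.get(port), new) for port, new in added.items()}
```

Unchanged context lines such as `member ge-1/0/5 { ... }` are ignored because they carry no `+` or `-` marker.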

@Andrew network and DNS setup is complete; the server is ready for reimage.

Change 642408 merged by Andrew Bogott:
[operations/puppet@production] Rename cloudceph200x-dev to cloudcephosd200x-dev

https://gerrit.wikimedia.org/r/642408

Change 643275 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Move cloudcephosd2xxx-dev from .wikimedia.org to .codfw.wmnet

https://gerrit.wikimedia.org/r/643275

Change 643275 merged by Andrew Bogott:
[operations/puppet@production] Move cloudcephosd2xxx-dev from .wikimedia.org to .codfw.wmnet

https://gerrit.wikimedia.org/r/643275

I'm a bit lost in the backscroll :) Did the second NICs get attached and assigned for all of these? @dcaro points out that the kernel thinks they are disconnected:

root@cloudcephosd2001-dev:~# grep "" /sys/class/net/eno*/carrier
/sys/class/net/eno1/carrier:1
grep: /sys/class/net/eno2/carrier: Invalid argument
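
A note on reading that output: on Linux, reading `carrier` fails with "Invalid argument" (EINVAL) when the interface is administratively down, so the error by itself does not say whether the cable or switch port is live. A minimal sketch of the same check that separates the two cases; the function name `carrier_state`, its return labels, and the `sysfs_root` parameter are illustrative assumptions.

```python
import errno
from pathlib import Path


def carrier_state(iface: str, sysfs_root: str = "/sys/class/net") -> str:
    """Classify an interface as 'link-up', 'no-link' or 'admin-down'."""
    path = Path(sysfs_root) / iface / "carrier"
    try:
        value = path.read_text().strip()
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # The kernel refuses to report carrier for an interface that is
            # not up, so the physical link state is unknown at this point.
            return "admin-down"
        raise  # missing interface, permissions, etc.
    return "link-up" if value == "1" else "no-link"
```

With the output above, `eno1` would be reported as `link-up` and `eno2` as `admin-down`, which suggests bringing the interface up on the host before concluding that the switch side is misconfigured.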

papaul@asw-b-codfw> show interfaces descriptions | match cloudcephosd
ge-1/0/4        up    up   cloudcephosd2001-dev:eno1 {#}
ge-1/0/5        up    up   cloudcephosd2001-dev:eno2 {#}
ge-5/0/8        up    up   cloudcephosd2002-dev:##PRIMARY## {#}
ge-5/0/10       up    up   cloudcephosd2002-dev:eno2 {#}
ge-8/0/10       up    up   cloudcephosd2003-dev:eno1 {#}
ge-8/0/11       up    up   cloudcephosd2003-dev:eno2 {#}
papaul@asw-b-codfw> show ethernet-switching interface ge-1/0/5.0
Routing Instance Name : default-switch
Logical Interface flags (DL - disable learning, AD - packet action drop,
                         LH - MAC limit hit, DN - interface down,
                         SCTL - shutdown by Storm-control,
                         MMAS - Mac-move action shutdown, AS - Autostate-exclude enabled)

Logical          Vlan                     TAG     MAC         STP         Logical           Tagging
interface        members                          limit       state       interface flags
ge-1/0/5.0                                        294912                                    untagged
                 cloud-storage1-b-codfw   2106    294912      Forwarding                    untagged

This is working. Thank you @Papaul!