I am opening this task to specifically look at migrating the cloudcephosd* hosts to a single network uplink, from their current dual-link setup. The other cloud hosts have already been moved to a single uplink under T319184.
Current Setup
Cloudcephosd hosts connect to two separate networks. Following previous best practice from the ceph project, they are configured with separate public and cluster networks. The cloudcephosd hosts use their WMF production realm / primary network uplink for the 'public' ceph connectivity, and have a second port connected to a cloud-storage1 vlan which they use for the 'cluster' connectivity.
Issue
The issue with the current setup is that using two ports increases the cost and complexity of the network infrastructure: a single top-of-rack switch has insufficient ports to serve a full rack of cloudcephosd hosts. They are also the only hosts in our DCs with dual connections, which complicates our provisioning/automation and, as things stand, requires manual edits to get things working.
If we examine our ceph nodes, the peak combined usage across each host's 2x10G links does not exceed 10Gb/sec, and those peaks are only occasionally observed; the typical combined usage across both ports is in the region of 1-2Gb/sec. So in terms of bandwidth a single connection will suffice (and there is further headroom where 25G links are available).
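As a rough illustration, the per-port usage figures above can be sanity-checked on any host straight from the kernel's standard sysfs counters. A minimal sketch; the interface name passed in is a placeholder ("lo" below, a real NIC name on an actual host):

```shell
# Sketch: sample one port's receive counter over a short window and
# report the rate. Reads standard Linux sysfs statistics; pure POSIX sh.
throughput() {
  iface="$1"
  window="${2:-5}"
  f="/sys/class/net/$iface/statistics/rx_bytes"
  b1=$(cat "$f")
  sleep "$window"
  b2=$(cat "$f")
  # bytes over the window -> megabits per second
  echo "$iface: $(( (b2 - b1) * 8 / window / 1000000 )) Mb/s rx"
}

throughput lo 1
```

Run the same against both physical ports and sum the results to reproduce the combined-usage numbers.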
Setup/migration
To move from dual network links to a single one we will add a new vlan sub-interface to each host's primary network uplink. This separate logical interface will connect the cluster/storage subnet on each host, instead of the second physical port.
The vlans used for storage are as follows (racks C8 & D5 share the same subnet/vlan due to the way things evolved historically):
| Site | Rack | Vlan ID | Subnet | Vlan Name |
|---|---|---|---|---|
| eqiad | C8 | 1106 | 192.168.4.0/24 | cloud-storage1-eqiad |
| eqiad | D5 | 1106 | 192.168.4.0/24 | cloud-storage1-eqiad |
| eqiad | E4 | 1121 | 192.168.5.0/24 | cloud-storage1-e4-eqiad |
| eqiad | F4 | 1122 | 192.168.6.0/24 | cloud-storage1-f4-eqiad |
| codfw | B1 | 2106 | 192.168.4.0/24 | cloud-storage1-b-codfw |
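As an illustration of the target state, the new sub-interface on a rack C8/D5 host (vlan 1106, 192.168.4.0/24 per the table above) could be created with iproute2 roughly as follows. In practice puppet would render the equivalent config; the uplink name and host address here are placeholders:

```shell
# Hypothetical example for a C8/D5 host; "eno1" and the .15 address
# are placeholders for the host's real uplink and storage IP.
ip link add link eno1 name eno1.1106 type vlan id 1106
ip addr add 192.168.4.15/24 dev eno1.1106
ip link set eno1.1106 up
```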
The IPs configured for the storage network are (afaik) set up in puppet here, which also references the interface the storage IP goes on (currently the second physical link).
We need to discuss and work out the exact way to configure the new interface in puppet, and also how to introduce it gracefully and migrate from the current setup. At a very high level an approach like this might work:
- Ensure the storage vlan is trunked to the primary interface of all the cloudcephosd hosts on the switch side (non-disruptive, netops can do it)
- Create puppet patches to add the new vlan sub-interface for the appropriate vlan id as a child of the main physical interface
- Similar to how the cloud-private is added on other hosts
- Merely creating the interface - with no IPs on it - won't cause any existing traffic paths to change
- Starting with the new hosts we can then change the cluster network 'iface' in hiera from the second physical port to the new vlan interface
- We also need to make sure the aggregate 192.168.0.0/16 route is present (it should be)
- Once all cloudcephosd hosts have the 'iface' for the cluster network set to the vlan sub-interface, we can remove the second physical links
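Once a host is flipped, the steps above can be verified with a couple of quick checks. A sketch, wrapped so a failure is loud; it demonstrates against loopback, and the sub-interface name and peer address in the comment are placeholders for the real per-rack values:

```shell
# Post-flip sanity checks: the vlan sub-interface exists, and the
# kernel has a route covering the cluster subnet.
check_iface() { ip -d link show "$1" > /dev/null; }
check_route() { ip route get "$1" > /dev/null; }

# Demonstrated here against loopback so the sketch is runnable anywhere:
check_iface lo && check_route 127.0.0.1 && echo "checks pass"

# On a migrated C8/D5 host this would be something like:
#   check_iface eno1.1106 && check_route 192.168.4.20
# where 192.168.4.20 is a hypothetical cluster-network peer; the route
# should resolve via the 192.168.0.0/16 aggregate.
```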
Happy to discuss further. We can probably trial the setup/automation on the new high-density ceph hosts (T394333).
Procedures
Logical side
- For each double-nic host:
- Send a puppet change flipping single_iface: true for its configuration
- Merge and apply the puppet change on the affected host
- Verify ceph is happy and stays happy, e.g. ceph health on cloudcontrol1006
- If the host fails to come back on the network after the puppet run:
- Revert the single_iface: true puppet patch and merge
- Use the host console to log in and restore connectivity by configuring the second interface
- Run puppet on the host for clean up actions
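The console-recovery step above can be as simple as bringing the old second port back up by hand until puppet is reverted. A sketch, with placeholder interface name and address (substitute the host's real values):

```shell
# Manual restore of the second physical link from the host console.
# "eno2" and the address are placeholders for the host's own values.
ip link set eno2 up
ip addr add 192.168.4.15/24 dev eno2
# then re-run puppet once the host is reachable again
```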
Physical side
Assuming the above is complete for all hosts (eqiad and codfw) we can proceed with:
- Disable extra ports in netbox, and deploy changes
- Unplug extra network cables
Status
Logical side done
- cloudcephosd1035
- cloudcephosd1036
- cloudcephosd1037
- cloudcephosd1038
- cloudcephosd1039
- cloudcephosd1040
- cloudcephosd1041
- cloudcephosd1042
- cloudcephosd1043
- cloudcephosd1044
- cloudcephosd1045
- cloudcephosd1046
- cloudcephosd1047
- cloudcephosd1048
- cloudcephosd1049
- cloudcephosd1050
- cloudcephosd1051
- cloudcephosd1052
- cloudcephosd2004-dev.codfw.wmnet
- cloudcephosd2005-dev.codfw.wmnet
- cloudcephosd2006-dev.codfw.wmnet
- cloudcephosd2007-dev.codfw.wmnet