Problem
As part of the latest network isolation project, new per-rack VLAN subnets called cloud-private have been allocated. See https://phabricator.wikimedia.org/T324992#8671971.
The data is copied here for reference:
"supernet": 172.20.0.0/16
Vlan Name                 Vlan ID   Subnet
cloud-private-c8-eqiad    1151      172.20.1.0/24
cloud-private-d5-eqiad    1152      172.20.2.0/24
cloud-private-e4-eqiad    1153      172.20.3.0/24
cloud-private-f4-eqiad    1154      172.20.4.0/24
cloud-private-b1-codfw    2151      172.20.5.0/24
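As a quick sanity check on the allocation above, here is a minimal Python sketch using the standard ipaddress module. It confirms that every per-rack subnet sits inside the supernet and maps an address back to its VLAN. The helper name vlan_for and the sample addresses in the usage note are illustrative assumptions, not part of the task data.

```python
import ipaddress

# Values copied from the allocation table above.
SUPERNET = ipaddress.ip_network("172.20.0.0/16")
VLANS = {
    "cloud-private-c8-eqiad": (1151, ipaddress.ip_network("172.20.1.0/24")),
    "cloud-private-d5-eqiad": (1152, ipaddress.ip_network("172.20.2.0/24")),
    "cloud-private-e4-eqiad": (1153, ipaddress.ip_network("172.20.3.0/24")),
    "cloud-private-f4-eqiad": (1154, ipaddress.ip_network("172.20.4.0/24")),
    "cloud-private-b1-codfw": (2151, ipaddress.ip_network("172.20.5.0/24")),
}


def vlan_for(address):
    """Return the VLAN name whose subnet contains `address`, or None.

    Hypothetical helper, for illustration only.
    """
    ip = ipaddress.ip_address(address)
    for name, (_vlan_id, subnet) in VLANS.items():
        if ip in subnet:
            return name
    return None


# Every per-rack subnet should be contained in the supernet.
all_inside = all(subnet.subnet_of(SUPERNET) for _vlan_id, subnet in VLANS.values())
```

For example, vlan_for("172.20.1.10") resolves to cloud-private-c8-eqiad, while a traditional 10.x.y.z address matches no cloud-private VLAN and yields None.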
These new IP addresses will be allocated and assigned per physical hardware host, in parallel to the traditional 10.x.y.z addresses that we know and love for ssh/puppet/management, etc.
The 10.x.y.z addresses use the <datacenter>.wmnet naming, but since these new addresses are considered natively cloud realm (even though not virtual), we won't be using wmnet.
This decision request is to decide on the subdomain to use for them.
Our current "policy" for domain names is at https://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS/Admin/DNS, which should be updated with the results of this decision.
As of today, the policy suggests that the domain we should use is wikimedia.cloud, because it replaced eqiad.wmflabs which was the cloud counterpart to eqiad.wmnet.
Constraints and risks
- Make sure whatever domain we use makes it clear that they are HW servers and not virtual machines.
- These names won't really be exposed to end users/customers, so we have a bit more freedom to pick one option and have second thoughts a couple of years later.
- There is already some precedent in FQDNs such as enwiki.analytics.db.svc.wikimedia.cloud, which use the svc subdomain. That subdomain is not a good fit for this case, since these aren't service IP addresses.
- The chosen subdomain must be hosted by wikiland DNS servers to avoid chicken-and-egg problems (the domain being unavailable because the cloud is down, while the config of some core cloud service relies on the FQDNs for startup).
Decision record
Options
Option 1
Use <dc>.wikimedia.cloud.
Examples:
- cloudcontrol1003.eqiad.wikimedia.cloud
- cloudlb2001-dev.codfw.wikimedia.cloud
Pros:
- simple and straightforward 'mirror' of the <dc>.wmnet scheme.
Cons:
- in some cases may be too similar to VM FQDNs, like whatever.project.eqiad1.wikimedia.cloud.
Option 2
Use <dc>.hw.wikimedia.cloud.
Examples:
- cloudcontrol1003.eqiad.hw.wikimedia.cloud
- cloudlb2001-dev.codfw.hw.wikimedia.cloud
Pros:
- Explicit hw keyword (meaning: hardware), which should make it clear that this is an IP on a hardware server and not on a virtual machine.
Cons:
- Slightly longer to type.
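To illustrate how the explicit hw label could help tooling tell hardware names apart from VM names, here is a minimal Python sketch. The function name is_hardware_fqdn is hypothetical, and the check assumes the exact <host>.<dc>.hw.wikimedia.cloud shape proposed in this option.

```python
def is_hardware_fqdn(fqdn):
    """True if `fqdn` looks like <host>.<dc>.hw.wikimedia.cloud (Option 2).

    Hypothetical helper: it only checks the label layout, not whether
    the host or datacenter actually exists.
    """
    labels = fqdn.rstrip(".").lower().split(".")
    return len(labels) == 5 and labels[2:] == ["hw", "wikimedia", "cloud"]


hw_name = is_hardware_fqdn("cloudcontrol1003.eqiad.hw.wikimedia.cloud")
vm_name = is_hardware_fqdn("whatever.project.eqiad1.wikimedia.cloud")
```

Under these assumptions the first name is classified as hardware and the VM-style name is not, with no need to know the list of datacenters or deployments.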
Option 3
Use <vlan>.wikimedia.cloud
Examples:
- cloudcontrol1003.cloud-private-c8-eqiad.wikimedia.cloud
- cloudlb2001-dev.cloud-private-b1-codfw.wikimedia.cloud
Pros:
- Extra clear what this is about, as it explicitly hardcodes the DC, the rack and the VLAN name.
Cons:
- Long and complex to type.
- If a host is relocated into a different rack, its FQDN will need to be updated, making FQDNs less stable over time than with the other options.
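The relocation concern can be sketched in a few lines of Python. The helper option3_fqdn is hypothetical, and the move from rack c8 to rack d5 is an invented scenario used only to show the effect.

```python
def option3_fqdn(host, vlan):
    """Build an Option 3 name: <host>.<vlan>.wikimedia.cloud (hypothetical helper)."""
    return f"{host}.{vlan}.wikimedia.cloud"


# Invented scenario: the same host before and after moving from rack c8 to d5.
before_move = option3_fqdn("cloudcontrol1003", "cloud-private-c8-eqiad")
after_move = option3_fqdn("cloudcontrol1003", "cloud-private-d5-eqiad")
fqdn_changed = before_move != after_move
```

Because the rack is part of the VLAN name, the two names differ, so any config that references the old FQDN would need updating after such a move.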
Option 3bis
Use <vlan-shortname>.<dc>.wikimedia.cloud
Examples:
- cloudcontrol1003.private.eqiad.wikimedia.cloud
- cloudlb2001-dev.private.codfw.wikimedia.cloud
Pros:
- Extra clear what this is about, as it explicitly hardcodes the DC and the VLAN [short] name.
Cons:
- none!
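A minimal Python sketch of how an Option 3bis name could be derived from a full VLAN name follows. It assumes every VLAN name has the four hyphen-separated parts seen in the allocation (cloud, short name, rack, DC); the helper option3bis_fqdn is hypothetical.

```python
def option3bis_fqdn(host, vlan):
    """Build <host>.<vlan-shortname>.<dc>.wikimedia.cloud from a full VLAN name.

    Assumption: VLAN names look like "cloud-private-c8-eqiad", i.e.
    realm, short name, rack and datacenter separated by hyphens.
    """
    _realm, shortname, _rack, dc = vlan.split("-")
    return f"{host}.{shortname}.{dc}.wikimedia.cloud"


name_c8 = option3bis_fqdn("cloudcontrol1003", "cloud-private-c8-eqiad")
name_d5 = option3bis_fqdn("cloudcontrol1003", "cloud-private-d5-eqiad")
```

Since the rack is dropped from the name, both calls yield cloudcontrol1003.private.eqiad.wikimedia.cloud, so under this assumption a rack move within the same DC would leave the FQDN unchanged.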