Phase out use of .wmflabs tld
Closed, Resolved | Public

Assigned To
Authored By
Andrew
Aug 17 2020, 8:45 PM

Description

Proposed timeline: Do all of the above on September 8, 2020.

Details

Repo                                    Branch      Lines +/-
operations/puppet                       production  +1 -1
operations/puppet                       production  +1 -1
operations/puppet                       production  +2 -2
operations/puppet                       production  +23 -23
operations/puppet                       production  +1 -1
operations/puppet                       production  +13 -6
operations/puppet                       production  +23 -23
operations/puppet                       production  +13 -6
operations/puppet                       production  +8 -0
operations/puppet                       production  +3 -2
operations/puppet                       production  +11 -11
operations/puppet                       production  +2 -2
operations/puppet                       production  +2 -2
cloud/instance-puppet                   master      +0 -114
operations/puppet                       production  +2 -2
openstack/horizon/deploy                train       +1 -1
openstack/horizon/wmf-puppet-dashboard  train       +1 -1
openstack/horizon/wmf-puppet-dashboard  master      +1 -1
operations/puppet                       production  +4 -4
cloud/instance-puppet                   master      +0 -347
operations/puppet                       production  +13 -0
operations/puppet                       production  +2 -2
operations/puppet                       production  +6 -3
operations/puppet                       production  +3 -3

Event Timeline

Andrew triaged this task as Medium priority. Edited Aug 18 2020, 2:46 AM
Andrew updated the task description.

Suggested notification email:

Currently cloud-vps stands astride two worlds: wmflabs and wikimedia.cloud. Here's the status quo:

  • New VMs get three different DNS entries: hostname.project.eqiad1.wikimedia.cloud, hostname.project.eqiad.wmflabs, and hostname.eqiad.wmflabs[0]
  • Reverse DNS lookups return hostnames under eqiad1.wikimedia.cloud
  • VMs themselves believe (e.g. via hostname -f) that they're still under eqiad.wmflabs

That hybrid system has done a good job maintaining backwards compatibility, but it's a bit of a mess. In the interest of simplifying, standardizing, and eliminating ever more uses of the term 'labs', we're going to start phasing out the wmflabs domain name. Beginning on September 8th, new VMs will no longer receive any naming associated with .wmflabs.

  • New VMs will get one DNS entry: hostname.project.eqiad1.wikimedia.cloud
  • New VMs will continue to have a pointer DNS entry that refers to the .wikimedia.cloud name
  • New VMs will be assigned an internal hostname under .wikimedia.cloud

In order to avoid breaking existing systems, these changes will NOT be applied retroactively to existing VMs. Old DNS entries will live on until the VM is deleted and should be largely harmless. If, however, you find yourself rewriting code in order to deal with VMs under both domains (due to the change in hostname -f behavior), don't worry -- adjusting an old VM to identify as part of .wikimedia.cloud only requires a simple change to /etc/hosts. I'll be available to make that change for any project that chooses consistency over backwards-compatibility.

[0] https://phabricator.wikimedia.org/phame/post/view/191/new_names_for_everyone/
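For code that has to handle VMs under both naming schemes, the mapping described in the email can be sketched roughly as follows. This is only an illustration, not anything deployed: the function name to_wikimedia_cloud is hypothetical, and it assumes legacy names always end in .eqiad.wmflabs with either hostname.project or a bare hostname in front. The bare form carries no project, so the caller has to supply one.

```python
LEGACY_SUFFIX = ".eqiad.wmflabs"
NEW_SUFFIX = ".eqiad1.wikimedia.cloud"


def to_wikimedia_cloud(fqdn, project=None):
    """Map a legacy .eqiad.wmflabs name to its .eqiad1.wikimedia.cloud form.

    Names outside the legacy suffix (including anything under wmflabs.org)
    are returned unchanged.
    """
    if not fqdn.endswith(LEGACY_SUFFIX):
        return fqdn
    stem = fqdn[: -len(LEGACY_SUFFIX)]
    if "." not in stem:
        # Legacy short form: hostname.eqiad.wmflabs has no project component,
        # so the project must come from the caller (an assumption of this sketch).
        if project is None:
            raise ValueError(f"project needed to rewrite {fqdn!r}")
        stem = f"{stem}.{project}"
    return stem + NEW_SUFFIX
```

Normalizing names once at the edge of a tool, instead of branching on the domain throughout, keeps the rest of the code unaware of which scheme a VM was created under.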

Change 620936 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Nova/Neutron: set dhcp_domain and tld to eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/620936

Change 620937 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] designate: stop creating 'legacy' entries (that is, things under wmflabs)

https://gerrit.wikimedia.org/r/620937

Does puppet compiler know about the new domain or should we create a subtask for that?

we probably need a subtask.

Attached email has now been sent to cloud-announce

?

I was just doing that because the proposed timeline is today

Whoops, sorry. I'll remember that for next time

I believe the attached email says September 8th, which is a week from now.

Oh right. I was just going off the task description

Change 624772 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs nova fullstack test: expect new VMs under .eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/624772

Change 624772 merged by Andrew Bogott:
[operations/puppet@production] wmcs nova fullstack test: expect new VMs under .eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/624772

Change 620937 merged by Andrew Bogott:
[operations/puppet@production] designate: stop creating 'legacy' entries (that is, things under wmflabs)

https://gerrit.wikimedia.org/r/620937

Change 620936 merged by Andrew Bogott:
[operations/puppet@production] Nova/Neutron: set dhcp_domain to eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/620936

Change 625969 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labspuppetbackend: support requests for VMs/prefixes under .wmflabs

https://gerrit.wikimedia.org/r/625969

Change 625976 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[cloud/instance-puppet@master] Huge rename of node definitions

https://gerrit.wikimedia.org/r/625976

Change 625969 merged by Andrew Bogott:
[operations/puppet@production] labspuppetbackend: support requests for VMs/prefixes under .wmflabs

https://gerrit.wikimedia.org/r/625969

Change 625976 merged by Andrew Bogott:
[cloud/instance-puppet@master] Huge rename of node definitions

https://gerrit.wikimedia.org/r/625976

Change 625980 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labspuppetbackend: fix support for VMs/prefixes under .wmflabs

https://gerrit.wikimedia.org/r/625980

Change 625980 merged by Andrew Bogott:
[operations/puppet@production] labspuppetbackend: fix support for VMs/prefixes under .wmflabs

https://gerrit.wikimedia.org/r/625980

Mentioned in SAL (#wikimedia-cloud) [2020-09-08T21:48:11Z] <bd808> Renamed FQDN prefixes to wikimedia.cloud scheme in cloudinfra-db01's labspuppet db (T260614)

This looked something like:

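-- Snapshot the prefix table before changing it.
MariaDB [labspuppet]> CREATE TABLE prefix_20200908 SELECT * FROM prefix;
-- Rewrite prefixes that end in the legacy suffix to the new scheme.
MariaDB [labspuppet]> UPDATE prefix SET prefix = REPLACE(prefix, '.eqiad.wmflabs', '.eqiad1.wikimedia.cloud') WHERE prefix LIKE '%.eqiad.wmflabs';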

There was a bit more to it, as one instance had duplicate records under both the old and new FQDN schemes. @Andrew verified that they were the same and I deleted the legacy prefix and its data across all the tables.
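The backup-then-rename step, including the duplicate case, can be tried out safely against a throwaway database. The sketch below uses an in-memory SQLite database instead of MariaDB, and assumes only that the table has a text column named prefix (the real labspuppet schema is not shown in this task). The function name rename_prefixes is hypothetical. Rows whose renamed form already exists are reported and left alone for manual review.

```python
import sqlite3

OLD = ".eqiad.wmflabs"
NEW = ".eqiad1.wikimedia.cloud"


def rename_prefixes(conn):
    """Snapshot the prefix table, rename legacy rows, return skipped duplicates."""
    cur = conn.cursor()
    # Legacy prefixes whose renamed form is already present would collide.
    collisions = [
        row[0]
        for row in cur.execute(
            "SELECT legacy.prefix FROM prefix AS legacy "
            "JOIN prefix AS renamed "
            "ON renamed.prefix = REPLACE(legacy.prefix, ?, ?) "
            "WHERE legacy.prefix LIKE ? ORDER BY legacy.prefix",
            (OLD, NEW, "%" + OLD),
        )
    ]
    # SQLite spells the snapshot with AS; the name mirrors the one used above.
    cur.execute("CREATE TABLE prefix_20200908 AS SELECT * FROM prefix")
    # Rename only rows whose new name was not already in the snapshot.
    cur.execute(
        "UPDATE prefix SET prefix = REPLACE(prefix, ?, ?) "
        "WHERE prefix LIKE ? "
        "AND REPLACE(prefix, ?, ?) NOT IN (SELECT prefix FROM prefix_20200908)",
        (OLD, NEW, "%" + OLD, OLD, NEW),
    )
    return collisions
```

Checking for collisions before the UPDATE makes the duplicate-record situation visible up front, so the legacy copy can be compared and deleted deliberately.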

Change 625983 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[openstack/horizon/wmf-puppet-dashboard@master] Update default VM prefix to eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/625983

Change 625983 merged by Andrew Bogott:
[openstack/horizon/wmf-puppet-dashboard@master] Update default VM prefix to eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/625983

Change 625985 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[openstack/horizon/wmf-puppet-dashboard@train] Update default VM prefix to eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/625985

Change 625985 merged by Andrew Bogott:
[openstack/horizon/wmf-puppet-dashboard@train] Update default VM prefix to eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/625985

Change 625986 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[openstack/horizon/deploy@train] Update wmf-puppet-dashboard submodule

https://gerrit.wikimedia.org/r/625986

Change 625986 merged by Andrew Bogott:
[openstack/horizon/deploy@train] Update wmf-puppet-dashboard submodule

https://gerrit.wikimedia.org/r/625986

Change 625992 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] labspuppetbackend: rearrange args to re.sub

https://gerrit.wikimedia.org/r/625992

Change 625992 merged by Andrew Bogott:
[operations/puppet@production] labspuppetbackend: rearrange args to re.sub

https://gerrit.wikimedia.org/r/625992

Hello. Does this affect email aliases such as $ldap_shell_name@tools.wmflabs.org and/or aliases to tools and maintainers? If so, which TLD is being used now? Thanks.

No, this only affects .wmflabs, not wmflabs.org.

Change 626207 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[cloud/instance-puppet@master] Further rename of node definitions

https://gerrit.wikimedia.org/r/626207

Change 626207 merged by Andrew Bogott:
[cloud/instance-puppet@master] Further rename of node definitions

https://gerrit.wikimedia.org/r/626207

Change 626213 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs-k8s-node-upgrade.py: minor usage edit

https://gerrit.wikimedia.org/r/626213

Change 626214 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs-package-build.py: update default hosts to use .wikimedia.cloud

https://gerrit.wikimedia.org/r/626214

Change 626215 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Hiera: replace some commented refs to .eqiad.wmflabs

https://gerrit.wikimedia.org/r/626215

Change 626217 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] designate.conf: update comments

https://gerrit.wikimedia.org/r/626217

Change 626213 merged by Andrew Bogott:
[operations/puppet@production] wmcs-k8s-node-upgrade.py: minor usage edit

https://gerrit.wikimedia.org/r/626213

Change 626214 merged by Andrew Bogott:
[operations/puppet@production] wmcs-package-build.py: update default hosts to use .wikimedia.cloud

https://gerrit.wikimedia.org/r/626214

Change 626215 merged by Andrew Bogott:
[operations/puppet@production] Hiera: replace some commented refs to .eqiad.wmflabs

https://gerrit.wikimedia.org/r/626215

Change 626217 merged by Andrew Bogott:
[operations/puppet@production] designate.conf: update comments

https://gerrit.wikimedia.org/r/626217

Change 626221 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wikireplica_dns.yaml: add .eqiad1.wikimedia.cloud cnames

https://gerrit.wikimedia.org/r/626221

Change 626221 abandoned by Andrew Bogott:
[operations/puppet@production] wikireplica_dns.yaml: add .eqiad1.wikimedia.cloud cnames

Reason:
We may use actual new backend hosts when we adopt the .eqiad1.wikimedia.cloud standard.

https://gerrit.wikimedia.org/r/626221

Not sure if @Andrew is aware of this issue, but posting here for the record anyway.

We will need to tune the /etc/resolv.conf files for all (or some) VMs. Here is the thing I discovered.

  • I introduced new k8s worker nodes with the new FQDN: tools-k8s-ingress-1.tools.eqiad1.wikimedia.cloud and tools-k8s-ingress-2.tools.eqiad1.wikimedia.cloud.
  • tools-prometheus-04 autogenerates the list of nodes to query from the k8s API, and tries to contact them using only the short hostname, i.e. tools-k8s-ingress-1.
  • tools-prometheus-04 only has search tools.eqiad.wmflabs eqiad.wmflabs in /etc/resolv.conf, and therefore searches for tools-k8s-ingress-1.tools.eqiad.wmflabs, which doesn't exist.
  • this is controlled in puppet by modules/base/manifests/resolving.pp, but the puppet code looks ugly and might be worth refactoring a bit?
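The failure mode in the bullets above can be reproduced without touching DNS. The sketch below mimics how a stub resolver expands a short hostname using the search line from /etc/resolv.conf. The helper names parse_search, candidates and resolve_short are hypothetical, the set of existing records is supplied by the caller instead of being looked up, and resolver options such as ndots are ignored.

```python
def parse_search(resolv_conf_text):
    """Return the domains from the last `search` line of a resolv.conf body."""
    domains = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            domains = fields[1:]  # a later search line replaces earlier ones
    return domains


def candidates(short_name, search_domains):
    """FQDNs that would be tried, in order, for a name without dots."""
    return [f"{short_name}.{domain}" for domain in search_domains]


def resolve_short(short_name, search_domains, known_records):
    """Return the first candidate present in known_records, else None."""
    for fqdn in candidates(short_name, search_domains):
        if fqdn in known_records:
            return fqdn
    return None
```

With only the two wmflabs search domains, tools-k8s-ingress-1 yields no match against a record set containing just the .eqiad1.wikimedia.cloud name. Prepending tools.eqiad1.wikimedia.cloud to the search list is one possible adjustment that lets the short name resolve.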

Change 626401 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] toolforge_canary_list.txt: use new .eqiad1.wikimedia.cloud names

https://gerrit.wikimedia.org/r/626401

Change 626450 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] wmcs pdns recursors: add zone forwarding for .cloud lookups

https://gerrit.wikimedia.org/r/626450

Change 626450 merged by Andrew Bogott:
[operations/puppet@production] wmcs pdns recursors: add zone forwarding for .cloud lookups

https://gerrit.wikimedia.org/r/626450

Change 626457 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] toolforge_canary_list.txt: use new .eqiad1.wikimedia.cloud names

https://gerrit.wikimedia.org/r/626457

Change 626401 abandoned by Andrew Bogott:
[operations/puppet@production] toolforge_canary_list.txt: use new .eqiad1.wikimedia.cloud names

Reason:

https://gerrit.wikimedia.org/r/626401

Change 626464 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] tools-clush-generator: use eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/626464

Change 626465 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] scap: add support for .eqiad1.wikimedia.cloud targets

https://gerrit.wikimedia.org/r/626465

Change 626466 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] trafficserver: update to use a .wikimedia.cloud dns name

https://gerrit.wikimedia.org/r/626466

Change 626467 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] base::remote_syslog: use .wikimedia.cloud naming for deployment-prep

https://gerrit.wikimedia.org/r/626467

Change 626468 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] toolschecker: use .eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/626468

Change 626465 merged by Andrew Bogott:
[operations/puppet@production] scap: add support for .eqiad1.wikimedia.cloud targets

https://gerrit.wikimedia.org/r/626465

Change 626464 merged by Andrew Bogott:
[operations/puppet@production] tools-clush-generator: use eqiad1.wikimedia.cloud

https://gerrit.wikimedia.org/r/626464

Change 626457 merged by Andrew Bogott:
[operations/puppet@production] toolforge_canary_list.txt: use new .eqiad1.wikimedia.cloud names

https://gerrit.wikimedia.org/r/626457

Change 626468 abandoned by Andrew Bogott:
[operations/puppet@production] toolschecker: use .eqiad1.wikimedia.cloud

Reason:
we aren't changing the db service names yet

https://gerrit.wikimedia.org/r/626468

Change 626467 merged by Andrew Bogott:
[operations/puppet@production] base::remote_syslog: use .wikimedia.cloud naming for deployment-prep

https://gerrit.wikimedia.org/r/626467

Change 626466 merged by Andrew Bogott:
[operations/puppet@production] trafficserver: update to use a .wikimedia.cloud dns name

https://gerrit.wikimedia.org/r/626466