
Rename and re-assign cloud dns servers
Closed, Resolved, Public

Description

We're going to switch off labservices1001 and labservices1002 pretty soon. This is an opportunity to create new 'cloud-' name server addresses and move all the load onto cloudservices1003 and cloudservices1004.

The actual DNS content is synced between all four servers, so we can safely switch things over gradually as the DNS updates propagate.

before

labservices1001

208.80.155.117: labservices1001.wikimedia.org, labs-ns0.wikimedia.org
208.80.155.118: labs-recursor0.wikimedia.org

labservices1002

208.80.154.12: labservices1002.wikimedia.org, labs-ns1.wikimedia.org
208.80.154.20: labs-recursor1.wikimedia.org

cloudservices1003

208.80.154.135: cloudservices1003.wikimedia.org, labs-ns2.wikimedia.org
208.80.154.143: labs-recursor2.wikimedia.org

cloudservices1004

208.80.154.11: cloudservices1004.wikimedia.org, labs-ns3.wikimedia.org
208.80.154.24: labs-recursor3.wikimedia.org

cloud vm resolv.conf:

nameserver 208.80.155.118
nameserver 208.80.154.20

after

labservices1001

208.80.155.117: labservices1001.wikimedia.org

labservices1002

208.80.154.12: labservices1002.wikimedia.org

cloudservices1003

208.80.154.135: cloudservices1003.wikimedia.org, cloud-ns0.wikimedia.org
208.80.154.143: cloud-recursor0.wikimedia.org

cloudservices1004

208.80.154.11: cloudservices1004.wikimedia.org, cloud-ns1.wikimedia.org
208.80.154.24: cloud-recursor1.wikimedia.org

cloud vm resolv.conf:

nameserver 208.80.154.143
nameserver 208.80.154.24
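
The renumbering is easy to misread (labs-ns2 becomes cloud-ns0, not cloud-ns2), so here is a minimal sketch that derives the rename plan from the two listings above. The dictionaries only copy the service names shown in this description; the function name `rename_plan` is made up for illustration, and the host names themselves (labservices*/cloudservices*) are left out because they don't change.

```python
# Service names per IP, copied from the "before" and "after" listings above.
BEFORE = {
    "208.80.155.117": "labs-ns0",
    "208.80.155.118": "labs-recursor0",
    "208.80.154.12": "labs-ns1",
    "208.80.154.20": "labs-recursor1",
    "208.80.154.135": "labs-ns2",
    "208.80.154.143": "labs-recursor2",
    "208.80.154.11": "labs-ns3",
    "208.80.154.24": "labs-recursor3",
}
AFTER = {
    "208.80.154.135": "cloud-ns0",
    "208.80.154.143": "cloud-recursor0",
    "208.80.154.11": "cloud-ns1",
    "208.80.154.24": "cloud-recursor1",
}


def rename_plan(before, after):
    """Return (ip, old_name, new_name) tuples; new_name is None if the name is retired."""
    return [(ip, old, after.get(ip)) for ip, old in before.items()]


if __name__ == "__main__":
    for ip, old, new in rename_plan(BEFORE, AFTER):
        print(f"{ip:16} {old:16} -> {new or '(retired)'}")
```

Names on the labservices hosts come out as retired; names on the cloudservices hosts shift down by two.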

Event Timeline

Change 504490 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] cloud: add new cloud-ns and cloud-recursor names to cloudservices hosts

https://gerrit.wikimedia.org/r/504490

Change 504490 merged by Andrew Bogott:
[operations/dns@master] cloud: add new cloud-ns and cloud-recursor names to cloudservices hosts

https://gerrit.wikimedia.org/r/504490

Change 504572 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] cloud dns: move primary services to cloud-ns0 and cloud-ns1

https://gerrit.wikimedia.org/r/504572

Change 504580 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] cloud VMS: move primary resolvers to cloud-recursor0/1

https://gerrit.wikimedia.org/r/504580

Change 504580 merged by Andrew Bogott:
[operations/puppet@production] cloud VMS: move primary resolvers to cloud-recursor0/1

https://gerrit.wikimedia.org/r/504580

Most VMs are now using the new recursors. 120+ had broken puppet and so didn't pick up the change... I fixed puppet myself on most of them but there are still 30-some needing fixes, for which I've opened a bunch of bugs. Worst case I guess we can just sed the new IP into each of the broken ones.
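
For the worst-case fallback, a rough sketch of the substitution in Python rather than sed. It assumes the old and new recursors are paired in the order they appear in the resolv.conf listings in the description; that pairing, and the names `OLD_TO_NEW` and `update_resolv_conf`, are assumptions for illustration and not what puppet actually does.

```python
# Old recursor IP -> new recursor IP, paired by position in the
# "before" and "after" resolv.conf listings (the pairing is an assumption).
OLD_TO_NEW = {
    "208.80.155.118": "208.80.154.143",
    "208.80.154.20": "208.80.154.24",
}


def update_resolv_conf(text, mapping=OLD_TO_NEW):
    """Rewrite 'nameserver <old ip>' lines; leave every other line untouched."""
    out = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "nameserver" and parts[1] in mapping:
            line = f"nameserver {mapping[parts[1]]}"
        out.append(line)
    # Preserve the trailing newline only if the input had one.
    return "\n".join(out) + ("\n" if text.endswith("\n") else "")


if __name__ == "__main__":
    sample = "nameserver 208.80.155.118\nnameserver 208.80.154.20\n"
    print(update_resolv_conf(sample), end="")
```

On a VM with working puppet the file would be overwritten on the next run anyway, so this is only useful on hosts where puppet is still broken.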

Remaining steps are...

  • merge https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/504572/
  • update markmonitor T221240 -- PENDING
  • Apply the new pool configs on cloudservices1003 and 1004 so they don't try to sync with labservices1001/1002 anymore
  • Update SOA records to use the new cloud-ns0 name (this will involve mucking in the designate database)
  • Shut down services on labservices1001/1002, test, rebuild them with role::spare, etc.

Change 504572 merged by Andrew Bogott:
[operations/puppet@production] cloud dns: move primary services to cloud-ns0 and cloud-ns1

https://gerrit.wikimedia.org/r/504572

With RIPE's 2-day TTL, we should leave the old labs-ns0/1 servers up until 48 hours from this timestamp. Markmonitor will have been updated well before then.
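
The arithmetic is trivial, but for the record, a small sketch of the cutoff calculation. The merge time used in the example is a placeholder and not the real timestamp of this comment; `earliest_shutdown` is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone


def earliest_shutdown(merged_at, ttl=timedelta(days=2)):
    """Earliest time by which cached records with the given TTL have expired."""
    return merged_at + ttl


if __name__ == "__main__":
    # Placeholder timestamp for illustration only.
    merged_at = datetime(2019, 1, 1, 12, 0, tzinfo=timezone.utc)
    print(earliest_shutdown(merged_at).isoformat())
```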

Change 506436 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] Remove delegation of 208.80.155.128-255

https://gerrit.wikimedia.org/r/506436

Change 506436 abandoned by Andrew Bogott:
Remove delegation of 208.80.155.128-255

Reason:
dropping in favor of https://gerrit.wikimedia.org/r/#/c/operations/dns/+/505478/

https://gerrit.wikimedia.org/r/506436

Change 505478 had a related patch set uploaded (by Andrew Bogott; owner: Alex Monk):
[operations/dns@master] Remove old labs 'main' region in-addr.arpa delegation

https://gerrit.wikimedia.org/r/505478

Change 505478 merged by Andrew Bogott:
[operations/dns@master] Remove old labs 'main' region in-addr.arpa delegation

https://gerrit.wikimedia.org/r/505478

Change 506686 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/puppet@production] Temporary hack: turn off designate on labservices1001

https://gerrit.wikimedia.org/r/506686

Change 506686 merged by Andrew Bogott:
[operations/puppet@production] Temporary hack: turn off designate on labservices1001

https://gerrit.wikimedia.org/r/506686

Change 510546 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] Remove labs-ns and labs-recursor names

https://gerrit.wikimedia.org/r/510546

Change 510546 merged by Andrew Bogott:
[operations/dns@master] Remove labs-ns and labs-recursor names

https://gerrit.wikimedia.org/r/510546

Change 510583 had a related patch set uploaded (by Andrew Bogott; owner: Andrew Bogott):
[operations/dns@master] Add ptr records for cloud-recursor{0,1}

https://gerrit.wikimedia.org/r/510583

Change 510583 merged by Andrew Bogott:
[operations/dns@master] Add ptr records for cloud-recursor{0,1}

https://gerrit.wikimedia.org/r/510583
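
For reference, a minimal sketch of what the PTR names for the two recursors look like, using the addresses from the task description. It only computes the reverse-lookup names with the standard `ipaddress` module; how the records are actually laid out in operations/dns is not shown in this task, and `ptr_records` is a hypothetical helper.

```python
import ipaddress

# Addresses taken from the "after" listing in the task description.
RECURSORS = {
    "cloud-recursor0.wikimedia.org": "208.80.154.143",
    "cloud-recursor1.wikimedia.org": "208.80.154.24",
}


def ptr_records(hosts):
    """Map each in-addr.arpa owner name to its fully qualified target name."""
    return {
        ipaddress.ip_address(ip).reverse_pointer: name + "."
        for name, ip in hosts.items()
    }


if __name__ == "__main__":
    for owner, target in ptr_records(RECURSORS).items():
        print(f"{owner}. PTR {target}")
```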