Migrate ldap/corp replicas to Stretch/Buster
Closed, Resolved (Public)

Description

dubnium.wikimedia.org and pollux.wikimedia.org are the OpenLDAP replicas of the OIT directory. Given that these are Ganeti instances and only have a single user, it's probably best to create a new cluster, test it, and then switch the MXes to use it instead of dubnium/pollux. Let's maybe also use a name like ldapcorp1001 to make the DC naming more visible.

Event Timeline

ArielGlenn triaged this task as Medium priority. Jun 11 2019, 7:58 AM

Change 553323 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/dns@master] Switch ldap-corp.codfw.wikimedia.org to ldap-corp2001

https://gerrit.wikimedia.org/r/553323

Change 553323 merged by Muehlenhoff:
[operations/dns@master] Switch ldap-corp.codfw.wikimedia.org to ldap-corp2001

https://gerrit.wikimedia.org/r/553323

Change 554852 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/dns@master] Switch ldap-corp.eqiad.wikimedia.org to ldap-corp1001

https://gerrit.wikimedia.org/r/554852

Change 554852 merged by Muehlenhoff:
[operations/dns@master] Switch ldap-corp.eqiad.wikimedia.org to ldap-corp1001

https://gerrit.wikimedia.org/r/554852

Change 555990 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Turn old LDAP replicas into spares

https://gerrit.wikimedia.org/r/555990

Change 555990 merged by Muehlenhoff:
[operations/puppet@production] Turn old LDAP replicas into spares

https://gerrit.wikimedia.org/r/555990

Mentioned in SAL (#wikimedia-operations) [2019-12-10T10:13:52Z] <moritzm> stopping slapd on dubnium/pollux following application of the spare role T224557

cookbooks.sre.hosts.decommission executed by jmm@cumin2001 for hosts: pollux.wikimedia.org

  • pollux.wikimedia.org (FAIL)
    • Downtimed host on Icinga
    • No management interface found (likely a VM)
    • Unable to connect to the host, wipe of bootloaders will not be performed: Cumin execution failed (exit_code=2)
    • Failed to shutdown, manual intervention required: Cumin execution failed (exit_code=2)
    • Set Netbox status on VM not yet supported: manual intervention required
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

ERROR: some step on some host failed, check the bolded items above

cookbooks.sre.hosts.decommission executed by jmm@cumin1001 for hosts: dubnium.wikimedia.org

  • dubnium.wikimedia.org (FAIL)
    • Downtimed host on Icinga
    • No management interface found (likely a VM)
    • Unable to connect to the host, wipe of bootloaders will not be performed: Cumin execution failed (exit_code=2)
    • Failed to shutdown, manual intervention required: Cumin execution failed (exit_code=2)
    • Set Netbox status on VM not yet supported: manual intervention required
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

ERROR: some step on some host failed, check the bolded items above

Change 560383 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/puppet@production] Remove remaining Puppet refs of old jessie LDAP corp servers

https://gerrit.wikimedia.org/r/560383

Change 560383 merged by Muehlenhoff:
[operations/puppet@production] Remove remaining Puppet refs of old jessie LDAP corp servers

https://gerrit.wikimedia.org/r/560383

This is complete. The new systems based on buster are ldap-corp1001 and ldap-corp2001, and dubnium/pollux have been decommissioned.

ayounsi subscribed.

Not sure if I'm re-opening the proper task, but looks relevant.

dubnium/pollux are still present in DNS, while I don't think they should be (and they don't reply to pings).
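For anyone wanting to double-check leftovers like this after a decommission, here is a minimal sketch that reports which hostnames still resolve. The helper names `resolves` and `stale_dns_entries` are hypothetical, not part of any existing tooling. It only checks IPv4 address records via the standard library and says nothing about reverse (PTR) entries.

```python
import socket
from typing import Callable, Iterable, List

# Hostnames taken from this task; everything else here is illustrative.
OLD_HOSTS = ["dubnium.wikimedia.org", "pollux.wikimedia.org"]


def resolves(host: str, lookup: Callable[[str], str] = socket.gethostbyname) -> bool:
    """Return True if `host` still resolves to an IPv4 address."""
    try:
        lookup(host)
        return True
    except socket.gaierror:
        # Raised by socket.gethostbyname when the name cannot be resolved.
        return False


def stale_dns_entries(
    hosts: Iterable[str], lookup: Callable[[str], str] = socket.gethostbyname
) -> List[str]:
    """Return the subset of `hosts` that still has a DNS entry."""
    return [host for host in hosts if resolves(host, lookup)]


if __name__ == "__main__":
    for host in stale_dns_entries(OLD_HOSTS):
        print(f"{host} still resolves; its DNS entry may need removing")
```

The `lookup` parameter exists only so the check can be exercised without touching the network; by default it uses the system resolver.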

Change 597099 had a related patch set uploaded (by Muehlenhoff; owner: Muehlenhoff):
[operations/dns@master] Remove DNS entries for dubnium/pollux

https://gerrit.wikimedia.org/r/597099

Change 597099 merged by Muehlenhoff:
[operations/dns@master] Remove DNS entries for dubnium/pollux

https://gerrit.wikimedia.org/r/597099

> Not sure if I'm re-opening the proper task, but looks relevant.
>
> dubnium/pollux are still present in DNS while I don't think they should (and they don't reply to pings).

Thanks, good catch! Fixed.