Configure dns and puppet repositories for new drmrs datacenter
Closed, Resolved, Public

Description

Due: Q4 FY2021

drmrs is the codename selected for our new edge DC in Marseille.
185.15.58.0/24 is the public subnet for the new site.
The datacenter numeric code is 6 (vs e.g. 3 for esams or 5 for eqsin).
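
As a quick sanity check when reviewing address assignments, membership in the public subnet above can be tested with Python's standard `ipaddress` module. This is only a sketch: the function name and the sample addresses are illustrative and are not real host allocations.

```python
import ipaddress

# Public subnet for drmrs, as given in the task description.
DRMRS_PUBLIC = ipaddress.ip_network("185.15.58.0/24")


def in_drmrs_public(addr: str) -> bool:
    """Return True if addr falls inside the drmrs public subnet."""
    return ipaddress.ip_address(addr) in DRMRS_PUBLIC


# Illustrative addresses only (hypothetical, not actual allocations).
print(in_drmrs_public("185.15.58.10"))  # True
print(in_drmrs_public("185.15.59.10"))  # False
print(DRMRS_PUBLIC.num_addresses)       # 256
```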

https://wikitech.wikimedia.org/wiki/Infrastructure_naming_conventions should probably be updated for drmrs as well!
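
The numeric code shows up as the first digit of host numbers in this task (cp6015, lvs6001, bast6001). A minimal sketch of checking that a hostname's number agrees with its site label follows. It assumes a four-digit host number whose leading digit is the site code, and it only knows the three codes quoted above; the helper names are hypothetical.

```python
import re
from typing import Optional

# Site codes stated in this task only; other sites are intentionally left out.
SITE_BY_CODE = {"3": "esams", "5": "eqsin", "6": "drmrs"}

# Assumed pattern: lowercase role prefix, four-digit number, optional ".<site>.wmnet".
HOST_RE = re.compile(r"^[a-z]+(?P<num>\d{4})(?:\.(?P<site>[a-z]+)\.wmnet)?$")


def site_for_host(hostname: str) -> Optional[str]:
    """Return the site implied by the first digit of the host number, if known."""
    match = HOST_RE.match(hostname)
    if match is None:
        return None
    return SITE_BY_CODE.get(match.group("num")[0])


def name_matches_site(fqdn: str) -> bool:
    """True when the numeric code and the site label in the FQDN agree."""
    match = HOST_RE.match(fqdn)
    if match is None or match.group("site") is None:
        return False
    return site_for_host(fqdn) == match.group("site")


print(site_for_host("cp6015.drmrs.wmnet"))       # drmrs
print(name_matches_site("lvs6001.drmrs.wmnet"))  # True
print(name_matches_site("cp6015.esams.wmnet"))   # False
```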

Details

Project | Branch | Lines +/-
operations/puppet | production | +10 -1
operations/puppet | production | +6 -0
operations/puppet | production | +1 -0
operations/puppet | production | +7 -2
operations/puppet | production | +14 -18
operations/puppet | production | +8 -0
operations/puppet | production | +5 -0
operations/puppet | production | +17 -1
operations/puppet | production | +1 -1
operations/puppet | production | +2 -5
operations/puppet | production | +112 -3
operations/mediawiki-config | master | +7 -1
operations/puppet | production | +1 -1
operations/homer/public | master | +5 -1
operations/puppet | production | +8 -0
operations/puppet | production | +15 -16
operations/dns | master | +13 -0
operations/puppet | production | +13 -0
operations/dns | master | +1 -1
operations/puppet | production | +1 -1
operations/puppet | production | +6 -1
operations/dns | master | +0 -4
operations/dns | master | +1 -2
operations/puppet | production | +4 -4
operations/puppet | production | +6 -0
operations/puppet | production | +1 -1
operations/puppet | production | +2 -1
operations/puppet | production | +5 -0
operations/puppet | production | +6 -1
operations/puppet | production | +5 -0
operations/dns | master | +2 -2
operations/puppet | production | +1 -1
labs/private | master | +6 -0
operations/puppet | production | +44 -0
operations/puppet | production | +26 -0
operations/puppet | production | +2 -0
operations/puppet | production | +21 -2
operations/puppet | production | +50 -2
operations/puppet | production | +7 -0
operations/software/pywmflib | master | +2 -1
operations/puppet | production | +4 -4
operations/puppet | production | +5 -0
operations/puppet | production | +24 -0
operations/puppet | production | +11 -3
operations/puppet | production | +2 -2
operations/puppet | production | +108 -0
operations/dns | master | +3 -0
operations/puppet | production | +4 -0
operations/puppet | production | +6 -3
operations/puppet | production | +2 -0
operations/puppet | production | +24 -0
operations/dns | master | +5 -1
operations/puppet | production | +1 -1
operations/puppet | production | +56 -0
operations/puppet | production | +0 -24
operations/puppet | production | +22 -0
operations/dns | master | +12 -12
operations/dns | master | +106 -1
operations/dns | master | +95 -1
operations/puppet | production | +3 -0
operations/dns | master | +78 -1
operations/puppet | production | +2 -1
operations/puppet | production | +1 -4
operations/puppet | production | +4 -1
operations/puppet | production | +1 -0
operations/puppet | production | +5 -0
operations/puppet | production | +1 -0
operations/puppet | production | +1 -0
operations/puppet | production | +1 -1
operations/puppet | production | +1 -0
operations/puppet | production | +3 -1
operations/puppet | production | +1 -0
operations/puppet | production | +1 -0
operations/puppet | production | +1 -0
operations/puppet | production | +5 -0
operations/puppet | production | +1 -0
operations/puppet | production | +2 -0
operations/puppet | production | +2 -0
operations/puppet | production | +3 -1
operations/puppet | production | +1 -0
operations/puppet | production | +2 -0
operations/puppet | production | +92 -5
operations/puppet | production | +2 -0
operations/puppet | production | +1 -0
operations/puppet | production | +25 -0

Event Timeline

There are a very large number of changes, so older changes are hidden.

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6015.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6014.drmrs.wmnet with OS buster completed:

  • cp6014 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111170856_mmandere_12979_cp6014.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged
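
Each completed reimage reports a log path whose file name appears to encode a start timestamp, the operator, a numeric run identifier and the short host name. The sketch below splits such a name into those parts. The field meanings are inferred from the entries in this timeline rather than from spicerack documentation, the `ReimageLog` type is hypothetical, and the timestamp's time zone is not stated in the log.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple


class ReimageLog(NamedTuple):
    started: datetime  # assumed layout: YYYYMMDDHHMM
    user: str
    run_id: int        # numeric field of unconfirmed meaning (possibly a PID)
    host: str


def parse_reimage_log(path: str) -> ReimageLog:
    """Split '<stamp>_<user>_<id>_<host>.out' into its four parts."""
    stem = PurePosixPath(path).stem
    # Assumes the user name itself contains no underscore.
    stamp, user, run_id, host = stem.split("_", 3)
    return ReimageLog(datetime.strptime(stamp, "%Y%m%d%H%M"), user, int(run_id), host)


log = parse_reimage_log(
    "/var/log/spicerack/sre/hosts/reimage/202111170856_mmandere_12979_cp6014.out"
)
print(log.host, log.user, log.started.isoformat())
```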

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6016.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6015.drmrs.wmnet with OS buster completed:

  • cp6015 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111170913_mmandere_31025_cp6015.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6016.drmrs.wmnet with OS buster completed:

  • cp6016 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111170939_mmandere_25408_cp6016.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Change 739553 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: define dual ganeti clusters

https://gerrit.wikimedia.org/r/739553

Change 739553 merged by BBlack:

[operations/puppet@production] drmrs: define dual ganeti clusters

https://gerrit.wikimedia.org/r/739553
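
Because uploads and merges are interleaved throughout this timeline, it can be hard to see which changes are still open. The following sketch folds the bot lines into the last known state per change number. The regular expression is written against the three line shapes visible in this task (uploaded, merged, abandoned) and is an assumption about the format, not a Gerrit or Phabricator API.

```python
import re

# Matches the bot's headline for a change event, as it appears in this timeline.
EVENT_RE = re.compile(
    r"^Change (?P<id>\d+) "
    r"(?P<action>had a related patch set uploaded|merged|abandoned)"
    r"(?: \(by (?P<uploader>[^;]+); author: [^)]+\)| by (?P<actor>[^:]+)):$"
)


def latest_status(lines):
    """Map change number -> (last action seen, who performed it)."""
    status = {}
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match is None:
            continue  # titles, URLs and cookbook output are ignored
        action = match.group("action")
        if action.startswith("had"):
            action = "uploaded"
        who = match.group("uploader") or match.group("actor")
        status[int(match.group("id"))] = (action, who)
    return status


sample = [
    "Change 739553 had a related patch set uploaded (by BBlack; author: BBlack):",
    "[operations/puppet@production] drmrs: define dual ganeti clusters",
    "Change 739553 merged by BBlack:",
    "Change 739586 had a related patch set uploaded (by BBlack; author: BBlack):",
]
for change, (action, who) in latest_status(sample).items():
    print(change, action, who)
```

Filtering the result for entries still marked `uploaded` lists the changes that have no merge or abandon event in the excerpt being parsed.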

Change 739584 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs ganeti: add cluster cert public keys

https://gerrit.wikimedia.org/r/739584

Change 739584 merged by BBlack:

[operations/puppet@production] drmrs ganeti: add cluster cert public keys

https://gerrit.wikimedia.org/r/739584

Change 739586 had a related patch set uploaded (by BBlack; author: BBlack):

[labs/private@master] Add dummy private keys for drmrs ganeti

https://gerrit.wikimedia.org/r/739586

Change 739588 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] ganeti6: switch to ganeti role

https://gerrit.wikimedia.org/r/739588

Change 739588 merged by BBlack:

[operations/puppet@production] ganeti6: switch to ganeti role

https://gerrit.wikimedia.org/r/739588

Change 739594 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] drmrs: include netbox svc file

https://gerrit.wikimedia.org/r/739594

Change 739594 merged by BBlack:

[operations/dns@master] drmrs: include netbox svc file

https://gerrit.wikimedia.org/r/739594

Change 739757 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Add drmrs lvs instances

https://gerrit.wikimedia.org/r/739757

Change 739757 merged by MMandere:

[operations/puppet@production] site: Add drmrs lvs instances

https://gerrit.wikimedia.org/r/739757

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host lvs6001.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host lvs6001.drmrs.wmnet with OS buster completed:

  • lvs6001 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111181017_mmandere_20302_lvs6001.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host lvs6002.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host lvs6003.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host lvs6002.drmrs.wmnet with OS buster completed:

  • lvs6002 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111181105_mmandere_29884_lvs6002.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host lvs6003.drmrs.wmnet with OS buster completed:

  • lvs6003 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202111181127_mmandere_31664_lvs6003.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB
    • Updated Netbox status planned -> staged

Change 747856 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Add drmrs bastion host

https://gerrit.wikimedia.org/r/747856

Change 747856 merged by MMandere:

[operations/puppet@production] site: Add drmrs bastion host

https://gerrit.wikimedia.org/r/747856

Change 748125 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] bast6001: set dhcp macaddr for ganeti vm

https://gerrit.wikimedia.org/r/748125

Change 748125 merged by BBlack:

[operations/puppet@production] bast6001: set dhcp macaddr for ganeti vm

https://gerrit.wikimedia.org/r/748125

Change 748151 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] bast6001: add to bastion_hosts

https://gerrit.wikimedia.org/r/748151

Change 748151 merged by BBlack:

[operations/puppet@production] bast6001: add to bastion_hosts

https://gerrit.wikimedia.org/r/748151

Change 748174 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] install6001: add site.pp entry

https://gerrit.wikimedia.org/r/748174

Change 748175 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] install6001: use for drmrs installs

https://gerrit.wikimedia.org/r/748175

Change 748178 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] install6001: use as proxy for drmrs

https://gerrit.wikimedia.org/r/748178

Change 748174 merged by BBlack:

[operations/puppet@production] install6001: add site.pp entry

https://gerrit.wikimedia.org/r/748174

Change 748182 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] install6001: dhcp entry

https://gerrit.wikimedia.org/r/748182

Change 748182 merged by BBlack:

[operations/puppet@production] install6001: dhcp entry

https://gerrit.wikimedia.org/r/748182

Change 748175 merged by BBlack:

[operations/puppet@production] install6001: use for drmrs installs

https://gerrit.wikimedia.org/r/748175

Change 748178 merged by BBlack:

[operations/dns@master] install6001: use as proxy for drmrs

https://gerrit.wikimedia.org/r/748178

Change 748215 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] drmrs: remove fake prometheus6001 dns entry

https://gerrit.wikimedia.org/r/748215

Change 748215 merged by BBlack:

[operations/dns@master] drmrs: remove fake prometheus6001 dns entry

https://gerrit.wikimedia.org/r/748215

Change 748224 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] prometheus6001: macaddr and site.pp

https://gerrit.wikimedia.org/r/748224

Change 748225 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] prometheus6001: add to global node list

https://gerrit.wikimedia.org/r/748225

Change 748224 merged by BBlack:

[operations/puppet@production] prometheus6001: macaddr and site.pp

https://gerrit.wikimedia.org/r/748224

Change 748225 merged by BBlack:

[operations/puppet@production] prometheus6001: add to global node list

https://gerrit.wikimedia.org/r/748225

Change 748227 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] Add prometheus.svc.drmrs.wmnet alias

https://gerrit.wikimedia.org/r/748227

Change 748227 merged by BBlack:

[operations/dns@master] Add prometheus.svc.drmrs.wmnet alias

https://gerrit.wikimedia.org/r/748227

Change 748228 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] Add drmrs prometheus to various global config

https://gerrit.wikimedia.org/r/748228

Change 748228 merged by BBlack:

[operations/puppet@production] Add drmrs prometheus to various global config

https://gerrit.wikimedia.org/r/748228

Change 748728 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/dns@master] drmrs: include Netbox files for LVS subnets

https://gerrit.wikimedia.org/r/748728

Change 748728 merged by BBlack:

[operations/dns@master] drmrs: include Netbox files for LVS subnets

https://gerrit.wikimedia.org/r/748728

Change 748746 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: configure ats-tls params

https://gerrit.wikimedia.org/r/748746

Change 748747 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] cloudgw: add newly-allocated drmrs IPs

https://gerrit.wikimedia.org/r/748747

Change 748752 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: configure lvs and public IPs

https://gerrit.wikimedia.org/r/748752

Change 748757 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: add to global datacenter list

https://gerrit.wikimedia.org/r/748757

Change 748746 merged by BBlack:

[operations/puppet@production] drmrs: configure ats-tls params

https://gerrit.wikimedia.org/r/748746

Change 748747 merged by BBlack:

[operations/puppet@production] cloudgw: add newly-allocated drmrs IPs

https://gerrit.wikimedia.org/r/748747

Change 748775 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/homer/public@master] Add drmrs addresses

https://gerrit.wikimedia.org/r/748775

Change 748790 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: ncredir puppetization

https://gerrit.wikimedia.org/r/748790

Change 748775 merged by jenkins-bot:

[operations/homer/public@master] Add drmrs addresses

https://gerrit.wikimedia.org/r/748775

Change 751952 had a related patch set uploaded (by Majavah; author: Majavah):

[operations/mediawiki-config@master] reverse-proxy: add drmrs ranges

https://gerrit.wikimedia.org/r/751952

Change 751952 merged by jenkins-bot:

[operations/mediawiki-config@master] reverse-proxy: add drmrs ranges

https://gerrit.wikimedia.org/r/751952

Mentioned in SAL (#wikimedia-operations) [2022-01-11T14:25:36Z] <taavi@deploy1002> Synchronized wmf-config/reverse-proxy.php: Config: [[gerrit:751952|reverse-proxy: add drmrs ranges (T282787)]] (duration: 01m 36s)

Change 748752 merged by MMandere:

[operations/puppet@production] drmrs: lvs/cp puppetization

https://gerrit.wikimedia.org/r/748752

Change 756613 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Add drmrs ncredir host

https://gerrit.wikimedia.org/r/756613

Change 756627 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: connect lvs bgp to switches

https://gerrit.wikimedia.org/r/756627

Change 756627 merged by BBlack:

[operations/puppet@production] drmrs host bgp fixups

https://gerrit.wikimedia.org/r/756627

Change 756639 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] bird anycast: fix defaulting to local gateway

https://gerrit.wikimedia.org/r/756639

Change 756639 merged by BBlack:

[operations/puppet@production] bird anycast: fix defaulting to local gateway

https://gerrit.wikimedia.org/r/756639

Change 756613 merged by MMandere:

[operations/puppet@production] site: Add drmrs ncredir host

https://gerrit.wikimedia.org/r/756613

Change 756953 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] install_server: Add drmrs ncredir first instance

https://gerrit.wikimedia.org/r/756953

Change 756953 merged by MMandere:

[operations/puppet@production] install_server: Add drmrs ncredir first instance

https://gerrit.wikimedia.org/r/756953

Change 757024 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] install_server: Add drmrs ncredir second instance

https://gerrit.wikimedia.org/r/757024

Change 757024 merged by MMandere:

[operations/puppet@production] install_server: Add drmrs ncredir second instance

https://gerrit.wikimedia.org/r/757024

Change 748790 abandoned by BBlack:

[operations/puppet@production] drmrs: ncredir puppetization

Reason:

already done elsewhere

https://gerrit.wikimedia.org/r/748790

Change 748757 merged by BBlack:

[operations/puppet@production] drmrs: various minor global config

https://gerrit.wikimedia.org/r/748757

Change 760613 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] Add netflow6001 to kafka custom ferm

https://gerrit.wikimedia.org/r/760613

Change 760614 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] Add ops-drmrs to alertmanager config

https://gerrit.wikimedia.org/r/760614

Change 760615 had a related patch set uploaded (by BBlack; author: BBlack):

[operations/puppet@production] drmrs: add vk delivery error alerting

https://gerrit.wikimedia.org/r/760615

Change 760613 merged by BBlack:

[operations/puppet@production] Add netflow6001 to kafka custom ferm

https://gerrit.wikimedia.org/r/760613

Change 760614 merged by BBlack:

[operations/puppet@production] Add ops-drmrs to alertmanager config

https://gerrit.wikimedia.org/r/760614

Change 760615 merged by BBlack:

[operations/puppet@production] drmrs: add vk delivery error alerting

https://gerrit.wikimedia.org/r/760615

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6009.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6009.drmrs.wmnet with OS buster completed:

  • cp6009 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6009.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6009.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6009.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203151810_sukhe_1359834_cp6009.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

    • Updated Netbox data from PuppetDB
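
As printed, the restore commands select only on `dc`, `cluster` and `service`, so they appear to match every host carrying those tags rather than only the reimaged one. A host-scoped variant can be derived from the JSON records the cookbook logged before depooling. The sketch below assumes conftool's `name=` selector can be combined with the tag selectors, and `restore_commands` is a hypothetical helper, not part of the cookbook; checking a selector with `get` before running `set` is a sensible precaution.

```python
import json


def restore_commands(json_lines):
    """Build one host-scoped confctl command per logged service record."""
    commands = []
    for line in json_lines:
        record = json.loads(line)
        tags = record.pop("tags")
        # The remaining key is the host name; its value holds the previous state.
        for host, state in record.items():
            selector = f"name={host},{tags}"
            commands.append(
                f"sudo confctl select '{selector}' set/pooled={state['pooled']}"
            )
    return commands


logged = [
    '{"cp6009.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, '
    '"tags": "dc=drmrs,cluster=cache_text,service=ats-be"}',
    '{"cp6009.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, '
    '"tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}',
]
for command in restore_commands(logged):
    print(command)
```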

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6010.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6010.drmrs.wmnet with OS buster completed:

  • cp6010 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6010.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6010.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6010.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203152026_sukhe_1375326_cp6010.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6011.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6011.drmrs.wmnet with OS buster completed:

  • cp6011 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6011.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6011.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6011.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203160011_sukhe_1403513_cp6011.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6012.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6012.drmrs.wmnet with OS buster completed:

  • cp6012 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6012.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6012.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6012.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161108_sukhe_1482638_cp6012.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6013.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6013.drmrs.wmnet with OS buster completed:

  • cp6013 (WARN)
    • Downtimed on Icinga
    • Set pooled=inactive for the following services on confctl:

{"cp6013.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6013.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6013.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161227_sukhe_1494418_cp6013.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6014.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6014.drmrs.wmnet with OS buster completed:

  • cp6014 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"cp6014.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6014.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6014.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161357_sukhe_1509133_cp6014.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6015.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6015.drmrs.wmnet with OS buster completed:

  • cp6015 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"cp6015.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6015.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6015.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • Removed previous downtime on Alertmanager (old OS)
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161446_sukhe_1518579_cp6015.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp6016.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp6016.drmrs.wmnet with OS buster completed:

  • cp6016 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"cp6016.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-be"}
{"cp6016.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}
{"cp6016.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, "tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}

  • Disabled Puppet
  • Removed from Puppet and PuppetDB if present
  • Deleted any existing Puppet certificate
  • Removed from Debmonitor if present
  • Forced PXE for next reboot
  • Host rebooted via IPMI
  • Host up (Debian installer)
  • Host up (new fresh buster OS)
  • Generated Puppet certificate
  • Signed new Puppet certificate
  • Run Puppet in NOOP mode to populate exported resources in PuppetDB
  • Found Nagios_host resource for this host in PuppetDB
  • Downtimed the new host on Icinga
  • Removed previous downtime on Alertmanager (old OS)
  • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202203161607_sukhe_1531550_cp6016.out
  • Checked BIOS boot parameters are back to normal
  • Rebooted
  • Automatic Puppet run was successful
  • Forced a re-check of all Icinga services for the host
  • Icinga status is not optimal, downtime not removed
  • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-be' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=varnish-fe' set/pooled=yes
sudo confctl select 'dc=drmrs,cluster=cache_text,service=ats-tls' set/pooled=yes

  • Updated Netbox data from PuppetDB
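As logged, the restore commands above select on dc, cluster and service only, so they appear to match every host carrying those tags rather than just the reimaged one. Here is a minimal sketch of how per-host commands could be derived from the JSON lines the cookbook prints. It assumes that adding a `name=<fqdn>` term to the selector is the intended scope and that only objects previously `pooled: yes` should be restored. The helper name `repool_commands` is hypothetical and not part of the cookbook or of conftool.

```python
import json


def repool_commands(confctl_lines):
    """Build one confctl command per logged object, scoped to its host.

    Assumption: each line is a JSON object with a "tags" key plus exactly
    one other key (the host FQDN), as in the cookbook output above.
    """
    commands = []
    for line in confctl_lines:
        obj = json.loads(line)
        tags = obj.pop("tags")
        # After removing "tags", the single remaining key is the hostname.
        ((host, state),) = obj.items()
        if state["pooled"] != "yes":
            # Only restore objects that were pooled before the reimage.
            continue
        # Assumption: a name= term narrows the selection to this host.
        selector = f"name={host},{tags}"
        commands.append(f"sudo confctl select '{selector}' set/pooled=yes")
    return commands


logged = [
    '{"cp6016.drmrs.wmnet": {"weight": 100, "pooled": "yes"}, '
    '"tags": "dc=drmrs,cluster=cache_text,service=ats-be"}',
    '{"cp6016.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, '
    '"tags": "dc=drmrs,cluster=cache_text,service=varnish-fe"}',
    '{"cp6016.drmrs.wmnet": {"weight": 1, "pooled": "yes"}, '
    '"tags": "dc=drmrs,cluster=cache_text,service=ats-tls"}',
]

for command in repool_commands(logged):
    print(command)
```

The sketch only prints the commands and does not run them. Review the output before executing anything against production.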

With the addition of drmrs to the DNS config in https://gerrit.wikimedia.org/r/c/operations/dns/+/771342, we're basically done with the task work here. There may be further commits, but they'll be part of the normal production flow, not initial site config!

Nice work all!