
Test HAProxy as WMF's CDN TLS terminator with real traffic
Closed, Resolved (Public)

Description

HAProxy is one of the candidates to replace ats-tls as the TLS terminator in the WMF caching infrastructure. Fully validating its performance and stability requires a test with real traffic.
Several requirements must be met before this test can run:

Details

Repo               Branch      Lines +/-
operations/puppet  production  +25 -0
operations/puppet  production  +1 -5
operations/puppet  production  +0 -930
operations/puppet  production  +11 -4
operations/puppet  production  +11 -5
operations/puppet  production  +11 -4
operations/puppet  production  +11 -5
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +13 -3
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/puppet  production  +11 -6
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/alerts  master      +16 -0
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +16 -2
operations/puppet  production  +12 -2
operations/puppet  production  +17 -1
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +11 -5
operations/puppet  production  +12 -2
operations/puppet  production  +12 -17
operations/puppet  production  +12 -2
operations/puppet  production  +2 -0
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +4 -0
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +12 -2
operations/puppet  production  +10 -8
operations/puppet  production  +10 -8
operations/puppet  production  +7 -7
operations/puppet  production  +19 -5
operations/puppet  production  +1 -0
operations/puppet  production  +11 -9
operations/puppet  production  +10 -8
operations/puppet  production  +10 -8
operations/puppet  production  +10 -8
operations/puppet  production  +10 -8
operations/puppet  production  +10 -8
operations/puppet  production  +10 -13
operations/puppet  production  +37 -0
operations/puppet  production  +5 -7
operations/puppet  production  +100 -0
operations/puppet  production  +13 -0
operations/puppet  production  +16 -14
operations/puppet  production  +4 -1
operations/puppet  production  +1 -1
operations/puppet  production  +24 -0
operations/puppet  production  +30 -13
operations/puppet  production  +1 -1
operations/puppet  production  +16 -1
operations/puppet  production  +108 -43
operations/puppet  production  +62 -6
operations/puppet  production  +8 -1
operations/puppet  production  +8 -1
operations/puppet  production  +2 -0
operations/puppet  production  +1 -1
operations/puppet  production  +1 -10
operations/puppet  production  +5 -1
operations/puppet  production  +4 -2
operations/puppet  production  +1 -0
operations/puppet  production  +7 -0
operations/puppet  production  +9 -1
operations/puppet  production  +1 -1
operations/puppet  production  +15 -0
operations/puppet  production  +20 -0
operations/puppet  production  +9 -1
operations/puppet  production  +286 -0
operations/puppet  production  +2 -1
operations/puppet  production  +2 -1
operations/puppet  production  +1 -1
operations/puppet  production  +2 -1
operations/puppet  production  +120 -10
operations/puppet  production  +9 -1
operations/puppet  production  +0 -48
operations/puppet  production  +9 -1
operations/puppet  production  +5 -4
operations/puppet  production  +7 -0
operations/puppet  production  +2 -2
operations/puppet  production  +10 -0
operations/puppet  production  +9 -1
operations/puppet  production  +1 -1
operations/puppet  production  +1 -1
operations/puppet  production  +53 -5
operations/puppet  production  +64 -2
operations/puppet  production  +1 -1
operations/puppet  production  +9 -1
operations/puppet  production  +7 -1
operations/puppet  production  +2 -2
operations/puppet  production  +1 -0
operations/puppet  production  +20 -0
operations/puppet  production  +6 -6
operations/puppet  production  +2 -1
operations/puppet  production  +4 -0
operations/puppet  production  +7 -1
operations/puppet  production  +20 -1
operations/puppet  production  +1 -1
operations/puppet  production  +1 -0
operations/puppet  production  +23 -0
operations/puppet  production  +266 -0
operations/puppet  production  +23 -1
operations/puppet  production  +3 -2
operations/puppet  production  +2 -0
operations/puppet  production  +6 -1
operations/puppet  production  +16 -0
operations/puppet  production  +1 -1
operations/puppet  production  +121 -0
operations/puppet  production  +10 -0
operations/puppet  production  +34 -0
operations/puppet  production  +12 -0
operations/puppet  production  +20 -0
operations/puppet  production  +46 -0
operations/puppet  production  +56 -1
operations/puppet  production  +150 -7
operations/puppet  production  +31 -1
operations/puppet  production  +96 -0
operations/puppet  production  +41 -5
operations/puppet  production  +41 -5
operations/puppet  production  +16 -10
operations/puppet  production  +15 -6
operations/puppet  production  +8 -23

Event Timeline


Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6014.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp3050.esams.wmnet with OS buster completed:

  • cp3050 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204070755_mmandere_65012_cp3050.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB
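The reimage reports in this timeline all share the same layout: a host line with an overall status in parentheses, followed by the individual steps. A minimal sketch for tallying outcomes across pasted reports could look like the following. The function name `reimage_outcomes` and the regular expression are illustrative assumptions inferred from the entries on this task; they are not part of Spicerack or the cookbook itself.

```python
import re

# Hypothetical helper: the report layout is inferred from the entries in this
# timeline, not taken from the cookbook's own code.
HOST_LINE = re.compile(r"^\s*•\s+(?P<host>cp\d+)\s+\((?P<status>[A-Z]+)\)\s*$")


def reimage_outcomes(text: str) -> dict[str, str]:
    """Map each host named in pasted reimage reports to its overall status."""
    outcomes = {}
    for line in text.splitlines():
        match = HOST_LINE.match(line)
        if match:
            outcomes[match.group("host")] = match.group("status")
    return outcomes


# Sample input trimmed from two reports on this task.
sample = """\
  • cp3050 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Icinga status is optimal
  • cp6014 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Icinga status is not optimal, downtime not removed
"""

print(reimage_outcomes(sample))
```

Hosts reported as WARN are the ones where the Icinga downtime was left in place, so they are likely the ones worth a manual check before pooling.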

Mentioned in SAL (#wikimedia-operations) [2022-04-07T09:01:11Z] <mmandere> pool cp3050 with HAProxy as TLS termination layer - T290005

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6014.drmrs.wmnet with OS buster completed:

  • cp6014 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204070809_mmandere_67914_cp6014.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-07T09:20:12Z] <mmandere> pool cp6014 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-07T09:25:26Z] <mmandere> depool cp3053 for reimage - T290005

Change 777848 merged by MMandere:

[operations/puppet@production] site: Reimage cp3053 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/777848

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp3053.esams.wmnet with OS buster

Mentioned in SAL (#wikimedia-operations) [2022-04-07T09:34:22Z] <mmandere> depool cp6006 for reimage - T290005

Change 777849 merged by MMandere:

[operations/puppet@production] site: Reimage cp6006 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/777849

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6006.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6006.drmrs.wmnet with OS buster completed:

  • cp6006 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204070939_mmandere_121256_cp6006.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-07T10:40:26Z] <mmandere> pool cp6006 with HAProxy as TLS termination layer - T290005
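The depool and pool SAL entries carry UTC timestamps, so the time each host spent out of rotation can be derived from them. The sketch below is one way to do that, assuming the entry format seen on this task; the function name `depooled_durations` and the regular expression are illustrative and not part of any WMF tooling.

```python
import re
from datetime import datetime, timedelta

# Hypothetical pattern: matches the SAL entry format as it appears on this task.
SAL_LINE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z\]\s+<\w+>\s+"
    r"(?P<action>depool|pool)\s+(?P<host>cp\d+)\b"
)


def depooled_durations(entries: list[str]) -> dict[str, timedelta]:
    """Pair each host's depool entry with its next pool entry."""
    started: dict[str, datetime] = {}
    durations: dict[str, timedelta] = {}
    for entry in entries:
        match = SAL_LINE.search(entry)
        if not match:
            continue
        ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S")
        host = match["host"]
        if match["action"] == "depool":
            started[host] = ts
        elif host in started:
            durations[host] = ts - started.pop(host)
    return durations


# The two cp6006 entries from this timeline.
entries = [
    "Mentioned in SAL (#wikimedia-operations) [2022-04-07T09:34:22Z] "
    "<mmandere> depool cp6006 for reimage - T290005",
    "Mentioned in SAL (#wikimedia-operations) [2022-04-07T10:40:26Z] "
    "<mmandere> pool cp6006 with HAProxy as TLS termination layer - T290005",
]

for host, duration in depooled_durations(entries).items():
    print(host, duration)
```

A pool entry with no earlier depool entry in the input is ignored, which matters here because the older part of the timeline is not shown.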

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp3053.esams.wmnet with OS buster completed:

  • cp3053 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204070933_mmandere_114086_cp3053.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-07T10:59:34Z] <mmandere> pool cp3053 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-07T11:23:30Z] <mmandere> depool cp3051 for reimage - T290005

Change 777850 merged by MMandere:

[operations/puppet@production] site: Reimage cp3051 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/777850

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp3051.esams.wmnet with OS buster

Mentioned in SAL (#wikimedia-operations) [2022-04-07T11:35:25Z] <mmandere> depool cp6013 for reimage - T290005

Change 777851 merged by MMandere:

[operations/puppet@production] site: Reimage cp6013 as cache::text_haproxy

https://gerrit.wikimedia.org/r/777851

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6013.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp3051.esams.wmnet with OS buster completed:

  • cp3051 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204071133_mmandere_230902_cp3051.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-07T12:32:22Z] <mmandere> pool cp3051 with HAProxy as TLS termination layer - T290005

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6013.drmrs.wmnet with OS buster completed:

  • cp6013 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204071145_mmandere_243882_cp6013.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-traffic) [2022-04-07T12:54:59Z] <mmandere> pool cp6013 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-07T12:58:56Z] <mmandere> depool cp6005 for reimage - T290005

Change 777852 merged by MMandere:

[operations/puppet@production] site: Reimage cp6005 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/777852

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6005.drmrs.wmnet with OS buster

Mentioned in SAL (#wikimedia-operations) [2022-04-07T13:13:46Z] <mmandere> depool cp6012 for reimage - T290005

Change 777853 merged by MMandere:

[operations/puppet@production] site: Reimage cp6012 as cache::text_haproxy

https://gerrit.wikimedia.org/r/777853

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6012.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6005.drmrs.wmnet with OS buster completed:

  • cp6005 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204071311_mmandere_328202_cp6005.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-07T14:08:43Z] <mmandere> pool cp6005 with HAProxy as TLS termination layer - T290005

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6012.drmrs.wmnet with OS buster completed:

  • cp6012 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204071320_mmandere_335449_cp6012.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-07T14:13:02Z] <mmandere> pool cp6012 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-07T14:19:22Z] <mmandere> depool cp6004 for reimage - T290005

Change 777854 merged by MMandere:

[operations/puppet@production] site: Reimage cp6004 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/777854

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6004.drmrs.wmnet with OS buster

Change 778300 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6011 as cache::text_haproxy

https://gerrit.wikimedia.org/r/778300

Change 778301 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6003 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/778301

Change 778302 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6010 as cache::text_haproxy

https://gerrit.wikimedia.org/r/778302

Change 778303 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6002 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/778303

Change 778304 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6009 as cache::text_haproxy

https://gerrit.wikimedia.org/r/778304

Change 778305 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6001 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/778305

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6004.drmrs.wmnet with OS buster completed:

  • cp6004 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204071424_mmandere_400225_cp6004.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-07T15:21:32Z] <mmandere> pool cp6004 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-08T07:12:40Z] <mmandere> depool cp6011 for reimage - T290005

Change 778300 merged by MMandere:

[operations/puppet@production] site: Reimage cp6011 as cache::text_haproxy

https://gerrit.wikimedia.org/r/778300

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6011.drmrs.wmnet with OS buster

Mentioned in SAL (#wikimedia-operations) [2022-04-08T07:21:47Z] <mmandere> depool cp6003 for reimage - T290005

Change 778301 merged by MMandere:

[operations/puppet@production] site: Reimage cp6003 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/778301

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6003.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6011.drmrs.wmnet with OS buster completed:

  • cp6011 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204080720_mmandere_671714_cp6011.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-08T08:26:34Z] <mmandere> pool cp6011 with HAProxy as TLS termination layer - T290005

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6003.drmrs.wmnet with OS buster completed:

  • cp6003 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204080728_mmandere_672261_cp6003.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-08T08:41:55Z] <mmandere> pool cp6003 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-08T08:48:03Z] <mmandere> depool cp6010 for reimage - T290005

Change 778302 merged by MMandere:

[operations/puppet@production] site: Reimage cp6010 as cache::text_haproxy

https://gerrit.wikimedia.org/r/778302

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6010.drmrs.wmnet with OS buster

Mentioned in SAL (#wikimedia-operations) [2022-04-08T09:02:10Z] <mmandere> depool cp6002 for reimage - T290005

Change 778303 merged by MMandere:

[operations/puppet@production] site: Reimage cp6002 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/778303

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6002.drmrs.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6002.drmrs.wmnet with OS buster completed:

  • cp6002 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204080913_mmandere_692372_cp6002.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6010.drmrs.wmnet with OS buster completed:

  • cp6010 (WARN)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204080856_mmandere_689142_cp6010.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-08T10:11:20Z] <mmandere> pool cp6010 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-08T10:18:40Z] <mmandere> pool cp6002 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-08T11:11:21Z] <mmandere> depool cp6009 for reimage - T290005

Change 778490 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6009 as cache::text_haproxy

https://gerrit.wikimedia.org/r/778490

Change 778491 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] site: Reimage cp6001 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/778491

Change 778304 merged by MMandere:

[operations/puppet@production] site: Reimage cp6009 as cache::text_haproxy

https://gerrit.wikimedia.org/r/778304

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6009.drmrs.wmnet with OS buster

Mentioned in SAL (#wikimedia-operations) [2022-04-08T12:15:45Z] <mmandere> depool cp6001 for reimage - T290005

Change 778305 merged by MMandere:

[operations/puppet@production] site: Reimage cp6001 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/778305

Cookbook cookbooks.sre.hosts.reimage was started by mmandere@cumin1001 for host cp6001.drmrs.wmnet with OS buster

Change 778490 abandoned by MMandere:

[operations/puppet@production] site: Reimage cp6009 as cache::text_haproxy

Reason:

Conflicts with already merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/778304

https://gerrit.wikimedia.org/r/778490

Change 778491 abandoned by MMandere:

[operations/puppet@production] site: Reimage cp6001 as cache::upload_haproxy

Reason:

Conflicts with already merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/778305

https://gerrit.wikimedia.org/r/778491

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6009.drmrs.wmnet with OS buster completed:

  • cp6009 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204081211_mmandere_720511_cp6009.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Cookbook cookbooks.sre.hosts.reimage started by mmandere@cumin1001 for host cp6001.drmrs.wmnet with OS buster completed:

  • cp6001 (PASS)
    • Downtimed on Icinga/Alertmanager
    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • Removed previous downtime on Alertmanager (old OS)
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202204081222_mmandere_721438_cp6001.out
    • Checked BIOS boot parameters are back to normal
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is optimal
    • Icinga downtime removed
    • Updated Netbox data from PuppetDB

Mentioned in SAL (#wikimedia-operations) [2022-04-08T13:16:01Z] <mmandere> pool cp6009 with HAProxy as TLS termination layer - T290005

Mentioned in SAL (#wikimedia-operations) [2022-04-08T13:20:54Z] <mmandere> pool cp6001 with HAProxy as TLS termination layer - T290005

Change 778989 had a related patch set uploaded (by MMandere; author: MMandere):

[operations/puppet@production] cache::varnish: Merge repeating host data to site data

https://gerrit.wikimedia.org/r/778989

Change 778989 merged by MMandere:

[operations/puppet@production] cache::varnish: Merge repeating host data to common data

https://gerrit.wikimedia.org/r/778989

Change 788893 had a related patch set uploaded (by Ssingh; author: Ssingh):

[operations/puppet@production] site: Reimage cp5002 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/788893

Change 788893 merged by Ssingh:

[operations/puppet@production] site: Reimage cp5002 as cache::upload_haproxy

https://gerrit.wikimedia.org/r/788893

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster executed with errors:

  • cp5002 (FAIL)
    • Downtimed on Icinga/Alertmanager
    • Set pooled=inactive for the following services on confctl:

{"cp5002.eqsin.wmnet": {"weight": 100, "pooled": "no"}, "tags": "dc=eqsin,cluster=cache_upload,service=ats-be"}
{"cp5002.eqsin.wmnet": {"weight": 1, "pooled": "no"}, "tags": "dc=eqsin,cluster=cache_upload,service=ats-tls"}
{"cp5002.eqsin.wmnet": {"weight": 1, "pooled": "no"}, "tags": "dc=eqsin,cluster=cache_upload,service=varnish-fe"}

    • Disabled Puppet
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Services in confctl are not automatically pooled, to restore the previous state you have to run the following commands:

sudo confctl select 'dc=eqsin,cluster=cache_upload,service=ats-be' set/pooled=no
sudo confctl select 'dc=eqsin,cluster=cache_upload,service=ats-tls' set/pooled=no
sudo confctl select 'dc=eqsin,cluster=cache_upload,service=varnish-fe' set/pooled=no

    • The reimage failed, see the cookbook logs for the details
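
For reference, the restore commands in the failed run above follow mechanically from the confctl state objects the cookbook printed. The following is a minimal sketch of that mapping, assuming the state lines are valid JSON exactly as logged; `restore_commands` is a hypothetical helper, not part of the cookbook or of confctl, and it only builds command strings without executing anything.

```python
import json

# State lines copied verbatim from the cookbook output in this task.
STATE_LINES = [
    '{"cp5002.eqsin.wmnet": {"weight": 100, "pooled": "no"}, "tags": "dc=eqsin,cluster=cache_upload,service=ats-be"}',
    '{"cp5002.eqsin.wmnet": {"weight": 1, "pooled": "no"}, "tags": "dc=eqsin,cluster=cache_upload,service=ats-tls"}',
    '{"cp5002.eqsin.wmnet": {"weight": 1, "pooled": "no"}, "tags": "dc=eqsin,cluster=cache_upload,service=varnish-fe"}',
]


def restore_commands(lines):
    """Build one command string per recorded state line (hypothetical helper)."""
    commands = []
    for line in lines:
        record = json.loads(line)
        # "tags" holds the selector; the one remaining key is the host name.
        tags = record.pop("tags")
        (state,) = record.values()
        # Same shape as the commands printed in the log: selector, then pooled value.
        commands.append(f"sudo confctl select '{tags}' set/pooled={state['pooled']}")
    return commands


for command in restore_commands(STATE_LINES):
    print(command)
```

The sketch reuses the recorded `pooled` value as-is, which is why every generated command ends in `set/pooled=no`, matching the commands listed by the cookbook.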

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster executed with errors:

  • cp5002 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster executed with errors:

  • cp5002 (FAIL)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • The reimage failed, see the cookbook logs for the details

Cookbook cookbooks.sre.hosts.reimage was started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster

Cookbook cookbooks.sre.hosts.reimage started by sukhe@cumin2002 for host cp5002.eqsin.wmnet with OS buster completed:

  • cp5002 (WARN)
    • Removed from Puppet and PuppetDB if present
    • Deleted any existing Puppet certificate
    • Removed from Debmonitor if present
    • Forced PXE for next reboot
    • Host rebooted via IPMI
    • Host up (Debian installer)
    • Host up (new fresh buster OS)
    • Generated Puppet certificate
    • Signed new Puppet certificate
    • Run Puppet in NOOP mode to populate exported resources in PuppetDB
    • Found Nagios_host resource for this host in PuppetDB
    • Downtimed the new host on Icinga/Alertmanager
    • First Puppet run completed and logged in /var/log/spicerack/sre/hosts/reimage/202205041444_sukhe_1735192_cp5002.out
    • Checked BIOS boot parameters are back to normal
    • configmaster.wikimedia.org updated with the host new SSH public key for wmf-update-known-hosts-production
    • Rebooted
    • Automatic Puppet run was successful
    • Forced a re-check of all Icinga services for the host
    • Icinga status is not optimal, downtime not removed
    • Updated Netbox data from PuppetDB

Vgutierrez claimed this task.