
ops-monitoring-bot (Operations Monitoring Bot)
UserBot

User Details

User Since
Aug 12 2016, 1:45 PM (189 w, 1 d)
Roles
Bot
Availability
Available
LDAP User
Unknown
MediaWiki User
Unknown

Bot managed by Operations for automated interaction with Phabricator from monitoring tools.

Recent Activity

Thu, Mar 26

ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2042.codfw.wmnet']
Thu, Mar 26, 9:41 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2041.codfw.wmnet']
Thu, Mar 26, 9:37 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2042.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003262114_pt1979_12244_cp2042_codfw_wmnet.log.

Thu, Mar 26, 9:15 PM · Patch-For-Review, Operations, Traffic, ops-codfw
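
The log paths in these launch comments appear to follow the pattern `<YYYYMMDDHHMM>_<user>_<number>[_<host with dots written as underscores>].log`. That reading is inferred only from the paths shown in this feed, not from wmf-auto-reimage documentation, and the meaning of the numeric field is not stated here. A minimal Python sketch of splitting such a path under those assumptions (`ReimageLog`, `parse_log_path` and `run_id` are hypothetical names, and user names are assumed to contain no underscore):

```python
import re
from datetime import datetime
from pathlib import PurePosixPath
from typing import NamedTuple, Optional


class ReimageLog(NamedTuple):
    started: datetime      # launch time encoded in the file name
    user: str              # account that launched the script
    run_id: int            # numeric field; its meaning is an assumption
    host: Optional[str]    # target host, or None when the name omits it


# <12-digit timestamp>_<user>_<digits>[_<host_with_underscores>]
_NAME = re.compile(r"^(\d{12})_([^_]+)_(\d+)(?:_(.+))?$")


def parse_log_path(path: str) -> ReimageLog:
    """Split a wmf-auto-reimage log path into its apparent fields."""
    stem = PurePosixPath(path).stem  # file name without the ".log" suffix
    match = _NAME.match(stem)
    if match is None:
        raise ValueError(f"unrecognised log name: {path!r}")
    stamp, user, run_id, host = match.groups()
    return ReimageLog(
        started=datetime.strptime(stamp, "%Y%m%d%H%M"),
        user=user,
        run_id=int(run_id),
        # Host names in these paths use "_" where the FQDN has ".".
        host=host.replace("_", ".") if host else None,
    )
```

For the path above, the parsed timestamp is 21:14 on 26 March 2020, one minute before the comment itself was posted.
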
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2041.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003262110_pt1979_10526_cp2041_codfw_wmnet.log.

Thu, Mar 26, 9:11 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2040.codfw.wmnet']
Thu, Mar 26, 7:45 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2039.codfw.wmnet']
Thu, Mar 26, 7:44 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2040.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261920_pt1979_29122_cp2040_codfw_wmnet.log.

Thu, Mar 26, 7:20 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2039.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261920_pt1979_29089_cp2039_codfw_wmnet.log.

Thu, Mar 26, 7:20 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2038.codfw.wmnet']
Thu, Mar 26, 7:20 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2037.codfw.wmnet']
Thu, Mar 26, 7:10 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2038.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261857_pt1979_23424_cp2038_codfw_wmnet.log.

Thu, Mar 26, 6:57 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2037.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261847_pt1979_22587_cp2037_codfw_wmnet.log.

Thu, Mar 26, 6:47 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2036.codfw.wmnet']
Thu, Mar 26, 4:58 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2035.codfw.wmnet']
Thu, Mar 26, 4:49 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2036.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261634_pt1979_2441_cp2036_codfw_wmnet.log.

Thu, Mar 26, 4:34 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2035.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261625_pt1979_1523_cp2035_codfw_wmnet.log.

Thu, Mar 26, 4:25 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2034.codfw.wmnet']
Thu, Mar 26, 4:24 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2033.codfw.wmnet']
Thu, Mar 26, 4:17 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2034.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261558_pt1979_27384_cp2034_codfw_wmnet.log.

Thu, Mar 26, 3:58 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2032.codfw.wmnet']
Thu, Mar 26, 3:58 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2033.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261553_pt1979_26852_cp2033_codfw_wmnet.log.

Thu, Mar 26, 3:54 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2031.codfw.wmnet']
Thu, Mar 26, 3:53 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2032.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261532_pt1979_21383_cp2032_codfw_wmnet.log.

Thu, Mar 26, 3:32 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2031.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261530_pt1979_21215_cp2031_codfw_wmnet.log.

Thu, Mar 26, 3:30 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2030.codfw.wmnet']
Thu, Mar 26, 3:25 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2029.codfw.wmnet']
Thu, Mar 26, 3:19 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2030.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261501_pt1979_14691_cp2030_codfw_wmnet.log.

Thu, Mar 26, 3:01 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2029.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261456_pt1979_14283_cp2029_codfw_wmnet.log.

Thu, Mar 26, 2:56 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2028.codfw.wmnet']
Thu, Mar 26, 2:56 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2028.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261431_pt1979_9983_cp2028_codfw_wmnet.log.

Thu, Mar 26, 2:31 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Completed auto-reimage of hosts:

['cp2027.codfw.wmnet']
Thu, Mar 26, 2:30 PM · Patch-For-Review, Operations, Traffic, ops-codfw
ops-monitoring-bot added a comment to T247340: (Need by: TBD) rack/setup/install cp202[7-9], cp203[0-9], cp204[0-2].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

cp2027.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003261358_pt1979_4873_cp2027_codfw_wmnet.log.

Thu, Mar 26, 1:58 PM · Patch-For-Review, Operations, Traffic, ops-codfw

Wed, Mar 25

ops-monitoring-bot added a comment to T246604: Install 1 buster+10.4 host per section.

Completed auto-reimage of hosts:

['db2115.codfw.wmnet']
Wed, Mar 25, 2:13 PM · DBA
ops-monitoring-bot added a comment to T246604: Install 1 buster+10.4 host per section.

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db2115.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202003251336_marostegui_119447.log.

Wed, Mar 25, 1:36 PM · DBA
ops-monitoring-bot added a comment to T246604: Install 1 buster+10.4 host per section.

Completed auto-reimage of hosts:

['db2115.codfw.wmnet']
Wed, Mar 25, 1:04 PM · DBA
ops-monitoring-bot added a comment to T246604: Install 1 buster+10.4 host per section.

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db2115.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202003251247_marostegui_113904.log.

Wed, Mar 25, 12:47 PM · DBA
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1250-1253].eqiad.wmnet

  • mw1250.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1251.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1252.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1253.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Wed, Mar 25, 11:39 AM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1232-1235].eqiad.wmnet

  • mw1232.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1233.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1234.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1235.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Wed, Mar 25, 11:33 AM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 4 host(s) and their services with reason: decom

mw[1250-1253].eqiad.wmnet
Wed, Mar 25, 11:22 AM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 4 host(s) and their services with reason: decom

mw[1232-1235].eqiad.wmnet
Wed, Mar 25, 11:21 AM · Patch-For-Review, serviceops, Operations
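
The host expression in the comment above, `mw[1232-1235].eqiad.wmnet`, is a bracketed numeric range that stands for four hosts, matching the "4 host(s)" in the message. A minimal Python sketch of expanding that notation, assuming a single `[low-high]` range per expression as seen in this feed (`expand_hosts` is a hypothetical helper, not the parser the tooling itself uses):

```python
import re
from typing import List

# prefix[low-high]suffix, with exactly one bracketed numeric range
_RANGE = re.compile(
    r"^(?P<prefix>[^\[\]]*)\[(?P<lo>\d+)-(?P<hi>\d+)\](?P<suffix>[^\[\]]*)$"
)


def expand_hosts(expression: str) -> List[str]:
    """Expand 'mw[1232-1235].eqiad.wmnet' into individual host names."""
    match = _RANGE.match(expression)
    if match is None:
        # No bracketed range: treat the expression as a single host.
        return [expression]
    lo, hi = int(match["lo"]), int(match["hi"])
    if lo > hi:
        raise ValueError(f"descending range in {expression!r}")
    width = len(match["lo"])  # keep any zero padding of the lower bound
    prefix, suffix = match["prefix"], match["suffix"]
    return [f"{prefix}{n:0{width}d}{suffix}" for n in range(lo, hi + 1)]
```

Task titles such as `ganeti20[19-24]` use the same form, so `expand_hosts("ganeti20[19-24]")` yields the six names from `ganeti2019` to `ganeti2024`.
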
ops-monitoring-bot added a comment to T246604: Install 1 buster+10.4 host per section.

Completed auto-reimage of hosts:

['db1137.eqiad.wmnet']
Wed, Mar 25, 8:59 AM · DBA
ops-monitoring-bot added a comment to T246604: Install 1 buster+10.4 host per section.

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db1137.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202003250839_marostegui_81114.log.

Wed, Mar 25, 8:40 AM · DBA

Mon, Mar 23

ops-monitoring-bot added a comment to T149418: Deploy gtid_domain_id flag in our mysql hosts.

Completed auto-reimage of hosts:

['db1077.eqiad.wmnet']
Mon, Mar 23, 12:34 PM · Patch-For-Review, DBA
ops-monitoring-bot added a comment to T149418: Deploy gtid_domain_id flag in our mysql hosts.

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['db1077.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202003231208_marostegui_34283.log.

Mon, Mar 23, 12:08 PM · Patch-For-Review, DBA

Fri, Mar 20

ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1248-1249].eqiad.wmnet

  • mw1248.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1249.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Fri, Mar 20, 9:06 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1244-1247].eqiad.wmnet

  • mw1244.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1245.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1246.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1247.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Fri, Mar 20, 9:01 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1230-1231].eqiad.wmnet

  • mw1230.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1231.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Fri, Mar 20, 8:57 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1227-1229].eqiad.wmnet

  • mw1227.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1228.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1229.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Fri, Mar 20, 8:55 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 6 host(s) and their services with reason: decom

mw[1244-1249].eqiad.wmnet
Fri, Mar 20, 8:40 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 2 host(s) and their services with reason: decom

mw[1230-1231].eqiad.wmnet
Fri, Mar 20, 8:40 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 3 host(s) and their services with reason: decom

mw[1227-1229].eqiad.wmnet
Fri, Mar 20, 8:40 PM · Patch-For-Review, serviceops, Operations

Wed, Mar 18

ops-monitoring-bot added a comment to T188544: compile/diff catalogs between puppetdb v2 (production) and puppetdb v4.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: elnath.codfw.wmnet

  • elnath.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.codfw.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
Wed, Mar 18, 7:21 PM · Puppet, Operations
ops-monitoring-bot added a comment to T247787: investigate pc1008 for possible hardware issues / performance under high load.

Completed auto-reimage of hosts:

['pc1008.eqiad.wmnet']
Wed, Mar 18, 8:16 AM · Wikimedia-Incident, DBA, Operations
ops-monitoring-bot created T247920: Degraded RAID on pc1008.
Wed, Mar 18, 7:25 AM · ops-eqiad, Operations
ops-monitoring-bot added a comment to T247787: investigate pc1008 for possible hardware issues / performance under high load.

Script wmf-auto-reimage was launched by marostegui on cumin1001.eqiad.wmnet for hosts:

['pc1008.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202003180654_marostegui_208849.log.

Wed, Mar 18, 6:54 AM · Wikimedia-Incident, DBA, Operations

Tue, Mar 17

ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw1240.eqiad.wmnet

  • mw1240.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Tue, Mar 17, 6:45 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1240-1243].eqiad.wmnet

  • mw1240.eqiad.wmnet (FAIL)
    • Host steps raised exception: Empty Management Password
  • mw1241.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1242.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1243.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

ERROR: some step on some host failed, check the bolded items above

Tue, Mar 17, 6:41 PM · Patch-For-Review, serviceops, Operations
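
The bolding that the error line refers to is lost in plain text, but each host line in the report still carries a `(PASS)` or `(FAIL)` marker. A minimal Python sketch of pulling the failed hosts out of a report pasted as text, assuming the bullet layout shown above (`failed_hosts` is a hypothetical helper, not part of the decommission cookbook):

```python
import re
from typing import List

# A host line looks like: "  • mw1240.eqiad.wmnet (FAIL)"
_HOST_LINE = re.compile(r"^\s*•\s+(\S+)\s+\((PASS|FAIL)\)\s*$")


def failed_hosts(report: str) -> List[str]:
    """Return the hosts whose summary line is marked FAIL, in report order."""
    failed = []
    for line in report.splitlines():
        match = _HOST_LINE.match(line)
        if match and match.group(2) == "FAIL":
            failed.append(match.group(1))
    return failed
```

Applied to the report above, this would return only `mw1240.eqiad.wmnet`, the host that was re-run on its own four minutes later.
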
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1238-1239].eqiad.wmnet

  • mw1238.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1239.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Tue, Mar 17, 6:38 PM · Patch-For-Review, serviceops, Operations

Mon, Mar 16

ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw[1222-1226].eqiad.wmnet

  • mw1222.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1223.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1224.eqiad.wmnet (FAIL)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Failed to wipe bootloaders, manual intervention required to make it unbootable: Cumin execution failed (exit_code=2)
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1225.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
  • mw1226.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB

ERROR: some step on some host failed, check the bolded items above

Mon, Mar 16, 8:46 PM · Patch-For-Review, serviceops, Operations
ops-monitoring-bot added a comment to T247780: decom old appservers in eqiad.

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw1221.eqiad.wmnet

  • mw1221.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Mon, Mar 16, 8:37 PM · Patch-For-Review, serviceops, Operations

Fri, Mar 13

ops-monitoring-bot added a comment to T247018: codfw: decom at least 15 appservers(mw2158 through mw2172) in codfw rack C3 to make room for new servers.

Icinga downtime for 12:00:00 set by dzahn@cumin1001 on 15 host(s) and their services with reason: decom

mw[2158-2172].codfw.wmnet
Fri, Mar 13, 12:04 AM · Patch-For-Review, Operations, ops-codfw, serviceops

Thu, Mar 12

ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Completed auto-reimage of hosts:

['ganeti2024.codfw.wmnet']
Thu, Mar 12, 7:25 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

ganeti2024.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121901_pt1979_17369_ganeti2024_codfw_wmnet.log.

Thu, Mar 12, 7:02 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Completed auto-reimage of hosts:

['ganeti2023.codfw.wmnet']
Thu, Mar 12, 6:58 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Completed auto-reimage of hosts:

['ganeti2022.codfw.wmnet']
Thu, Mar 12, 6:47 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

ganeti2023.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121834_pt1979_10355_ganeti2023_codfw_wmnet.log.

Thu, Mar 12, 6:35 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

ganeti2022.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121824_pt1979_9092_ganeti2022_codfw_wmnet.log.

Thu, Mar 12, 6:24 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Completed auto-reimage of hosts:

['ganeti2021.codfw.wmnet']
Thu, Mar 12, 5:40 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

ganeti2021.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121716_pt1979_30754_ganeti2021_codfw_wmnet.log.

Thu, Mar 12, 5:16 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Completed auto-reimage of hosts:

['ganeti2020.codfw.wmnet']
Thu, Mar 12, 4:47 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

ganeti2020.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121621_pt1979_21080_ganeti2020_codfw_wmnet.log.

Thu, Mar 12, 4:21 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Completed auto-reimage of hosts:

['ganeti2019.codfw.wmnet']
Thu, Mar 12, 4:21 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T245567: (Need by: 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet..

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121611_cmjohnson_213880_htmldumper1001_eqiad_wmnet.log.

Thu, Mar 12, 4:11 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

ganeti2019.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121555_pt1979_15261_ganeti2019_codfw_wmnet.log.

Thu, Mar 12, 3:55 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Completed auto-reimage of hosts:

['ganeti2019.codfw.wmnet']
Thu, Mar 12, 3:47 PM · ops-codfw, Operations, DC-Ops
ops-monitoring-bot added a comment to T245567: (Need by: 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet..

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121523_cmjohnson_205338_htmldumper1001_eqiad_wmnet.log.

Thu, Mar 12, 3:23 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T245567: (Need by: 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet..

Completed auto-reimage of hosts:

['htmldumper1001.eqiad.wmnet']
Thu, Mar 12, 3:20 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T245567: (Need by: 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet..

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121513_cmjohnson_203633_htmldumper1001_eqiad_wmnet.log.

Thu, Mar 12, 3:13 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T244783: (Need by: TBD) rack/setup/install ganeti20[19-24].

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

ganeti2019.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003121440_pt1979_3208_ganeti2019_codfw_wmnet.log.

Thu, Mar 12, 2:40 PM · ops-codfw, Operations, DC-Ops

Wed, Mar 11

ops-monitoring-bot added a comment to T242992: decom grafana1001.

cookbooks.sre.hosts.decommission executed by cdanis@cumin2001 for hosts: grafana1001.eqiad.wmnet

  • grafana1001.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
Wed, Mar 11, 8:53 PM · observability
ops-monitoring-bot added a comment to T245754: (Need by: TBD) setup/install sretest100[12].eqiad.wmnet.

Completed auto-reimage of hosts:

['sretest1002.eqiad.wmnet']
Wed, Mar 11, 8:12 PM · ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T245754: (Need by: TBD) setup/install sretest100[12].eqiad.wmnet.

Completed auto-reimage of hosts:

['sretest1001.eqiad.wmnet']
Wed, Mar 11, 8:07 PM · ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T245754: (Need by: TBD) setup/install sretest100[12].eqiad.wmnet.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

sretest1002.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111953_cmjohnson_7989_sretest1002_eqiad_wmnet.log.

Wed, Mar 11, 7:53 PM · ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T245754: (Need by: TBD) setup/install sretest100[12].eqiad.wmnet.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

sretest1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111947_cmjohnson_7033_sretest1001_eqiad_wmnet.log.

Wed, Mar 11, 7:47 PM · ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T245567: (Need by: 2020-03-01) rack/setup/install htmldumper1001.eqiad.wmnet..

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

htmldumper1001.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111928_cmjohnson_3633_htmldumper1001_eqiad_wmnet.log.

Wed, Mar 11, 7:28 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Completed auto-reimage of hosts:

['stat1008.eqiad.wmnet']
Wed, Mar 11, 6:41 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

stat1008.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111815_elukey_252772_stat1008_eqiad_wmnet.log.

Wed, Mar 11, 6:15 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

stat1008.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111357_cmjohnson_207409_stat1008_eqiad_wmnet.log.

Wed, Mar 11, 1:57 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Completed auto-reimage of hosts:

['stat1008.eqiad.wmnet']
Wed, Mar 11, 1:02 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

stat1008.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111244_elukey_194513_stat1008_eqiad_wmnet.log.

Wed, Mar 11, 12:45 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Completed auto-reimage of hosts:

['stat1008.eqiad.wmnet']
Wed, Mar 11, 12:44 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

stat1008.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111244_elukey_194481_stat1008_eqiad_wmnet.log.

Wed, Mar 11, 12:44 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

stat1008.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003111131_cmjohnson_179724_stat1008_eqiad_wmnet.log.

Wed, Mar 11, 11:31 AM · Operations, ops-eqiad, DC-Ops

Tue, Mar 10

ops-monitoring-bot added a comment to T240881: (Need by: 2020-03-06) rack/setup/install logstash102[6-9].eqiad.wmnet.

Completed auto-reimage of hosts:

['logstash1029.eqiad.wmnet']
Tue, Mar 10, 11:49 PM · Operations, Wikimedia-Logstash
ops-monitoring-bot added a comment to T240881: (Need by: 2020-03-06) rack/setup/install logstash102[6-9].eqiad.wmnet.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

logstash1029.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003102322_cmjohnson_59915_logstash1029_eqiad_wmnet.log.

Tue, Mar 10, 11:23 PM · Operations, Wikimedia-Logstash
ops-monitoring-bot added a comment to T247021: move all 86 new codfw appservers into production (mw2[291-2377].codfw.wmnet).

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 6 host(s) and their services with reason: new_install

mw[2366,2368,2370,2372,2374,2376].codfw.wmnet
Tue, Mar 10, 11:11 PM · serviceops, Operations
ops-monitoring-bot added a comment to T247021: move all 86 new codfw appservers into production (mw2[291-2377].codfw.wmnet).

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 27 host(s) and their services with reason: new_install

mw[2350-2376].codfw.wmnet
Tue, Mar 10, 9:29 PM · serviceops, Operations
ops-monitoring-bot added a comment to T247021: move all 86 new codfw appservers into production (mw2[291-2377].codfw.wmnet).

Icinga downtime for 2:00:00 set by dzahn@cumin1001 on 27 host(s) and their services with reason: new_install

mw[2350-2376].codfw.wmnet
Tue, Mar 10, 8:29 PM · serviceops, Operations
ops-monitoring-bot added a comment to T246472: (Need by: ASAP) rack/setup/install stat1008.

Completed auto-reimage of hosts:

['stat1008.eqiad.wmnet']
Tue, Mar 10, 7:12 PM · Operations, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T246352: (Need by: TBD) rack/setup/install wdqs101[123].eqiad.wmnet.

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

stat1008.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202003101907_cmjohnson_17764_stat1008_eqiad_wmnet.log.

Tue, Mar 10, 7:08 PM · ops-eqiad, Operations, DC-Ops