Tue, Sep 29

ops-monitoring-bot added a comment to T261724: cloudgw: evaluate / validate setup in codfw1dev.

Completed auto-reimage of hosts:

['labtestvirt2003.codfw.wmnet']
Tue, Sep 29, 11:35 AM · Patch-For-Review, cloud-services-team (Kanban)
ops-monitoring-bot added a comment to T261724: cloudgw: evaluate / validate setup in codfw1dev.

Script wmf-auto-reimage was launched by aborrero on cumin2001.codfw.wmnet for hosts:

labtestvirt2003.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009291118_aborrero_29762_labtestvirt2003_codfw_wmnet.log.

Tue, Sep 29, 11:18 AM · Patch-For-Review, cloud-services-team (Kanban)
ops-monitoring-bot added a comment to T261512: Provision new RESTBase/Cassandra cluster nodes: restbase1028, restbase1029, restbase1030.

Script wmf-auto-reimage was launched by hnowlan on cumin1001.eqiad.wmnet for hosts:

['restbase1028.eqiad.wmnet', 'restbase1029.eqiad.wmnet', 'restbase1030.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009291024_hnowlan_21992.log.

Tue, Sep 29, 10:25 AM · RESTBase-Cassandra, Platform Engineering, Cassandra
ops-monitoring-bot added a comment to T261512: Provision new RESTBase/Cassandra cluster nodes: restbase1028, restbase1029, restbase1030.

Completed auto-reimage of hosts:

['restbase1028.eqiad.wmnet', 'restbase1029.eqiad.wmnet', 'restbase1030.eqiad.wmnet']
Tue, Sep 29, 9:58 AM · RESTBase-Cassandra, Platform Engineering, Cassandra
ops-monitoring-bot added a comment to T261512: Provision new RESTBase/Cassandra cluster nodes: restbase1028, restbase1029, restbase1030.

Script wmf-auto-reimage was launched by hnowlan on cumin1001.eqiad.wmnet for hosts:

['restbase1028.eqiad.wmnet', 'restbase1029.eqiad.wmnet', 'restbase1030.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009290943_hnowlan_14117.log.

Tue, Sep 29, 9:44 AM · RESTBase-Cassandra, Platform Engineering, Cassandra
ops-monitoring-bot created T264062: Degraded RAID on es2026.
Tue, Sep 29, 5:29 AM · Operations, ops-codfw
ops-monitoring-bot added a comment to T263740: decommission es2013.codfw.wmnet.

cookbooks.sre.hosts.decommission executed by marostegui@cumin1001 for hosts: es2013.codfw.wmnet

  • es2013.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Tue, Sep 29, 5:06 AM · Operations, DC-Ops, ops-codfw, decommission-hardware

Mon, Sep 28

ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1113.eqiad.wmnet']
Mon, Sep 28, 11:18 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1111.eqiad.wmnet']
Mon, Sep 28, 11:17 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

an-worker1113.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009282303_robh_28131_an-worker1113_eqiad_wmnet.log.

Mon, Sep 28, 11:04 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

an-worker1111.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009282256_robh_27287_an-worker1111_eqiad_wmnet.log.

Mon, Sep 28, 10:56 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1112.eqiad.wmnet']
Mon, Sep 28, 10:34 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['an-worker1111.eqiad.wmnet', 'an-worker1112.eqiad.wmnet', 'an-worker1113.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009282146_robh_14691.log.

Mon, Sep 28, 9:47 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1109.eqiad.wmnet', 'an-worker1108.eqiad.wmnet']
Mon, Sep 28, 9:35 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['an-worker1108.eqiad.wmnet', 'an-worker1109.eqiad.wmnet', 'an-worker1110.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009282109_robh_4742.log.

Mon, Sep 28, 9:09 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1105.eqiad.wmnet']
Mon, Sep 28, 9:05 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1106.eqiad.wmnet', 'an-worker1107.eqiad.wmnet']
Mon, Sep 28, 8:57 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

an-worker1105.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009282039_robh_26679_an-worker1105_eqiad_wmnet.log.

Mon, Sep 28, 8:40 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['an-worker1105.eqiad.wmnet', 'an-worker1106.eqiad.wmnet', 'an-worker1107.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009282033_robh_24408.log.

Mon, Sep 28, 8:34 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1105.eqiad.wmnet']
Mon, Sep 28, 7:36 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1103.eqiad.wmnet']
Mon, Sep 28, 7:35 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

an-worker1105.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009281921_robh_11186_an-worker1105_eqiad_wmnet.log.

Mon, Sep 28, 7:21 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

['an-worker1103.eqiad.wmnet', 'an-worker1104.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009281901_robh_4446.log.

Mon, Sep 28, 7:02 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1102.eqiad.wmnet']
Mon, Sep 28, 6:35 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

an-worker1102.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009281803_robh_24912_an-worker1102_eqiad_wmnet.log.

Mon, Sep 28, 6:03 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T260817: (Need By: 2020-09-15) rack/setup/install db1150 (see note on hostname).

Completed auto-reimage of hosts:

['db1150.eqiad.wmnet']
Mon, Sep 28, 5:42 PM · Operations, DBA, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T260817: (Need By: 2020-09-15) rack/setup/install db1150 (see note on hostname).

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1150.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009281719_cmjohnson_17565_db1150_eqiad_wmnet.log.

Mon, Sep 28, 5:20 PM · Operations, DBA, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T260817: (Need By: 2020-09-15) rack/setup/install db1150 (see note on hostname).

Completed auto-reimage of hosts:

['db1150.eqiad.wmnet']
Mon, Sep 28, 5:13 PM · Operations, DBA, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Completed auto-reimage of hosts:

['an-worker1102.eqiad.wmnet']
Mon, Sep 28, 5:08 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T260817: (Need By: 2020-09-15) rack/setup/install db1150 (see note on hostname).

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1150.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009281702_cmjohnson_13318_db1150_eqiad_wmnet.log.

Mon, Sep 28, 5:02 PM · Operations, DBA, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T260817: (Need By: 2020-09-15) rack/setup/install db1150 (see note on hostname).

Completed auto-reimage of hosts:

['db1150.eqiad.wmnet']
Mon, Sep 28, 4:59 PM · Operations, DBA, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T259071: (Need By: TBD) rack/setup/install an-worker11[02-17].

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

an-worker1102.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009281649_robh_11370_an-worker1102_eqiad_wmnet.log.

Mon, Sep 28, 4:49 PM · Analytics-Radar, ops-eqiad, DC-Ops, Operations
ops-monitoring-bot added a comment to T261512: Provision new RESTBase/Cassandra cluster nodes: restbase1028, restbase1029, restbase1030.

Completed auto-reimage of hosts:

['restbase1028.eqiad.wmnet', 'restbase1029.eqiad.wmnet', 'restbase1030.eqiad.wmnet']
Mon, Sep 28, 4:35 PM · RESTBase-Cassandra, Platform Engineering, Cassandra
ops-monitoring-bot added a comment to T261512: Provision new RESTBase/Cassandra cluster nodes: restbase1028, restbase1029, restbase1030.

Script wmf-auto-reimage was launched by hnowlan on cumin1001.eqiad.wmnet for hosts:

['restbase1028.eqiad.wmnet', 'restbase1029.eqiad.wmnet', 'restbase1030.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009281608_hnowlan_381.log.

Mon, Sep 28, 4:08 PM · RESTBase-Cassandra, Platform Engineering, Cassandra
ops-monitoring-bot added a comment to T260817: (Need By: 2020-09-15) rack/setup/install db1150 (see note on hostname).

Script wmf-auto-reimage was launched by cmjohnson on cumin1001.eqiad.wmnet for hosts:

db1150.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009281559_cmjohnson_30136_db1150_eqiad_wmnet.log.

Mon, Sep 28, 3:59 PM · Operations, DBA, ops-eqiad, DC-Ops
ops-monitoring-bot added a comment to T254892: (Due By: 2020-07-11) rack/setup/install an-worker[1096-1101].

Completed auto-reimage of hosts:

['an-worker1099.eqiad.wmnet', 'an-worker1100.eqiad.wmnet']
Mon, Sep 28, 3:35 PM · Patch-For-Review, ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T254892: (Due By: 2020-07-11) rack/setup/install an-worker[1096-1101].

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

['an-worker1099.eqiad.wmnet', 'an-worker1100.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009281434_elukey_14594.log.

Mon, Sep 28, 2:34 PM · Patch-For-Review, ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T263993: Decommission mendelevium.

cookbooks.sre.hosts.decommission executed by akosiaris@cumin1001 for hosts: mendelevium.eqiad.wmnet

  • mendelevium.eqiad.wmnet (WARN)
    • Failed downtime host on Icinga (likely already removed)
    • Found Ganeti VM
    • VM shutdown
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
    • VM removed
    • Started forced sync of VMs in Ganeti cluster ganeti01.svc.eqiad.wmnet to Netbox
Mon, Sep 28, 1:03 PM · OTRS, Operations, vm-requests
ops-monitoring-bot added a comment to T255028: Move the stat1004-6-7 hosts to Debian Buster.

Completed auto-reimage of hosts:

['stat1007.eqiad.wmnet']
Mon, Sep 28, 10:01 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters
ops-monitoring-bot added a comment to T227485: Decommission analytics10[28-31,33-41].

cookbooks.sre.hosts.decommission executed by elukey@cumin1001 for hosts: analytics[1040-1041].eqiad.wmnet

  • analytics1040.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Mon, Sep 28, 9:00 AM · ops-eqiad, Analytics-Clusters, decommission-hardware, Operations
ops-monitoring-bot added a comment to T227485: Decommission analytics10[28-31,33-41].

cookbooks.sre.hosts.decommission executed by elukey@cumin1001 for hosts: analytics[1030-1031,1033-1039].eqiad.wmnet

  • analytics1030.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Mon, Sep 28, 8:53 AM · ops-eqiad, Analytics-Clusters, decommission-hardware, Operations
ops-monitoring-bot added a comment to T255028: Move the stat1004-6-7 hosts to Debian Buster.

Script wmf-auto-reimage was launched by klausman on cumin1001.eqiad.wmnet for hosts:

['stat1007.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009280844_klausman_784.log.

Mon, Sep 28, 8:44 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters
ops-monitoring-bot added a comment to T227485: Decommission analytics10[28-31,33-41].

cookbooks.sre.hosts.decommission executed by elukey@cumin1001 for hosts: analytics[1028-1029].eqiad.wmnet

  • analytics1028.eqiad.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Mon, Sep 28, 8:42 AM · ops-eqiad, Analytics-Clusters, decommission-hardware, Operations

Fri, Sep 25

ops-monitoring-bot created T263837: Degraded RAID on es2026.
Fri, Sep 25, 9:44 AM · Patch-For-Review, DBA, Operations, ops-codfw
ops-monitoring-bot added a comment to T260670: db2125 crashed - mgmt iface also not available.

Completed auto-reimage of hosts:

['db2125.codfw.wmnet']
Fri, Sep 25, 9:31 AM · Patch-For-Review, User-Kormat, ops-codfw, DBA, Operations
ops-monitoring-bot added a comment to T260670: db2125 crashed - mgmt iface also not available.

Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts:

['db2125.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009250904_kormat_3140.log.

Fri, Sep 25, 9:04 AM · Patch-For-Review, User-Kormat, ops-codfw, DBA, Operations
ops-monitoring-bot added a comment to T260670: db2125 crashed - mgmt iface also not available.

Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts:

['db2125.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009250845_kormat_17901.log.

Fri, Sep 25, 8:46 AM · Patch-For-Review, User-Kormat, ops-codfw, DBA, Operations
ops-monitoring-bot added a comment to T260670: db2125 crashed - mgmt iface also not available.

Script wmf-auto-reimage was launched by kormat on cumin1001.eqiad.wmnet for hosts:

['db2125.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009250824_kormat_28716.log.

Fri, Sep 25, 8:24 AM · Patch-For-Review, User-Kormat, ops-codfw, DBA, Operations

Thu, Sep 24

ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Completed auto-reimage of hosts:

['mw1360.eqiad.wmnet']
Thu, Sep 24, 2:59 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

mw1360.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009241439_robh_30270_mw1360_eqiad_wmnet.log.

Thu, Sep 24, 2:39 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Completed auto-reimage of hosts:

['mw1360.eqiad.wmnet']
Thu, Sep 24, 2:38 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

mw1360.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009241438_robh_29643_mw1360_eqiad_wmnet.log.

Thu, Sep 24, 2:38 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T263615: decommission es2018.codfw.wmnet.

cookbooks.sre.hosts.decommission executed by volans@cumin1001 for hosts: es2018.codfw.wmnet

  • es2018.codfw.wmnet (FAIL)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Unable to connect to the host, wipe of bootloaders will not be performed: Cumin execution failed (exit_code=2)
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Thu, Sep 24, 8:15 AM · Operations, ops-codfw, decommission-hardware
ops-monitoring-bot added a comment to T263615: decommission es2018.codfw.wmnet.

cookbooks.sre.hosts.decommission executed by volans@cumin1001 for hosts: es2018.codfw.wmnet

  • es2018.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Thu, Sep 24, 8:08 AM · Operations, ops-codfw, decommission-hardware
ops-monitoring-bot added a comment to T255028: Move the stat1004-6-7 hosts to Debian Buster.

Completed auto-reimage of hosts:

['stat1006.eqiad.wmnet']
Thu, Sep 24, 7:48 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters
ops-monitoring-bot added a comment to T255028: Move the stat1004-6-7 hosts to Debian Buster.

Script wmf-auto-reimage was launched by klausman on cumin1001.eqiad.wmnet for hosts:

['stat1006.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009240731_klausman_8593.log.

Thu, Sep 24, 7:32 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters
ops-monitoring-bot added a comment to T263613: decommission es2012.codfw.wmnet.

cookbooks.sre.hosts.decommission executed by marostegui@cumin1001 for hosts: es2012.codfw.wmnet

  • es2012.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Thu, Sep 24, 5:41 AM · DC-Ops, Operations, ops-codfw, decommission-hardware

Wed, Sep 23

ops-monitoring-bot added a comment to T257903: decom wtp2005 (was: wtp2005 hardware issue).

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: wtp2005.codfw.wmnet

  • wtp2005.codfw.wmnet (FAIL)
    • Failed downtime host on Icinga (likely already removed)
    • Found physical host
    • Skipped downtime management interface on Icinga (likely already removed)
    • Unable to connect to the host, wipe of bootloaders will not be performed: Cumin execution failed (exit_code=2)
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Wed, Sep 23, 8:52 PM · serviceops, Operations, ops-codfw
ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Completed auto-reimage of hosts:

['mw1360.eqiad.wmnet']
Wed, Sep 23, 4:27 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

mw1360.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009231616_robh_4519_mw1360_eqiad_wmnet.log.

Wed, Sep 23, 4:16 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Completed auto-reimage of hosts:

['mw1360.eqiad.wmnet']
Wed, Sep 23, 4:13 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T262151: mw1360's NIC is faulty.

Script wmf-auto-reimage was launched by robh on cumin1001.eqiad.wmnet for hosts:

mw1360.eqiad.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009231613_robh_1862_mw1360_eqiad_wmnet.log.

Wed, Sep 23, 4:13 PM · Operations, serviceops, ops-eqiad
ops-monitoring-bot added a comment to T262889: decommission es2014.codfw.wmnet.

cookbooks.sre.hosts.decommission executed by marostegui@cumin1001 for hosts: es2014.codfw.wmnet

  • es2014.codfw.wmnet (PASS)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Wiped bootloaders
    • Powered off
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Wed, Sep 23, 5:33 AM · Operations, DC-Ops, ops-codfw, decommission-hardware

Tue, Sep 22

ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2010.codfw.wmnet']
Tue, Sep 22, 9:57 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2010.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009222131_pt1979_25100_maps2010_codfw_wmnet.log.

Tue, Sep 22, 9:31 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2010.codfw.wmnet']
Tue, Sep 22, 9:31 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2010.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009222116_pt1979_23003_maps2010_codfw_wmnet.log.

Tue, Sep 22, 9:16 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2009.codfw.wmnet']
Tue, Sep 22, 8:52 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2009.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009222029_pt1979_12848_maps2009_codfw_wmnet.log.

Tue, Sep 22, 8:29 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2008.codfw.wmnet']
Tue, Sep 22, 7:38 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2008.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009221907_pt1979_29719_maps2008_codfw_wmnet.log.

Tue, Sep 22, 7:07 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T254892: (Due By: 2020-07-11) rack/setup/install an-worker[1096-1101].

Completed auto-reimage of hosts:

['an-worker1101.eqiad.wmnet']
Tue, Sep 22, 6:05 PM · Patch-For-Review, ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T254892: (Due By: 2020-07-11) rack/setup/install an-worker[1096-1101].

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

['an-worker1101.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009221702_elukey_30132.log.

Tue, Sep 22, 5:02 PM · Patch-For-Review, ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2007.codfw.wmnet']
Tue, Sep 22, 4:58 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2007.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009221654_pt1979_5446_maps2007_codfw_wmnet.log.

Tue, Sep 22, 4:54 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2007.codfw.wmnet']
Tue, Sep 22, 4:53 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T254892: (Due By: 2020-07-11) rack/setup/install an-worker[1096-1101].

Completed auto-reimage of hosts:

['an-worker1101.eqiad.wmnet']
Tue, Sep 22, 4:45 PM · Patch-For-Review, ops-eqiad, Operations, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2007.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009221637_pt1979_315_maps2007_codfw_wmnet.log.

Tue, Sep 22, 4:37 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2006.codfw.wmnet']
Tue, Sep 22, 4:24 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2006.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009221605_pt1979_26209_maps2006_codfw_wmnet.log.

Tue, Sep 22, 4:05 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2006.codfw.wmnet']
Tue, Sep 22, 3:54 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2006.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009221554_pt1979_24576_maps2006_codfw_wmnet.log.

Tue, Sep 22, 3:54 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T254892: (Due By: 2020-07-11) rack/setup/install an-worker[1096-1101].

Script wmf-auto-reimage was launched by elukey on cumin1001.eqiad.wmnet for hosts:

['an-worker1101.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009221541_elukey_30906.log.

Tue, Sep 22, 3:41 PM · Patch-For-Review, ops-eqiad, Operations, DC-Ops

Mon, Sep 21

ops-monitoring-bot added a comment to T263065: decom mw2256 (was: mw2256 - CPU/board hardware issue).

cookbooks.sre.hosts.decommission executed by dzahn@cumin1001 for hosts: mw2256.codfw.wmnet

  • mw2256.codfw.wmnet (FAIL)
    • Downtimed host on Icinga
    • Found physical host
    • Downtimed management interface on Icinga
    • Unable to connect to the host, wipe of bootloaders will not be performed: Cumin execution failed (exit_code=2)
    • Failed to power off, manual intervention required: Remote IPMI for mw2256.mgmt.codfw.wmnet failed (exit=1): b''
    • Set Netbox status to Decommissioning and deleted all non-mgmt interfaces and related IPs
    • Removed from DebMonitor
    • Removed from Puppet master and PuppetDB
Mon, Sep 21, 9:28 PM · ops-codfw, serviceops, Operations, DC-Ops
ops-monitoring-bot created T263484: Degraded RAID on ms-be2019.
Mon, Sep 21, 5:23 PM · Operations, ops-codfw

Fri, Sep 18

ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2005.codfw.wmnet']
Fri, Sep 18, 5:13 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2005.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009181709_pt1979_27191_maps2005_codfw_wmnet.log.

Fri, Sep 18, 5:09 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2005.codfw.wmnet']
Fri, Sep 18, 4:24 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2005.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009181603_pt1979_14734_maps2005_codfw_wmnet.log.

Fri, Sep 18, 4:03 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2005.codfw.wmnet']
Fri, Sep 18, 4:02 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2005.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009181545_pt1979_12312_maps2005_codfw_wmnet.log.

Fri, Sep 18, 3:46 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2005.codfw.wmnet']
Fri, Sep 18, 3:43 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2005.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009181529_pt1979_9410_maps2005_codfw_wmnet.log.

Fri, Sep 18, 3:30 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Completed auto-reimage of hosts:

['maps2005.codfw.wmnet']
Fri, Sep 18, 2:30 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T260271: (Need By: TBD) rack/setup/install maps20[05-10].codfw.wmnet.

Script wmf-auto-reimage was launched by pt1979 on cumin2001.codfw.wmnet for hosts:

maps2005.codfw.wmnet

The log can be found in /var/log/wmf-auto-reimage/202009181415_pt1979_25814_maps2005_codfw_wmnet.log.

Fri, Sep 18, 2:15 PM · Operations, Maps, ops-codfw, DC-Ops
ops-monitoring-bot added a comment to T263244: Reimage and reclone db2125.

Completed auto-reimage of hosts:

['db2125.codfw.wmnet']
Fri, Sep 18, 1:07 PM · User-Kormat, DBA
ops-monitoring-bot added a comment to T263244: Reimage and reclone db2125.

Script wmf-auto-reimage was launched by kormat on cumin2001.codfw.wmnet for hosts:

['db2125.codfw.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009181242_kormat_3990.log.

Fri, Sep 18, 12:43 PM · User-Kormat, DBA
ops-monitoring-bot added a comment to T255028: Move the stat1004-6-7 hosts to Debian Buster.

Completed auto-reimage of hosts:

['stat1004.eqiad.wmnet']
Fri, Sep 18, 9:21 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters
ops-monitoring-bot added a comment to T255028: Move the stat1004-6-7 hosts to Debian Buster.

Script wmf-auto-reimage was launched by klausman on cumin1001.eqiad.wmnet for hosts:

['stat1004.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009180859_klausman_611.log.

Fri, Sep 18, 8:59 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters
ops-monitoring-bot added a comment to T255028: Move the stat1004-6-7 hosts to Debian Buster.

Script wmf-auto-reimage was launched by klausman on cumin1001.eqiad.wmnet for hosts:

['stat1004.eqiad.wmnet']

The log can be found in /var/log/wmf-auto-reimage/202009180821_klausman_29009.log.

Fri, Sep 18, 8:22 AM · Patch-For-Review, Analytics-Kanban, Analytics-Clusters