wiki_willy
User

User Details

User Since
Apr 16 2019, 9:00 PM (126 w, 3 d)
Availability
Available
LDAP User
Wpao
MediaWiki User
Unknown

Recent Activity

Wed, Sep 15

wiki_willy moved T290588: decommission maps2001.codfw.wmnet, maps2002.codfw.wmnet, maps2003.codfw.wmnet, maps2004.codfw.wmnet from Backlog to pending onsite steps (codfw) on the decommission-hardware board.
Wed, Sep 15, 9:41 PM · SRE, ops-codfw, Platform Team Workboards (Platform Engineering Reliability), decommission-hardware
wiki_willy reassigned T290588: decommission maps2001.codfw.wmnet, maps2002.codfw.wmnet, maps2003.codfw.wmnet, maps2004.codfw.wmnet from hnowlan to Papaul.
Wed, Sep 15, 9:40 PM · SRE, ops-codfw, Platform Team Workboards (Platform Engineering Reliability), decommission-hardware
wiki_willy added a comment to T290318: hw troubleshooting: megaraid reset due to fatal error for labstore1005.eqiad.wmnet.

Just a quick update - the replacement part was shipped out on Monday, and should be arriving today. (might be in the loading dock already)

Wed, Sep 15, 6:03 PM · SRE, ops-eqiad, DC-Ops

Tue, Sep 14

wiki_willy assigned T290708: mw2280 unresponsive to powercycle and hardreset to Papaul.

Just a heads up - Papaul is on paternity leave for a couple weeks, but let me know if this becomes urgent and we need to involve smart hands on anything. Thanks, Willy

Tue, Sep 14, 11:37 PM · SRE, ops-codfw

Mon, Sep 13

wiki_willy renamed T290899: Q1: eqiad: (32) PDUs for expansion from Q2: eqiad: (32) PDUs for expansion to Q1: eqiad: (32) PDUs for expansion.
Mon, Sep 13, 10:38 PM · SRE, ops-eqiad, DC-Ops

Wed, Sep 8

wiki_willy added a subtask for T290318: hw troubleshooting: megaraid reset due to fatal error for labstore1005.eqiad.wmnet: Unknown Object (Task).
Wed, Sep 8, 6:00 PM · SRE, ops-eqiad, DC-Ops

Tue, Sep 7

wiki_willy assigned T290416: Degraded RAID on ms-be1062 to Cmjohnson.

In warranty thru 2023-10-27

Tue, Sep 7, 10:10 PM · SRE, ops-eqiad
wiki_willy assigned T290442: Degraded RAID on ms-be1051 to Cmjohnson.

In warranty thru 2022-08-07

Tue, Sep 7, 10:09 PM · SRE, ops-eqiad
wiki_willy reassigned T289122: decommission pc1010.eqiad.wmnet from wiki_willy to Cmjohnson.
Tue, Sep 7, 3:57 PM · Patch-For-Review, SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy reassigned T289120: decommission pc1009.eqiad.wmnet from wiki_willy to Cmjohnson.
Tue, Sep 7, 3:57 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy reassigned T289119: decommission pc1008.eqiad.wmnet from wiki_willy to Cmjohnson.
Tue, Sep 7, 3:56 PM · Patch-For-Review, SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy reassigned T289118: decommission pc1007.eqiad.wmnet. from wiki_willy to Cmjohnson.
Tue, Sep 7, 3:56 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware

Fri, Sep 3

wiki_willy added a comment to T228919: Q1 '19:(Need by: 2020-06-30) replace scs-a8-eqiad.

After talking to John, the ETA to start on this is in a couple weeks - mid September

Fri, Sep 3, 11:09 PM · SRE, ops-eqiad
wiki_willy renamed T289624: Q1: (Need By: TBD) rack/setup/install centrallog2002.codfw.wmnet from (Need By: TBD) rack/setup/install centrallog2002.codfw.wmnet to Q1: (Need By: TBD) rack/setup/install centrallog2002.codfw.wmnet.
Fri, Sep 3, 11:03 PM · SRE, observability, SRE Observability (FY2021/2022-Q1), ops-codfw, DC-Ops
wiki_willy renamed T289733: Q1: (Need By: TBD) rack/setup/install puppetmaster200[45].codfw.wmnet from (Need By: TBD) rack/setup/install puppetmaster200[45].codfw.wmnet to Q1: (Need By: TBD) rack/setup/install puppetmaster200[45].codfw.wmnet.
Fri, Sep 3, 11:02 PM · SRE, Infrastructure-Foundations, ops-codfw, DC-Ops
wiki_willy closed T243450: Audit & update spares part tracking for all sites as Resolved.

Resolving this task. After talking to Chris, we'll update the eqiad inventory after the next recycling pickup in a couple months. Thanks, Willy

Fri, Sep 3, 10:45 PM · ops-eqiad, ops-codfw, DC-Ops, SRE
wiki_willy added a subtask for T283483: Various netbox alerts running for days: T290364: Netbox Errors in eqiad.
Fri, Sep 3, 10:39 PM · DC-Ops
wiki_willy added a parent task for T290364: Netbox Errors in eqiad: T283483: Various netbox alerts running for days.
Fri, Sep 3, 10:39 PM · SRE, ops-eqiad, DC-Ops
wiki_willy created T290364: Netbox Errors in eqiad.
Fri, Sep 3, 10:39 PM · SRE, ops-eqiad, DC-Ops
wiki_willy added a parent task for T290362: Netbox Errors in codfw: T283483: Various netbox alerts running for days.
Fri, Sep 3, 10:30 PM · SRE, ops-codfw, DC-Ops
wiki_willy added a subtask for T283483: Various netbox alerts running for days: T290362: Netbox Errors in codfw.
Fri, Sep 3, 10:30 PM · DC-Ops
wiki_willy created T290362: Netbox Errors in codfw.
Fri, Sep 3, 10:29 PM · SRE, ops-codfw, DC-Ops

Thu, Sep 2

wiki_willy updated subscribers of T289657: Decommission mc[1019-1023,1025-1026,1028-1036].eqiad.wmnet.

Awesome, thanks so much @jijiki. (fyi for @Cmjohnson and @Jclark-ctr)

Thu, Sep 2, 4:32 PM · SRE, ops-eqiad, decommission-hardware

Wed, Sep 1

wiki_willy added a comment to T289657: Decommission mc[1019-1023,1025-1026,1028-1036].eqiad.wmnet.

Awesome, thanks @jijiki!

Wed, Sep 1, 6:56 PM · SRE, ops-eqiad, decommission-hardware

Mon, Aug 30

wiki_willy assigned T185337: rack spare switches in c1-eqiad to Cmjohnson.
Mon, Aug 30, 11:18 PM · Infrastructure-Foundations, SRE, netops, ops-eqiad
wiki_willy assigned T289755: cloudcephosd1014.mgmt reported down by icinga to Cmjohnson.
Mon, Aug 30, 11:16 PM · SRE, cloud-services-team (Kanban), ops-eqiad

Fri, Aug 27

wiki_willy renamed T289882: Q1:(Need By: TBD) rack/setup/install cloudswift100[12] from (Need By: TBD) rack/setup/install cloudswift100[12] to Q1:(Need By: TBD) rack/setup/install cloudswift100[12].
Fri, Aug 27, 7:05 PM · SRE, Infrastructure-Foundations, ops-eqiad, netops, cloud-services-team (Hardware), DC-Ops

Thu, Aug 26

wiki_willy renamed T228919: Q1 '19:(Need by: 2020-06-30) replace scs-a8-eqiad from Q1:(Need by: 2020-06-30) replace scs-a8-eqiad to Q1 '19:(Need by: 2020-06-30) replace scs-a8-eqiad.
Thu, Aug 26, 10:30 PM · SRE, ops-eqiad
wiki_willy renamed T284471: Q4:(Need By: TBD) rack/setup/install cloudcephosd102[1-4].eqiad.wmnet from Q1:(Need By: TBD) rack/setup/install cloudcephosd102[1-4].eqiad.wmnet to Q4:(Need By: TBD) rack/setup/install cloudcephosd102[1-4].eqiad.wmnet.
Thu, Aug 26, 10:29 PM · SRE, ops-eqiad, DC-Ops
wiki_willy renamed T281989: Q4:(Need By: TBD) rack/setup/install elastic10[68-83].eqiad.wmnet from Q1:(Need By: TBD) rack/setup/install elastic10[68-83].eqiad.wmnet to Q4:(Need By: TBD) rack/setup/install elastic10[68-83].eqiad.wmnet.
Thu, Aug 26, 10:28 PM · Discovery-Search (Current work), SRE, Elasticsearch, ops-eqiad, DC-Ops

Wed, Aug 25

wiki_willy added a comment to T275696: reclaim cescout1001.eqiad.wmnet.

Much appreciated @ssingh, thanks!

Wed, Aug 25, 6:33 PM · DC-Ops, ops-eqiad, SRE, Traffic, decommission-hardware
wiki_willy added a comment to T285808: Q1:(Need By: ASAP) rack/setup/install ms-be10[64-67].

Just a quick summary of what Chris and I went over:

Wed, Aug 25, 6:28 PM · SRE, ops-eqiad, DC-Ops
wiki_willy updated subscribers of T289657: Decommission mc[1019-1023,1025-1026,1028-1036].eqiad.wmnet.

Hi @jijiki - hope all is well. We were wondering if it would be possible to prioritize the decom of mc1033 and 1034? It would help us with installing T285808 for @fgiunchedi's ms-be hosts. Thanks, Willy

Wed, Aug 25, 6:25 PM · SRE, ops-eqiad, decommission-hardware
wiki_willy assigned T272074: decommission mc1024 to Cmjohnson.

Hi @Dzahn - just a quick reminder to add the "ops-eqiad" project tag when the servers are ready for dc-ops to unrack. Much appreciated. Thanks, Willy

Wed, Aug 25, 6:21 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy assigned T282025: decommission eventlog1002.eqiad.wmnet to Cmjohnson.
Wed, Aug 25, 6:17 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy reassigned T275696: reclaim cescout1001.eqiad.wmnet from Jclark-ctr to Cmjohnson.

Hi @ssingh - just a heads up to add the "ops-eqiad" project tag, when it's ready for the dc-ops steps. Thanks, Willy

Wed, Aug 25, 5:53 PM · DC-Ops, ops-eqiad, SRE, Traffic, decommission-hardware
wiki_willy moved T279601: decommission icinga1001.wikimedia.org from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
Wed, Aug 25, 5:52 PM · SRE, DC-Ops, ops-eqiad, SRE Observability (FY2021/2022-Q1), decommission-hardware
wiki_willy moved T282078: decommission snapshot100[5,6,7].eqiad.wmnet from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
Wed, Aug 25, 5:51 PM · SRE, DC-Ops, ops-eqiad, Dumps-Generation, decommission-hardware
wiki_willy moved T282575: decommission mwlog1001 from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
Wed, Aug 25, 5:51 PM · DC-Ops, ops-eqiad, Patch-For-Review, SRE, observability, decommission-hardware
wiki_willy moved T283507: decommission logstash102[012] from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
Wed, Aug 25, 5:50 PM · SRE, DC-Ops, ops-eqiad, observability, decommission-hardware
wiki_willy added projects to T288744: decommission druid1002.eqiad.wmnet: ops-eqiad, DC-Ops.

Hi @BTullis - just following up to see if we can proceed with the dc-ops steps, since "remove all remaining puppet references and all host entries in the puppet repo" looks like it hasn't been done yet. Much appreciated. Thanks, Willy

Wed, Aug 25, 5:49 PM · SRE, DC-Ops, ops-eqiad, decommission-hardware
wiki_willy moved T289339: decommission druid1001.eqiad.wmnet from Backlog to pending onsite steps (eqiad) on the decommission-hardware board.
Wed, Aug 25, 5:32 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy reassigned T288579: decommission bast4002.wikimedia.org from Jclark-ctr to RobH.

Hi @ssingh - just a heads up to add "ops-ulsfo" as a project tag, when this is ready for dc-ops to unrack. Thanks, Willy

Wed, Aug 25, 5:31 PM · DC-Ops, ops-ulsfo, SRE, Traffic, decommission-hardware
wiki_willy added projects to T289339: decommission druid1001.eqiad.wmnet: DC-Ops, ops-eqiad.

Hi @BTullis - just a heads up to add "ops-eqiad" as a project tag, when this is ready for dc-ops to unrack. Thanks, Willy

Wed, Aug 25, 5:29 PM · SRE, ops-eqiad, DC-Ops, decommission-hardware
wiki_willy added projects to T282575: decommission mwlog1001: ops-eqiad, DC-Ops.

Hi @herron - just a heads up to add "ops-eqiad" as a project tag, when this is ready for dc-ops to unrack. Thanks, Willy

Wed, Aug 25, 5:29 PM · DC-Ops, ops-eqiad, Patch-For-Review, SRE, observability, decommission-hardware
wiki_willy added projects to T279601: decommission icinga1001.wikimedia.org: ops-eqiad, DC-Ops.

Hi @colewhite - just a heads up to add "ops-eqiad" as a project tag, when this is ready for dc-ops to unrack. Much appreciated! Thanks, Willy

Wed, Aug 25, 5:24 PM · SRE, DC-Ops, ops-eqiad, SRE Observability (FY2021/2022-Q1), decommission-hardware
wiki_willy added projects to T282078: decommission snapshot100[5,6,7].eqiad.wmnet: ops-eqiad, DC-Ops.

Hi @ArielGlenn - just a heads up to add "ops-eqiad" as a project tag, when this is ready for dc-ops to unrack. Much appreciated! Thanks, Willy

Wed, Aug 25, 5:23 PM · SRE, DC-Ops, ops-eqiad, Dumps-Generation, decommission-hardware
wiki_willy added projects to T283507: decommission logstash102[012]: ops-eqiad, DC-Ops.

Hi @herron - just a heads up to add "ops-eqiad" as a project tag, when this is ready for dc-ops to unrack. Much appreciated! Thanks, Willy

Wed, Aug 25, 5:21 PM · SRE, DC-Ops, ops-eqiad, observability, decommission-hardware

Aug 17 2021

wiki_willy added a comment to T288586: codfw: Netbox Error.

Got it, thanks for the info @Papaul !

Aug 17 2021, 5:42 PM · SRE, ops-codfw, DC-Ops

Aug 10 2021

wiki_willy updated the task description for T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies.
Aug 10 2021, 11:24 PM · SRE, ops-eqiad, DC-Ops
wiki_willy created T288586: codfw: Netbox Error.
Aug 10 2021, 11:23 PM · SRE, ops-codfw, DC-Ops

Aug 6 2021

wiki_willy added a comment to T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies.

Hi @Jclark-ctr & @Papaul - just a heads up, if it's a linecard or something else that doesn't get an asset tag, you can just set the "AssetID" to "NA" and the "Asset Tag#" to "WMFNA" on the Accounting Spreadsheet, and the Netbox error will go away. I did this for lines 1482, 1482, 1493, and 1494...so now those MPCs are no longer alerting in Netbox. Thanks, Willy

Aug 6 2021, 5:13 PM · SRE, ops-eqiad, DC-Ops

Aug 3 2021

wiki_willy reassigned T286497: hw troubleshooting: Disk failure for elastic1039.eqiad.wmnet from wiki_willy to RKemper.

Hi @RKemper - since elastic1039 is currently at the 5yr mark, and we're currently installing the refreshes via T281989, are you ok if we just decom this host instead?

Aug 3 2021, 5:46 PM · SRE, Discovery-Search (Current work), ops-eqiad, DC-Ops

Aug 2 2021

wiki_willy added a subtask for T280203: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301): T276743: ps1-a7-eqiad power over threshold alerts.
Aug 2 2021, 11:10 PM · ops-eqiad, decommission-hardware, Patch-For-Review, SRE, serviceops
wiki_willy added a parent task for T276743: ps1-a7-eqiad power over threshold alerts: T280203: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301).
Aug 2 2021, 11:10 PM · SRE, ops-eqiad, DC-Ops
wiki_willy added a subtask for T277340: (Need By: TBD) rack/setup/install (2) new 10G switches: T280977: Rack/power audit in eqiad c8/d5.
Aug 2 2021, 11:08 PM · Patch-For-Review, SRE, ops-eqiad, DC-Ops
wiki_willy added a parent task for T280977: Rack/power audit in eqiad c8/d5: T277340: (Need By: TBD) rack/setup/install (2) new 10G switches.
Aug 2 2021, 11:08 PM · SRE, ops-eqiad

Jul 28 2021

wiki_willy added a comment to T280203: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301).

@wiki_willy You guys can remove all old mw appservers from eqiad rack A5 and rack A8 already, they are decom'ed:

rack A5 mw1261 through mw1266

rack A8 mw1267, mw1268, mw1280 through mw1283

26 servers are in status "decommissioning" now in the "mw12" range:

https://netbox.wikimedia.org/dcim/devices/?q=mw12&status=decommissioning&mac_address=&has_primary_ip=&local_context_data=&virtual_chassis_member=&console_ports=&console_server_ports=&power_ports=&power_outlets=&interfaces=&pass_through_ports=&cf_purchase_date=

Jul 28 2021, 5:45 PM · ops-eqiad, decommission-hardware, Patch-For-Review, SRE, serviceops

Jul 26 2021

wiki_willy updated subscribers of T284614: Netbox: define strategy to track standard server configurations.

Thanks @Volans, I think we're good. If we're all set now, @Papaul, @Jclark-ctr, @Cmjohnson, and @RobH can proceed with using the new naming device type conventions. Thanks, Willy

Jul 26 2021, 10:43 PM · Infrastructure-Foundations, netbox

Jul 23 2021

wiki_willy reassigned T286468: Relabel dbstore1004 to db1183 from wiki_willy to Jclark-ctr.
Jul 23 2021, 4:43 PM · SRE, ops-eqiad, DC-Ops

Jul 22 2021

wiki_willy added a comment to T287180: msw-c7-eqiad down.

Reached out to John, who's heading over to the cage right now, to check it out. Thanks, Willy

Jul 22 2021, 4:10 PM · SRE, ops-eqiad
wiki_willy assigned T287137: Degraded RAID on db1175 to Jclark-ctr.
Jul 22 2021, 3:04 PM · DBA, ops-eqiad

Jul 21 2021

wiki_willy assigned T286763: Broken RAM on db1127 to Jclark-ctr.
Jul 21 2021, 11:01 PM · DBA, ops-eqiad, SRE
wiki_willy reassigned T285715: Degraded RAID on db1129 from Cmjohnson to Jclark-ctr.
Jul 21 2021, 4:41 PM · DBA, SRE, ops-eqiad

Jul 20 2021

wiki_willy reassigned T286888: db1170 mysql process crashed from Cmjohnson to Jclark-ctr.

Hi @Kormat - Chris is out this week, so moving over to @Jclark-ctr for him to check out this machine. (under warranty thru Nov 2023) Thanks, Willy

Jul 20 2021, 6:54 PM · SRE, DC-Ops, ops-eqiad, DBA

Jul 19 2021

wiki_willy assigned T286942: decommission payments1001.frack.eqiad.wmnet to Cmjohnson.
Jul 19 2021, 11:13 PM · SRE, ops-eqiad, decommission-hardware
wiki_willy assigned T286943: decommission payments1002.frack.eqiad.wmnet to Cmjohnson.
Jul 19 2021, 11:13 PM · SRE, ops-eqiad, decommission-hardware
wiki_willy assigned T286944: decommission payments1003.frack.eqiad.wmnet to Cmjohnson.
Jul 19 2021, 11:13 PM · SRE, ops-eqiad, decommission-hardware
wiki_willy assigned T286945: decommission payments1004.frack.eqiad.wmnet to Cmjohnson.
Jul 19 2021, 11:12 PM · SRE, ops-eqiad, decommission-hardware
wiki_willy reassigned T282484: (Need By: TBD) rack/setup/install pc1011-pc1014 from Cmjohnson to Jclark-ctr.

Moving over to @Jclark-ctr to check the network on pc1014. Thanks, Willy

Jul 19 2021, 11:11 PM · Data-Persistence, SRE, ops-eqiad, DC-Ops

Jul 16 2021

wiki_willy reassigned T276922: cloudvirt1038: PCIe error from Cmjohnson to Jclark-ctr.

Hi @Jclark-ctr - it looks like Chris is going to be out for a while. Dell has one last suggestion in figuring out a solution for this ticket, while the replacement server is being ordered. With the server lead time delays, it will be months before arriving. So, when you're back next Tue/Wed, can you work with the Dell Support team (I'm asking them to reach out to you) on implementing their proposed solution? Thanks, Willy

Jul 16 2021, 6:24 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)
wiki_willy reassigned T277340: (Need By: TBD) rack/setup/install (2) new 10G switches from Cmjohnson to Jclark-ctr.

It looks like Chris is going to be out for a while. @Jclark-ctr - can you prioritize this one, when you're back next week? Rob has ordered the DAC cables, so they should be arriving soon. Thanks, Willy

Jul 16 2021, 6:10 PM · Patch-For-Review, SRE, ops-eqiad, DC-Ops
wiki_willy reassigned T286226: Upgrade db1104 firmware from Cmjohnson to Jclark-ctr.

It looks like Chris is going to be out for a while. Moving this task over to @Jclark-ctr, who should be back Tuesday or Wednesday. Thanks, Willy

Jul 16 2021, 6:09 PM · SRE, ops-eqiad, DBA

Jul 14 2021

wiki_willy updated the task description for T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies.
Jul 14 2021, 9:47 PM · SRE, ops-eqiad, DC-Ops
wiki_willy updated the task description for T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies.
Jul 14 2021, 9:47 PM · SRE, ops-eqiad, DC-Ops
wiki_willy added a comment to T276922: cloudvirt1038: PCIe error.

Ok, thanks @nskaggs. They're currently processing a server replacement. Simultaneously, especially with the long lead times for new servers, there's one more suggestion that the Dell Support team has in troubleshooting for a perm fix. @Cmjohnson - they'll reach out to you early next week when you're back onsite, to try it out. Thanks, Willy

Jul 14 2021, 8:31 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)
wiki_willy reassigned T286226: Upgrade db1104 firmware from LSobanski to Cmjohnson.

Moving over to @Cmjohnson, who will be back before John next week. Thanks, Willy

Jul 14 2021, 3:13 PM · SRE, ops-eqiad, DBA

Jul 13 2021

wiki_willy assigned T286274: mgmt on logstash2021 inaccessible to Papaul.
Jul 13 2021, 11:04 PM · ops-codfw, SRE

Jul 1 2021

wiki_willy added a comment to T276922: cloudvirt1038: PCIe error.

Just got off the phone with Dell. It's escalated on their side, and they're going to sync up tomorrow in figuring out a solution for this, which could very well end up being a new replacement server. Since this server seems to work fine when the PCI card slot 1 is disabled though (and the PCI card doesn't seem to be used for anything), would WMCS be ok if we just left the bios settings that way? Thanks, Willy

Jul 1 2021, 9:51 PM · SRE, ops-eqiad, DC-Ops, cloud-services-team (Hardware)
wiki_willy added a comment to T284614: Netbox: define strategy to track standard server configurations.

Hey Riccardo - for any in-year modifications to the hardware specs, I was thinking of calling it rev1, rev2, etc...so something like "ConfigA-rev1." Initially, I was thinking each rev would represent the same exact config, since spec changes within a fiscal year shouldn't happen very often. But if you think logically similar would be better, we could go with that as well. My preference would be to have all the info (config, fiscal year, server model) in one single field. I think the main benefit here (other than for reporting type stuff), would be having the ability to search in Netbox quickly in finding an interchangeable server. For example, if we need to quickly find a replacement part for an out of warranty server, we could do a Netbox search based on this single field, to get a list of which decom'd servers have the same config. Or if we ever run into a case where we need to add servers in some type of emergency, we can pull this data from Netbox to easily figure out which ones would qualify. Hope this helps, but let me know and I can hop on during your next office hours as well.

Jul 1 2021, 12:05 AM · Infrastructure-Foundations, netbox

Jun 30 2021

wiki_willy added a comment to T280203: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301).

Thanks @Legoktm, much appreciated!

Jun 30 2021, 2:30 PM · ops-eqiad, decommission-hardware, Patch-For-Review, SRE, serviceops

Jun 29 2021

wiki_willy assigned T285799: Degraded RAID on cloudcephosd1018 to Cmjohnson.
Jun 29 2021, 10:41 PM · cloud-services-team (Kanban), SRE, ops-eqiad
wiki_willy assigned T285664: Disk failed on thanos-be1003 to Cmjohnson.
Jun 29 2021, 10:41 PM · User-fgiunchedi, SRE, ops-eqiad
wiki_willy assigned T285643: Degraded RAID on elastic1039 to Cmjohnson.
Jun 29 2021, 10:40 PM · Discovery-Search (Current work), Discovery, SRE, ops-eqiad

Jun 28 2021

wiki_willy renamed T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies from Netbox Duplicate Cable IDs to Netbox Duplicate Cable IDs & Accounting Discrepancies.
Jun 28 2021, 11:02 PM · SRE, ops-eqiad, DC-Ops
wiki_willy closed T285718: Netbox Accounting Errors as Resolved.

Actually, I'm going to keep it in the same place on the spreadsheet, then just mark WMFNA, which should fix it. Thanks, Willy

Jun 28 2021, 10:57 PM · SRE, ops-codfw, DC-Ops
wiki_willy closed T285718: Netbox Accounting Errors, a subtask of T283483: Various netbox alerts running for days, as Resolved.
Jun 28 2021, 10:56 PM · DC-Ops
wiki_willy reassigned T285718: Netbox Accounting Errors from wiki_willy to Papaul.

Thanks Papaul. You can just move the line cards on the accounting spreadsheet to the top section called "Variance Between Netbox and Asset Tag List (Items Tracked but Did Not Meet Capitalization Threshold)" That will remove the Netbox alerts, and then you can just resolve the task. Much appreciated.

Jun 28 2021, 10:46 PM · SRE, ops-codfw, DC-Ops
wiki_willy added a subtask for T283483: Various netbox alerts running for days: T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies.
Jun 28 2021, 8:47 PM · DC-Ops
wiki_willy added a parent task for T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies: T283483: Various netbox alerts running for days.
Jun 28 2021, 8:47 PM · SRE, ops-eqiad, DC-Ops
wiki_willy created T285719: Netbox Duplicate Cable IDs & Accounting Discrepancies.
Jun 28 2021, 8:47 PM · SRE, ops-eqiad, DC-Ops
wiki_willy added a subtask for T283483: Various netbox alerts running for days: T285718: Netbox Accounting Errors.
Jun 28 2021, 8:44 PM · DC-Ops
wiki_willy added a parent task for T285718: Netbox Accounting Errors: T283483: Various netbox alerts running for days.
Jun 28 2021, 8:44 PM · SRE, ops-codfw, DC-Ops
wiki_willy created T285718: Netbox Accounting Errors.
Jun 28 2021, 8:44 PM · SRE, ops-codfw, DC-Ops
wiki_willy added a comment to T285715: Degraded RAID on db1129.

Hi @Cmjohnson - just a heads up, there's only a couple more months before the warranty expires on this host. Thanks, Willy

Jun 28 2021, 8:37 PM · DBA, SRE, ops-eqiad
wiki_willy assigned T285715: Degraded RAID on db1129 to Cmjohnson.
Jun 28 2021, 8:35 PM · DBA, SRE, ops-eqiad

Jun 25 2021

wiki_willy updated subscribers of T280203: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301).
Jun 25 2021, 7:02 PM · ops-eqiad, decommission-hardware, Patch-For-Review, SRE, serviceops
wiki_willy added a project to T280203: decom 44 eqiad appservers purchased on 2016-04-12/13 (mw1261 through mw1301): decommission-hardware.
Jun 25 2021, 7:01 PM · ops-eqiad, decommission-hardware, Patch-For-Review, SRE, serviceops

Jun 24 2021

wiki_willy added a comment to T284614: Netbox: define strategy to track standard server configurations.

Hey @Papaul - it would be pretty rare for us to change the specs on each config mid-fiscal year. I'm only planning on revising it at the beginning of each fiscal. If something unexpected did happen between Sept and Dec 2021 though, and let's say Dell changed their specs mid-year...I was thinking we could add something like "ConfigA-rev2 FY21-22 (PowerEdge 440)" or something similar to whichever format we decide to go with.

Jun 24 2021, 9:11 PM · Infrastructure-Foundations, netbox
wiki_willy added a comment to T284614: Netbox: define strategy to track standard server configurations.

Thanks for working on this guys. In my opinion, I think I like Papaul's format a little bit better. But if I were to take the best things I like from both Riccardo and Papaul's format (and Cathal's comment), it would probably look something like this:

Jun 24 2021, 7:56 PM · Infrastructure-Foundations, netbox