Feed Advanced Search

Advanced Search
Use Results
Edit Query
Hide Query

	Include stories about projects I am a member of.

Apr 11 2019

faidon added a comment to T220422: Netbox Reports: General Cleanup and Improvement.

OK, so, after the efforts in the past few days, we're in a much better shape! The PuppetDB report seems to be (almost?) entirely indicative of real issues and is actionable now - I will involve DC Ops to start fixing the cases that are known to be real errors, and we'll see if there are any false positives (I know of at least one, that is tough to handle!).

Apr 11 2019, 9:27 PM · netbox, Patch-For-Review, User-crusnov, DC-Ops, SRE-tools

faidon updated the task description for T220422: Netbox Reports: General Cleanup and Improvement.

Apr 11 2019, 9:07 PM · netbox, Patch-For-Review, User-crusnov, DC-Ops, SRE-tools

Apr 9 2019

faidon updated subscribers of T214903: labsdb1002-array1: status clarification.

@RobH and @Cmjohnson, is this a forgotten decom?

Apr 9 2019, 1:38 PM · SRE, ops-eqiad, decommission-hardware, cloud-services-team (Kanban)

faidon added a project to T214903: labsdb1002-array1: status clarification: decommission-hardware.

Apr 9 2019, 1:38 PM · SRE, ops-eqiad, decommission-hardware, cloud-services-team (Kanban)

faidon added a comment to T214181: codfw: rename/relabel labtestneutron2001 to cloudnet2001-dev.

Given T218025, can we resolve this?

Apr 9 2019, 1:10 PM · SRE, DC-Ops, ops-codfw

faidon added a comment to T202966: Make cp1099 the new pinkunicorn.

According to Netbox, cp1099 is 2 years newer than cp1008, but is still a 6-year old server (purchased Mar 28, 2013). Can we just get rid of it? I'm concerned we're just spending cycles on a box that may die any day now and that we won't be able to repair...

Apr 9 2019, 12:54 PM · SRE, Traffic

Apr 8 2019

faidon added a comment to T209707: tagged_interface sometimes exceeds IFNAMSIZ.

I think this is addressed by systemd's 9009d3b5c3b6d191be69215736be77583e0f23f9, included in v239 (stretch has v232, buster has v241).

Apr 8 2019, 11:08 PM · Traffic, SRE

Mar 21 2019

Mill <mill@mail.com> committed rOSKEYHOLDER9fb7d69208e6: pyaaaaaaaaaaaa (authored by faidon).

pyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDERecc54f53f151: )yaaaaaaaaaaaa (authored by faidon).

)yaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER4688af2fc102: uyaaaaaaaaaaaa (authored by faidon).

uyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER97de3d4dad7c: yyaaaaaaaaaaaa (authored by faidon).

yyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDERa588fd6bfc05: vyaaaaaaaaaaaa (authored by faidon).

vyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER50927819b02d: tyaaaaaaaaaaaa (authored by faidon).

tyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER48048fa41119: xyaaaaaaaaaaaa (authored by faidon).

xyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER21db23b59d4d: ryaaaaaaaaaaaa (authored by faidon).

ryaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER6a68d2ba2a5e: wyaaaaaaaaaaaa (authored by faidon).

wyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER90fb5301b369: 0yaaaaaaaaaaaa (authored by faidon).

0yaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDERaa816fdf9682: qyaaaaaaaaaaaa (authored by faidon).

qyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDER668563582e69: zyaaaaaaaaaaaa (authored by faidon).

zyaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mill <mill@mail.com> committed rOSKEYHOLDEReb7bd673b43c: syaaaaaaaaaaaa (authored by faidon).

syaaaaaaaaaaaa

Mar 21 2019, 12:41 AM

Mar 7 2019

faidon added a comment to T214183: Setup graphs for power usage readings in Grafana.

For the per-site usage, LibreNMS besides being clunky, is non-public and not accessible to all.

Mar 7 2019, 2:53 PM · DC-Ops, observability

faidon triaged T214183: Setup graphs for power usage readings in Grafana as High priority.

Mar 7 2019, 1:20 PM · DC-Ops, observability

faidon added a comment to T217686: Document service owner in Netbox.

This seems like a duplicate (and subset of) T216088. I've added the custom field proposal as one of the many options listed in its task description and closing this as duplicate to keep the discussion in one place :)

Mar 7 2019, 12:22 PM · SRE

faidon updated the task description for T216088: Mapping of servers to stakeholders.

Mar 7 2019, 12:21 PM · Infrastructure-Foundations, Patch-For-Review, User-jbond

faidon merged T217686: Document service owner in Netbox into T216088: Mapping of servers to stakeholders.

Mar 7 2019, 12:19 PM · Infrastructure-Foundations, Patch-For-Review, User-jbond

faidon merged task T217686: Document service owner in Netbox into T216088: Mapping of servers to stakeholders.

Mar 7 2019, 12:19 PM · SRE

Mar 5 2019

Dvorapa awarded T191764: CI: run tests with multiple Python3 versions a Love token.

Mar 5 2019, 9:22 AM · Patch-For-Review, User-ArielGlenn, Continuous-Integration-Infrastructure

Mar 4 2019

faidon raised the priority of T212010: Degraded RAID on sodium from Medium to High.

I just merged a duplicate in. @Cmjohnson what's the status of this?

Mar 4 2019, 1:03 PM · ops-eqiad, SRE

faidon merged T217356: Degraded RAID on sodium into T212010: Degraded RAID on sodium.

Mar 4 2019, 1:02 PM · ops-eqiad, SRE

faidon merged task T217356: Degraded RAID on sodium into T212010: Degraded RAID on sodium.

Mar 4 2019, 1:02 PM · ops-eqiad, SRE

faidon reopened T122144: Move most (all?) exim personal aliases to WMF ITS as "Open".

In T122144#4152079, @Dzahn wrote:

or they are individual aliases (out of scope of this ticket)

Mar 4 2019, 12:00 PM · Infrastructure-Foundations, Epic, Mail, SRE

Feb 20 2019

Krinkle awarded T122144: Move most (all?) exim personal aliases to WMF ITS a Orange Medal token.

Feb 20 2019, 12:16 AM · Infrastructure-Foundations, Epic, Mail, SRE

Feb 15 2019

faidon reassigned T215837: eqiad: requesting dual cpu misc host for icinga1001 replacement from faidon to RobH.

If that's still needed, that's approved, and it takes priority over phab1002. And let's replenish our spare pool indeed!

Feb 15 2019, 4:11 PM · SRE, hardware-requests

faidon reassigned T215335: requesting WMF7426 as phabricator system in eqiad from faidon to RobH.

@Dzahn that's all fine, but we should have that documented in a separate Phabricator task tracking this work, if one doesn't exist already :) Separately, I'd also really love having a permanent non-SPOF setup in each data center as well, whether that's multiple bare metal servers, multiple VMs or running Phabricator on k8s. This is too important of a service to run in one misc-type server per site.

Feb 15 2019, 4:10 PM · serviceops, SRE, hardware-requests

Feb 14 2019

faidon added a comment to T216133: Increase visibility of auto-generated tasks for RAID errors.

We discussed this a little bit yesterday, and T216088 was filed to further discuss this. Help there is welcome :)

Feb 14 2019, 3:27 PM · Sustainability (Incident Followup), cloud-services-team (Kanban), SRE, DC-Ops

faidon added a comment to T205897: Netbox: fill network topology.

The medium-term plan is for this data to be entered into Netbox after a server is racked but before it's provisioned or even powered up, and that data to be used by our tooling to configure and execute the provisioning itself (DHCP configuration, switchport, OS install etc.).

Feb 14 2019, 11:21 AM · Infrastructure-Foundations, netbox, SRE

Feb 12 2019

faidon added a comment to T196507: Degraded RAID on cloudvirt1019.

Before these are delivered for implementation, let's make sure that the two systems have identical settings, especially given we've tested various things on them over the past few months. I reverted my SSD Smart Path setting on 1019, but there are still differences; the most important one that I noticed is that in cloudvirt1019 the P440ar is hidden (disabled in BIOS?) but in cloudvirt1020 it's visible. Maybe a factory reset and then manually reapplying the same settings in each?

Feb 12 2019, 10:47 PM · Patch-For-Review, cloud-services-team (Kanban), ops-eqiad, SRE

Feb 7 2019

CDanis awarded T126989: MediaWiki logging & encryption a Love token.

Feb 7 2019, 1:41 PM · Observability-Logging, observability, MW-1.33-notes (1.33.0-wmf.24; 2019-04-02), Patch-For-Review, Wikimedia-Logstash, MediaWiki-Debug-Logger, SRE

CDanis awarded T126989: MediaWiki logging & encryption a Love token.

Feb 7 2019, 1:41 PM · Observability-Logging, observability, MW-1.33-notes (1.33.0-wmf.24; 2019-04-02), Patch-For-Review, Wikimedia-Logstash, MediaWiki-Debug-Logger, SRE

Feb 5 2019

faidon added a comment to T215335: requesting WMF7426 as phabricator system in eqiad.

Is there a task describing the plans for a secondary Phabricator system? How did we come up with those specs?

Feb 5 2019, 10:50 PM · serviceops, SRE, hardware-requests

Feb 2 2019

faidon changed the status of T214130: Requesting access to production for dsharpe from Stalled to Open.

Feb 2 2019, 9:21 PM · SRE, SRE-Access-Requests

faidon changed the status of T214130: Requesting access to production for dsharpe, a subtask of T213742: Onboarding David Sharpe to Security Team as Information Security Analyst, from Stalled to Open.

Feb 2 2019, 9:21 PM · Security-Team

faidon added a comment to T214130: Requesting access to production for dsharpe.

Let's not wait for a meeting, approved!

Feb 2 2019, 9:21 PM · SRE, SRE-Access-Requests

Jan 26 2019

faidon added a comment to T214762: WMF's Grafana installation does not follow Wikimedia's visual identity guidelines.

It's tricky, but I think the one we use one is probably the right one and this should be declined. See T212674 for context.

Jan 26 2019, 9:58 PM · Observability-Metrics, SRE

Jan 25 2019

faidon updated subscribers of T205897: Netbox: fill network topology.

Netbox is now at 2.5 \o/ which allows us to import cable IDs, type, color etc. Let's start with importing eqsin's, with the data that we have in the spreadsheet, so that we can deprecate that? @RobH @ayounsi any takers?

Jan 25 2019, 3:11 AM · Infrastructure-Foundations, netbox, SRE

Jan 22 2019

faidon committed rOSKEYHOLDER0fcbce6cca70: Add tests for OSError when loading config files.

Add tests for OSError when loading config files

Advanced SearchUse ResultsEdit QueryHide Query

Apr 11 2019

Apr 9 2019

Apr 8 2019

Mar 21 2019

Mar 7 2019

Mar 5 2019

Mar 4 2019

Feb 20 2019

Feb 15 2019

Feb 14 2019

Feb 12 2019

Feb 7 2019

Feb 5 2019

Feb 2 2019

Jan 26 2019

Jan 25 2019

Jan 22 2019

Jan 21 2019

Jan 18 2019

Dec 21 2018

Dec 19 2018

Dec 18 2018

Dec 17 2018

Dec 14 2018

Dec 13 2018

Dec 12 2018

Dec 11 2018

Dec 10 2018

Dec 7 2018

Dec 6 2018

Dec 5 2018

Dec 4 2018

Nov 30 2018

Advanced Search
Use Results
Edit Query
Hide Query