
herron (Keith Herron)
Ops Engineer

User Details

User Since
May 30 2017, 5:25 PM (120 w, 4 d)
Availability
Available
IRC Nick
herron
LDAP User
Herron
MediaWiki User
Unknown

Recent Activity

Fri, Sep 20

herron triaged T233373: Sections on some mobile pages are not collabsable as Normal priority.
Fri, Sep 20, 7:13 PM · Operations, Traffic, MobileFrontend
herron triaged T233403: Unassigned shards in eqiad as High priority.
Fri, Sep 20, 7:12 PM · Discovery-Search, Operations, Elasticsearch
herron closed T231984: NDA Request from WMDE employee Raja as Resolved.

Hi @raja_wmde, you have been added to the NDA LDAP group.

Fri, Sep 20, 7:12 PM · Operations, LDAP-Access-Requests
herron closed T232489: Request access to 'deployment' user group for phedenskog as Resolved.
Fri, Sep 20, 1:06 PM · Operations, SRE-Access-Requests, Performance-Team

Thu, Sep 19

herron moved T233202: Requesting access to deployment for andrew-wmde from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Thu, Sep 19, 4:43 PM · SRE-Access-Requests, Operations
herron updated subscribers of T233202: Requesting access to deployment for andrew-wmde.

@greg could you please review/approve this request for deployment permissions?

Thu, Sep 19, 4:43 PM · SRE-Access-Requests, Operations
herron updated the task description for T233202: Requesting access to deployment for andrew-wmde.
Thu, Sep 19, 4:37 PM · SRE-Access-Requests, Operations

Wed, Sep 18

herron added a comment to T231387: Updating DNS records.

DNS records were added Monday at approx. 10:30am Eastern.

Wed, Sep 18, 8:32 PM · Mail, WMF-Communications, Operations
herron moved T233189: Requesting access to Ops Group for papaul@ from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Wed, Sep 18, 7:28 PM · Patch-For-Review, Operations, SRE-Access-Requests
herron triaged T232358: postgres::slave module type for includes parameter in inconsistent. as Normal priority.
Wed, Sep 18, 7:21 PM · Puppet, Operations, netbox
herron triaged T232795: We are not capturing IPs of original requests for proxied requests from operamini and googleweblight. x-forwarded-for is null and client-ip is the same as IP on Webrequest data as Normal priority.
Wed, Sep 18, 7:20 PM · Operations, Traffic, Analytics
herron triaged T231984: NDA Request from WMDE employee Raja as Normal priority.
Wed, Sep 18, 7:19 PM · Operations, LDAP-Access-Requests
herron awarded T232887: The phabricator server, WMF7426, was given to us temporarily, we would like to make it permanent a Like token.
Wed, Sep 18, 7:09 PM · Operations, hardware-requests, Release-Engineering-Team (Development services), serviceops, Phabricator
herron triaged T232887: The phabricator server, WMF7426, was given to us temporarily, we would like to make it permanent as Normal priority.
Wed, Sep 18, 7:09 PM · Operations, hardware-requests, Release-Engineering-Team (Development services), serviceops, Phabricator
herron triaged T232961: Wikispore mailing list as Normal priority.
Wed, Sep 18, 7:08 PM · Wikispore, Wikimedia-Mailing-lists, Operations
herron closed T233025: Upload zuul_2.5.1-wmf10 to apt.wikimedia.org, a subtask of T203846: Zuul cancels all changes when a change is manually merged, as Resolved.
Wed, Sep 18, 7:08 PM · Release-Engineering-Team-TODO (201909), Continuous-Integration-Infrastructure, Gerrit, Zuul
herron closed T233025: Upload zuul_2.5.1-wmf10 to apt.wikimedia.org as Resolved.
Wed, Sep 18, 7:08 PM · Operations, Zuul, Release-Engineering-Team-TODO (201909), Release-Engineering-Team (CI & Testing services), Continuous-Integration-Infrastructure
herron triaged T233202: Requesting access to deployment for andrew-wmde as Normal priority.
Wed, Sep 18, 7:07 PM · SRE-Access-Requests, Operations
herron triaged T233215: ConfirmEdit seemingly erroneously enabled for some users on wikitech as Normal priority.
Wed, Sep 18, 7:07 PM · wikitech.wikimedia.org, Wikimedia-production-error, ConfirmEdit (CAPTCHA extension), Operations
herron triaged T233089: Export zuul metrics to Prometheus as Normal priority.
Wed, Sep 18, 7:06 PM · Patch-For-Review, Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, observability, Operations
herron triaged T233047: Apache mod_status aggregator as Normal priority.
Wed, Sep 18, 7:06 PM · observability, Operations
herron added a comment to T233025: Upload zuul_2.5.1-wmf10 to apt.wikimedia.org.

Hey @hashar, zuul_2.5.1-wmf10 has been uploaded for jessie-wikimedia:

Wed, Sep 18, 6:54 PM · Operations, Zuul, Release-Engineering-Team-TODO (201909), Release-Engineering-Team (CI & Testing services), Continuous-Integration-Infrastructure
herron closed T232353: Remove mmarble from wmf LDAP group, a subtask of T232348: Offboard Michal Anna from Security Team, as Resolved.
Wed, Sep 18, 6:08 PM · Security-Team
herron closed T232353: Remove mmarble from wmf LDAP group as Resolved.

Patch has been merged, and also double checked that offboard-user is a NOOP. I think we're good here, but please re-open if any follow-up is needed.

Wed, Sep 18, 6:08 PM · Security-Team, LDAP-Access-Requests
herron moved T231984: NDA Request from WMDE employee Raja from Backlog to Manager Approval Pending on the LDAP-Access-Requests board.
Wed, Sep 18, 3:08 PM · Operations, LDAP-Access-Requests
herron added a comment to T231984: NDA Request from WMDE employee Raja.

Hi @raja_wmde could you please coordinate obtaining a comment of manager approval on this task? Thanks!

Wed, Sep 18, 3:08 PM · Operations, LDAP-Access-Requests
herron added a comment to T232353: Remove mmarble from wmf LDAP group.

LDAP user mmarble appears to have already been removed from wmf.

Wed, Sep 18, 2:52 PM · Security-Team, LDAP-Access-Requests
herron moved T231616: Request access to Analytics cluster for Urbanecm from Manager/NDA Approval/Confirmation to In Discussion on the SRE-Access-Requests board.
Wed, Sep 18, 2:37 PM · Patch-For-Review, Operations, SRE-Access-Requests
herron added a comment to T231387: Updating DNS records.

Aliases have been added, DNS records are live, and test mail to myself works. Please see https://phabricator.wikimedia.org/T231387#5488828 for currently open follow-up questions.

Wed, Sep 18, 2:23 PM · Mail, WMF-Communications, Operations

Tue, Sep 17

herron closed T232707: Requesting access to analytics cluster for Martin Gerlach as Resolved.

Hi Martin, this access is in place now. If any follow up is needed please don't hesitate to re-open. Thanks!

Tue, Sep 17, 7:44 PM · Analytics, Operations, SRE-Access-Requests
herron updated the task description for T232707: Requesting access to analytics cluster for Martin Gerlach.
Tue, Sep 17, 7:43 PM · Analytics, Operations, SRE-Access-Requests
herron moved T231616: Request access to Analytics cluster for Urbanecm from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Tue, Sep 17, 7:33 PM · Patch-For-Review, Operations, SRE-Access-Requests
herron moved T232489: Request access to 'deployment' user group for phedenskog from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Tue, Sep 17, 7:33 PM · Operations, SRE-Access-Requests, Performance-Team
herron added a comment to T231616: Request access to Analytics cluster for Urbanecm.

Uploaded a patch for this. But need approval from @Nuria before moving forward with it.

Tue, Sep 17, 7:32 PM · Patch-For-Review, Operations, SRE-Access-Requests
herron added a comment to T232489: Request access to 'deployment' user group for phedenskog.

Uploaded a patch for this. Once we have approval documented we should be able to move forward with it.

Tue, Sep 17, 7:20 PM · Operations, SRE-Access-Requests, Performance-Team
herron updated the task description for T232707: Requesting access to analytics cluster for Martin Gerlach.
Tue, Sep 17, 6:10 PM · Analytics, Operations, SRE-Access-Requests
herron updated the task description for T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].
Tue, Sep 17, 5:33 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations

Mon, Sep 16

herron added a comment to T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].

@Ottomata excellent thx for the heads up!

Mon, Sep 16, 1:20 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
herron updated the task description for T230236: De-noise ipsec alerts (Reduce Icinga alert noise goal).
Mon, Sep 16, 1:16 PM · Patch-For-Review, User-herron, Goal, observability

Fri, Sep 13

herron updated the task description for T205855: Investigate approaches to ingest sensitive log producers.
Fri, Sep 13, 6:11 PM · observability, Wikimedia-Logstash, Operations
herron added a comment to T205855: Investigate approaches to ingest sensitive log producers.

https://opendistro.github.io/for-elasticsearch/ appears to be a valid option. Although this task was resolved, I'll update the description to include it.

Fri, Sep 13, 6:07 PM · observability, Wikimedia-Logstash, Operations
herron added a comment to T213902: Implement sensitive logstash access control.

Open Distro for Elasticsearch looks quite promising https://opendistro.github.io/for-elasticsearch/

Fri, Sep 13, 6:06 PM · observability, Patch-For-Review, User-herron, Operations, Wikimedia-Logstash
herron added a comment to T230570: De-noise systemd alerts (Reduce Icinga alert noise goal).

Thanks for this feedback @Joe it's quite helpful! Sorry to bottom quote so much!

Fri, Sep 13, 3:27 PM · Patch-For-Review, Goal, observability

Thu, Sep 12

herron added a comment to T231387: Updating DNS records.

@mark you bet! I've uploaded a few patches to get mail flowing for this subdomain.

Thu, Sep 12, 7:25 PM · Mail, WMF-Communications, Operations
herron claimed T231387: Updating DNS records.
Thu, Sep 12, 2:27 PM · Mail, WMF-Communications, Operations

Thu, Sep 5

herron updated the task description for T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].
Thu, Sep 5, 5:33 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations

Wed, Sep 4

herron added a comment to T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].

The eventgate-main config now includes the new brokers in the broker list.

Wed, Sep 4, 5:28 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
herron committed rDEPLOYCHARTS36911988f48a: eventgate-main: add new brokers to staging broker list (authored by herron).
eventgate-main: add new brokers to staging broker list
Wed, Sep 4, 4:23 PM
herron committed rDEPLOYCHARTSf9263a6a87c3: eventgate-main: add new kafka-main brokers to broker list (authored by herron).
eventgate-main: add new kafka-main brokers to broker list
Wed, Sep 4, 3:46 PM

Aug 19 2019

herron created P8932 (An Untitled Masterwork).
Aug 19 2019, 7:13 PM

Aug 16 2019

herron added a comment to T230611: Puppet error on deployment-logtash03.

Not sure what caused the system to be in this state, but after the following steps logstash is back up and running.

Aug 16 2019, 1:33 PM · Puppet, Beta-Cluster-Infrastructure

Aug 15 2019

herron updated subscribers of T230570: De-noise systemd alerts (Reduce Icinga alert noise goal).

Since check systemd is a secondary monitor (important services are monitored via dedicated service specific checks) I think we can reduce the severity of the generic systemd alerts to warning and have them display in the icinga UI, without alerting on IRC.

Aug 15 2019, 7:20 PM · Patch-For-Review, Goal, observability
herron created T230570: De-noise systemd alerts (Reduce Icinga alert noise goal).
Aug 15 2019, 7:15 PM · Patch-For-Review, Goal, observability

Aug 14 2019

herron added a comment to T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].

Andrew is on holidays, but it looks good to me!

Ok! Will plan to migrate kafka1001 -> kafka-main1001 tomorrow morning Eastern time

Aug 14 2019, 4:21 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
herron added a comment to T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].

Andrew is on holidays, but it looks good to me!

Aug 14 2019, 3:57 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
herron added a comment to T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].

Sounds good. How do the steps and ordering look in https://docs.google.com/document/d/1o7bl1WBzSMymsXGzhWLy1GmMOSo_Evof1PHk2aQXAiE ? And what is the process to deploy the new eventgate-main config?

Aug 14 2019, 3:50 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
herron created T230492: Requesting SRE permissions to create Gerrit projects under operations/debs.
Aug 14 2019, 3:38 PM · Gerrit-Privilege-Requests

Aug 13 2019

herron created T230443: Requesting new gerrit project repository "operations/debs/prometheus-ipsec-exporter".
Aug 13 2019, 7:02 PM · User-MarcoAurelio, Repository-Admins

Aug 9 2019

herron added a comment to T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].
Aug 9 2019, 8:40 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
herron renamed T225005: Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345] from Replace and expand codfw kafka main hosts (kafka200[123]) with kafka-main200[12345] to Replace and expand kafka main hosts (kafka[12]00[123]) with kafka-main[12]00[12345].
Aug 9 2019, 8:25 PM · Patch-For-Review, Services (watching), Core Platform Team Legacy (Watching / External), Analytics, User-herron, Operations
herron added a comment to T230236: De-noise ipsec alerts (Reduce Icinga alert noise goal).

I've drafted a prometheus-ipsec-exporter package based on https://github.com/dennisstritzke/ipsec_exporter on boron.

Aug 9 2019, 8:08 PM · Patch-For-Review, User-herron, Goal, observability
herron created T230236: De-noise ipsec alerts (Reduce Icinga alert noise goal).
Aug 9 2019, 7:59 PM · Patch-For-Review, User-herron, Goal, observability

Jul 30 2019

herron closed T222308: Close the engineering mailing list as Resolved.

List has been closed!

Jul 30 2019, 4:47 PM · Operations, Wikimedia-Mailing-lists

Jul 29 2019

herron updated the task description for T228878: Reduce Icinga alert noise.
Jul 29 2019, 6:26 PM · User-fgiunchedi, Goal, observability
herron created T229262: De-noise puppet failed runs (Reduce Icinga alert noise goal).
Jul 29 2019, 6:25 PM · User-fgiunchedi, Goal, observability
herron added a comment to T229124: add jclark to datacenter-ops group.

Hey @RobH, Cross-validate accounts started sending notifications for:

Jul 29 2019, 3:43 PM · Operations, SRE-Access-Requests

Jul 26 2019

herron updated the task description for T228379: Improve our alerting capabilities (Q1 goal FY19-20).
Jul 26 2019, 4:31 PM · User-herron, User-fgiunchedi, Goal, observability
herron added a project to T228379: Improve our alerting capabilities (Q1 goal FY19-20): User-herron.
Jul 26 2019, 4:31 PM · User-herron, User-fgiunchedi, Goal, observability
herron moved T228879: Produce and circulate an alerting roadmap from Backlog to In progress on the observability board.
Jul 26 2019, 4:30 PM · User-fgiunchedi, Goal, observability
herron moved T228878: Reduce Icinga alert noise from Backlog to In progress on the observability board.
Jul 26 2019, 4:30 PM · User-fgiunchedi, Goal, observability
herron moved T228880: Establish periodic alerts reviews, complete one by EOQ from Backlog to In progress on the observability board.
Jul 26 2019, 4:30 PM · User-fgiunchedi, Goal, observability
herron triaged T228854: Use git commit id as "configuration version" for puppet as Normal priority.
Jul 26 2019, 4:29 PM · Operations, observability, Puppet
herron triaged T229096: Provide the three cert types (chain-only, cert only and chained) as soon as we get the certificate issued as Normal priority.
Jul 26 2019, 4:26 PM · Acme-chief, Traffic, Operations
herron triaged T229101: Phase monitoring for new PDUs as Normal priority.
Jul 26 2019, 4:26 PM · User-fgiunchedi, Patch-For-Review, observability, DC-Ops, Operations
herron triaged T229117: create swift container-to-container synchronization metrics as Normal priority.
Jul 26 2019, 4:25 PM · Release-Engineering-Team-TODO, Operations, Wikimedia-Incident, serviceops
herron triaged T229118: create a docker_registry_codfw swift container backup as Normal priority.
Jul 26 2019, 4:25 PM · Release-Engineering-Team-TODO, Operations, Wikimedia-Incident, serviceops
sguebo_WMF awarded T228927: Add sguebo_WMF to WMF LDAP group a Like token.
Jul 26 2019, 4:07 PM · LDAP-Access-Requests, Trust-and-Safety, Security-Team, Operations
herron assigned T227695: Requesting access to analytics-privatedata-users for mbsantos to Nuria.
Jul 26 2019, 3:16 PM · Operations, SRE-Access-Requests
herron closed T228927: Add sguebo_WMF to WMF LDAP group as Resolved.

@sguebo_WMF (LDAP username sguebo) has been added to the wmf ldap group

Jul 26 2019, 3:15 PM · LDAP-Access-Requests, Trust-and-Safety, Security-Team, Operations

Jul 25 2019

herron closed T227200: Requesting access to analytics-privatedata-users for DLynch as Resolved.

Hi David, the requested access has been provisioned. I'll transition this to resolved now as a soft close. Please don't hesitate to re-open if any follow-up is needed. Thanks!

Jul 25 2019, 6:49 PM · VisualEditor (Current work), Operations, SRE-Access-Requests
herron closed T227496: Access to WikimediaFoundation.org analytics for Deb as Resolved.

Staff members need to be a member of cn=wmf; cn=nda is for people who have access to PII-relevant data but are not staff members.

Jul 25 2019, 6:03 PM · Operations, LDAP-Access-Requests, wikimediafoundation.org, Analytics
herron moved T228447: Requesting access to machines [stat1004, stat1007, stat1006, notebook1003 and notebook1004] and groups for cchen from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Jul 25 2019, 5:54 PM · Operations, SRE-Access-Requests
herron merged task T229028: Requesting production shell access for DLynch into T227200: Requesting access to analytics-privatedata-users for DLynch.
Jul 25 2019, 5:44 PM · SRE-Access-Requests, Operations
herron merged T229028: Requesting production shell access for DLynch into T227200: Requesting access to analytics-privatedata-users for DLynch.
Jul 25 2019, 5:43 PM · VisualEditor (Current work), Operations, SRE-Access-Requests
herron closed T229028: Requesting production shell access for DLynch as Resolved.

Hey David, sorry for the confusion. T227200 should be sufficient for both shell access and group membership. So I'll merge these tasks, and move forward with the patch uploaded referencing T227200.

Jul 25 2019, 5:43 PM · SRE-Access-Requests, Operations
herron moved T227695: Requesting access to analytics-privatedata-users for mbsantos from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Jul 25 2019, 5:32 PM · Operations, SRE-Access-Requests
herron updated subscribers of T228447: Requesting access to machines [stat1004, stat1007, stat1006, notebook1003 and notebook1004] and groups for cchen.

Looping in @Nuria for analytics review/approval

Jul 25 2019, 4:40 PM · Operations, SRE-Access-Requests

Jul 24 2019

herron closed T227496: Access to WikimediaFoundation.org analytics for Deb as Resolved.

Great! Thanks all

Jul 24 2019, 5:30 PM · Operations, LDAP-Access-Requests, wikimediafoundation.org, Analytics
herron added a comment to T228733: Add more SREs to gerritadmin LDAP group.

@Joe @Dzahn adding you both to gerritadmin would satisfy "at least 1 person from US TZs and 1 from EU TZs"

Jul 24 2019, 3:47 PM · Release-Engineering-Team-TODO, Release-Engineering-Team (Development services), Gerrit, LDAP-Access-Requests, Operations

Jul 23 2019

herron created P8790 (An Untitled Masterwork).
Jul 23 2019, 9:21 PM
herron closed T228673: Requesting new Phabricator tag "Observability-Goal" as Resolved.

Thanks @Aklapper! The link on the sidebar of observability solves this quite nicely.

Jul 23 2019, 4:51 PM · Project-Admins
herron closed T227496: Access to WikimediaFoundation.org analytics for Deb as Resolved.

I wasn't able to find an ldap account with shell username Deb_Zierten, but I do see shell username dz1 associated with Deb's wikitech account and wmf email address.

Jul 23 2019, 2:28 PM · Operations, LDAP-Access-Requests, wikimediafoundation.org, Analytics
herron triaged T228275: Use centrallog1001 for network devices syslog as Normal priority.
Jul 23 2019, 2:04 PM · netops, User-fgiunchedi, Operations
herron triaged T228395: puppetdb prometheus metrics per-host metrics as Normal priority.
Jul 23 2019, 2:03 PM · User-fgiunchedi, Operations, Puppet
herron triaged T228617: AS63541's session down reported by cr1-eqsin as Normal priority.
Jul 23 2019, 2:01 PM · netops, Operations
herron triaged T228732: Upgrade db1100 firmware and BIOS as Normal priority.
Jul 23 2019, 1:57 PM · DBA, ops-eqiad, Operations

Jul 22 2019

herron added a comment to T228673: Requesting new Phabricator tag "Observability-Goal".

Is this somehow related to https://phabricator.wikimedia.org/project/profile/84/ ? If yes, in which relation are they (subproject maybe?) and why is the observability workboard (e.g. combining some observability tasks with a Goal tag and then filtering the workboard view on that) not sufficient?

Jul 22 2019, 7:04 PM · Project-Admins
herron added a comment to T227714: [betacluster] Cannot confirm email address - confirmation never received.

I'm not having luck reproducing this with my own non-wikimedia.org personal email account.

Jul 22 2019, 5:19 PM · Growth-Team (Current Sprint), Operations, Release-Engineering-Team (Other / Uncategorized), Mail, Beta-Cluster-Infrastructure, MediaWiki-Email
herron added a comment to T227714: [betacluster] Cannot confirm email address - confirmation never received.

@greg sure, I'm back today from being out of the office last week, and will try to reproduce this and trace the emails to see what's happening.

Jul 22 2019, 4:47 PM · Growth-Team (Current Sprint), Operations, Release-Engineering-Team (Other / Uncategorized), Mail, Beta-Cluster-Infrastructure, MediaWiki-Email
herron created T228673: Requesting new Phabricator tag "Observability-Goal".
Jul 22 2019, 3:03 PM · Project-Admins