colewhite (cwhite)
User

User Details

User Since
Aug 21 2018, 6:05 PM (73 w, 3 d)
Availability
Available
LDAP User
Cwhite
MediaWiki User
Unknown

Recent Activity

Fri, Dec 20

colewhite closed T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick as Resolved.
Fri, Dec 20, 11:53 PM · Operations, SRE-Access-Requests
colewhite closed T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick, a subtask of T240739: Onboarding Checklist for Shay Nowick, as Resolved.
Fri, Dec 20, 11:53 PM · Product-Analytics (Kanban)
colewhite added a comment to T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick.

Thank you!

Fri, Dec 20, 11:53 PM · Operations, SRE-Access-Requests
colewhite updated the task description for T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick.
Fri, Dec 20, 11:47 PM · Operations, SRE-Access-Requests
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Fri, Dec 20, 8:49 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a comment to T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick.

Just to confirm the SSH key, could you paste it into a comment here and sign the comment with MFA?

Fri, Dec 20, 12:15 AM · Operations, SRE-Access-Requests

Thu, Dec 19

colewhite updated the task description for T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick.
Thu, Dec 19, 11:40 PM · Operations, SRE-Access-Requests
colewhite updated the task description for T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick.
Thu, Dec 19, 11:35 PM · Operations, SRE-Access-Requests
colewhite claimed T240917: Requesting access to analytics-privatedata-users, researchers & wmf for Shay Nowick.
Thu, Dec 19, 11:34 PM · Operations, SRE-Access-Requests
colewhite added a subtask for T205870: Fully migrate producers off statsd: T241176: Review and release service-runner 2.8.0.
Thu, Dec 19, 9:06 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a parent task for T241176: Review and release service-runner 2.8.0: T205870: Fully migrate producers off statsd.
Thu, Dec 19, 9:05 PM · Core Platform Team Workboards (Clinic Duty Team), service-runner
colewhite created T241176: Review and release service-runner 2.8.0.
Thu, Dec 19, 9:04 PM · Core Platform Team Workboards (Clinic Duty Team), service-runner
colewhite updated subscribers of T240685: MediaWiki Prometheus support.

We recently had a conversation about this.

Thu, Dec 19, 8:54 PM · Operations, MediaWiki-General, observability
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Thu, Dec 19, 7:22 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a comment to T236954: Hieradata yaml style checking.

Great idea. Let's raise it at the next SRE meeting.

Thu, Dec 19, 3:54 PM · Patch-For-Review, Puppet, Operations, User-jbond

Wed, Dec 18

colewhite added a comment to T240870: Audit the WMF LDAP group and limit its permissions.

@jcrespo that sounds bad to me. Perhaps query monitoring is a great candidate for a more specific and limited group?

Wed, Dec 18, 11:40 PM · Operations

Dec 17 2019

colewhite added a comment to T236954: Hieradata yaml style checking.

The changesets look great and appear to do the right thing.

Dec 17 2019, 11:51 PM · Patch-For-Review, Puppet, Operations, User-jbond
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 17 2019, 11:44 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 17 2019, 10:13 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a subtask for T205870: Fully migrate producers off statsd: T240995: AQS is not OpenAPI 3 compliant.
Dec 17 2019, 9:12 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a parent task for T240995: AQS is not OpenAPI 3 compliant: T205870: Fully migrate producers off statsd.
Dec 17 2019, 9:12 PM · Patch-For-Review, Analytics
colewhite created T240995: AQS is not OpenAPI 3 compliant.
Dec 17 2019, 9:08 PM · Patch-For-Review, Analytics
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 17 2019, 12:26 AM · Performance-Team (Radar), Patch-For-Review, observability, Operations

Dec 16 2019

colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 16 2019, 10:17 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 16 2019, 9:02 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 16 2019, 7:52 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite closed T238807: Clean up ORES metrics as Resolved.
Dec 16 2019, 5:06 PM · observability, Operations
colewhite triaged T240870: Audit the WMF LDAP group and limit its permissions as Low priority.
Dec 16 2019, 4:45 PM · Operations
colewhite created T240870: Audit the WMF LDAP group and limit its permissions.
Dec 16 2019, 4:45 PM · Operations

Dec 13 2019

colewhite added a subtask for T205870: Fully migrate producers off statsd: T240685: MediaWiki Prometheus support.
Dec 13 2019, 3:41 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a parent task for T240685: MediaWiki Prometheus support: T205870: Fully migrate producers off statsd.
Dec 13 2019, 3:41 PM · Operations, MediaWiki-General, observability
colewhite triaged T240685: MediaWiki Prometheus support as Medium priority.
Dec 13 2019, 3:40 PM · Operations, MediaWiki-General, observability
colewhite created T240685: MediaWiki Prometheus support.
Dec 13 2019, 3:40 PM · Operations, MediaWiki-General, observability

Dec 12 2019

colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 12 2019, 11:53 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 12 2019, 11:13 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations

Dec 11 2019

colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 11 2019, 5:04 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 11 2019, 4:49 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations

Dec 10 2019

colewhite updated the task description for T205870: Fully migrate producers off statsd.
Dec 10 2019, 9:42 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations

Dec 9 2019

colewhite added a comment to T238807: Clean up ORES metrics.

This needs to be done in codfw as well.

Dec 9 2019, 9:45 PM · observability, Operations
colewhite reopened T238807: Clean up ORES metrics as "Open".
Dec 9 2019, 9:45 PM · observability, Operations

Dec 6 2019

colewhite closed T239881: LDAP access to the wmf group for Danny Horn as Resolved.
Dec 6 2019, 11:19 PM · Operations, LDAP-Access-Requests
colewhite triaged T239993: Decom LVS recdns as Medium priority.
Dec 6 2019, 11:19 PM · Patch-For-Review, Operations, Traffic

Dec 5 2019

colewhite moved T239881: LDAP access to the wmf group for Danny Horn from Backlog to Awaiting User Input on the LDAP-Access-Requests board.
Dec 5 2019, 11:49 PM · Operations, LDAP-Access-Requests
colewhite closed T239494: Requesting access to LogStash for rxy as Resolved.
Dec 5 2019, 7:45 PM · SRE-Access-Requests, Operations
colewhite added a comment to T239494: Requesting access to LogStash for rxy.

@Rxy I've added you to the NDA group which should grant you access to Logstash. Please let me know if you encounter any related issue.

Dec 5 2019, 7:45 PM · SRE-Access-Requests, Operations
colewhite added a comment to T239881: LDAP access to the wmf group for Danny Horn.

@DannyH I've gone ahead and added you to the wmf LDAP group on the basis of your status as staff. We still need to know what you need this access for, though.

Dec 5 2019, 7:44 PM · Operations, LDAP-Access-Requests
colewhite triaged T239881: LDAP access to the wmf group for Danny Horn as Medium priority.
Dec 5 2019, 6:01 PM · Operations, LDAP-Access-Requests
colewhite triaged T239805: ms-fe2007 NIC failure as Medium priority.
Dec 5 2019, 5:59 PM · User-fgiunchedi, ops-codfw, Operations
colewhite triaged T239832: Fix installation of Puppet 5/Facter 3 on new stretch installs/reimages as Medium priority.
Dec 5 2019, 5:58 PM · Operations
colewhite triaged T239880: Replacement hardware for buster/stretch upgrade of contint1001 and contint2001 as Medium priority.
Dec 5 2019, 5:58 PM · Continuous-Integration-Infrastructure (phase-out-jessie), DC-Ops, hardware-requests, Operations
colewhite triaged T239893: BGP peering sessions with corp partially down in ulsfo as Medium priority.
Dec 5 2019, 5:58 PM · Operations, netops
colewhite triaged T239896: Facebook BGP peering links down in ulsfo as Medium priority.
Dec 5 2019, 5:55 PM · Operations, netops
colewhite triaged T239901: Disallow 'weight: 0' for MW db config in dbctl as Medium priority.
Dec 5 2019, 5:55 PM · Operations, DBA, Wikimedia-Incident
colewhite added a comment to T239874: MediaWiki: "host db1062 is unreachable" (Connection refused).

It seems clear that db1062 shouldn't be pooled anywhere. Ran the dbctl depool utility and it's gone from s7.

Dec 5 2019, 12:48 AM · DBA, Wikimedia-production-error
colewhite added a comment to T233448: Review prometheus ORES rules for completeness.

Since it's not used in dashboards, what do we do with the model? I imagine it's useful, but I'm not sure how.

Dec 5 2019, 12:17 AM · Patch-For-Review, ORES, Scoring-platform-team

Dec 4 2019

colewhite closed T239654: Requesting access to production shell for Maryum Styles, a subtask of T239300: Add Maryum to Puppet, as Resolved.
Dec 4 2019, 9:49 PM · Patch-For-Review, Operations, Discovery-Search (Current work)
colewhite closed T239654: Requesting access to production shell for Maryum Styles as Resolved.
Dec 4 2019, 9:49 PM · Discovery-Search (Current work), Operations, SRE-Access-Requests
colewhite added a comment to T239654: Requesting access to production shell for Maryum Styles.

@Mstyles is now in the wmf ldap group. Please let me know if you encounter any related issue.

Dec 4 2019, 9:49 PM · Discovery-Search (Current work), Operations, SRE-Access-Requests
colewhite triaged T239833: StatsD Exporter does not relay dropped metrics as Medium priority.
Dec 4 2019, 8:17 PM · Patch-For-Review, observability, Operations
colewhite added a parent task for T239833: StatsD Exporter does not relay dropped metrics: T233448: Review prometheus ORES rules for completeness.
Dec 4 2019, 8:17 PM · Patch-For-Review, observability, Operations
colewhite added a subtask for T233448: Review prometheus ORES rules for completeness: T239833: StatsD Exporter does not relay dropped metrics.
Dec 4 2019, 8:17 PM · Patch-For-Review, ORES, Scoring-platform-team
colewhite added a comment to T233448: Review prometheus ORES rules for completeness.

I did more research and found a usage pattern that didn't initially occur to me.

Dec 4 2019, 8:17 PM · Patch-For-Review, ORES, Scoring-platform-team
colewhite created T239833: StatsD Exporter does not relay dropped metrics.
Dec 4 2019, 4:04 PM · Patch-For-Review, observability, Operations
colewhite claimed T239654: Requesting access to production shell for Maryum Styles.
Dec 4 2019, 3:42 AM · Discovery-Search (Current work), Operations, SRE-Access-Requests
colewhite claimed T239494: Requesting access to LogStash for rxy.
Dec 4 2019, 3:41 AM · SRE-Access-Requests, Operations
colewhite triaged T239300: Add Maryum to Puppet as Medium priority.
Dec 4 2019, 3:41 AM · Patch-For-Review, Operations, Discovery-Search (Current work)
colewhite triaged T239586: Add latest jenkins debian packages to apt.wikimedia.org and upgrade jenkins to latest LTS (2.190.3) as Medium priority.
Dec 4 2019, 3:40 AM · Jenkins, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Operations
colewhite triaged T239711: Make DNS operations resilient against predictable failures as Medium priority.
Dec 4 2019, 3:39 AM · Operations, Traffic

Dec 3 2019

colewhite moved T239494: Requesting access to LogStash for rxy from Untriaged to Manager/NDA Approval/Confirmation on the SRE-Access-Requests board.
Dec 3 2019, 12:32 AM · SRE-Access-Requests, Operations
colewhite moved T239654: Requesting access to production shell for Maryum Styles from Untriaged to Awaiting User Input on the SRE-Access-Requests board.
Dec 3 2019, 12:31 AM · Discovery-Search (Current work), Operations, SRE-Access-Requests
colewhite triaged T239654: Requesting access to production shell for Maryum Styles as Medium priority.
Dec 3 2019, 12:30 AM · Discovery-Search (Current work), Operations, SRE-Access-Requests
colewhite added a comment to T239654: Requesting access to production shell for Maryum Styles.

Hi Maryum!

Dec 3 2019, 12:30 AM · Discovery-Search (Current work), Operations, SRE-Access-Requests
colewhite updated the task description for T239654: Requesting access to production shell for Maryum Styles.
Dec 3 2019, 12:28 AM · Discovery-Search (Current work), Operations, SRE-Access-Requests
colewhite closed T234429: Requesting access to view EventLogging data for Co_WMDE as Resolved.
Dec 3 2019, 12:20 AM · WMF-Legal, Operations, SRE-Access-Requests
colewhite added a comment to T234429: Requesting access to view EventLogging data for Co_WMDE.

The necessary changes have been deployed. Please let me know if you encounter any related issue.

Dec 3 2019, 12:19 AM · WMF-Legal, Operations, SRE-Access-Requests

Dec 2 2019

colewhite added a comment to T215904: Better understanding of Logstash performance.

After this, Logstash seems to be processing 3x more logs at peak.

Dec 2 2019, 8:52 PM · User-fgiunchedi, observability, Wikimedia-Logstash
colewhite reopened T234429: Requesting access to view EventLogging data for Co_WMDE as "Open".
Dec 2 2019, 8:39 PM · WMF-Legal, Operations, SRE-Access-Requests
colewhite updated subscribers of T234429: Requesting access to view EventLogging data for Co_WMDE.

Hi Corinna!

Dec 2 2019, 8:39 PM · WMF-Legal, Operations, SRE-Access-Requests

Nov 26 2019

colewhite added a comment to T215904: Better understanding of Logstash performance.

I gathered some data and graphs and now I don't think the GC is the issue.

Nov 26 2019, 8:04 PM · User-fgiunchedi, observability, Wikimedia-Logstash

Nov 25 2019

colewhite added a parent task for T236505: Monitor mailman outbound mail queue: T230030: Prometheus failing to ingest some mtail samples.
Nov 25 2019, 4:08 PM · observability, Operations
colewhite added a subtask for T230030: Prometheus failing to ingest some mtail samples: T236505: Monitor mailman outbound mail queue.
Nov 25 2019, 4:08 PM · observability
fgiunchedi awarded T238807: Clean up ORES metrics a Like token.
Nov 25 2019, 10:25 AM · observability, Operations

Nov 22 2019

colewhite closed T238807: Clean up ORES metrics as Resolved.
Nov 22 2019, 6:00 PM · observability, Operations
colewhite reopened T238807: Clean up ORES metrics as "Open".
Nov 22 2019, 5:08 PM · observability, Operations
colewhite closed T238807: Clean up ORES metrics as Resolved.
Nov 22 2019, 4:53 PM · observability, Operations
colewhite added a comment to T238807: Clean up ORES metrics.

Initial clean up is done. Last thing is to clean the tombstones.

Nov 22 2019, 6:55 AM · observability, Operations
colewhite claimed T238807: Clean up ORES metrics.
Nov 22 2019, 3:43 AM · observability, Operations

Nov 21 2019

colewhite updated the task description for T205870: Fully migrate producers off statsd.
Nov 21 2019, 12:37 AM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite updated subscribers of T238807: Clean up ORES metrics.

@fgiunchedi, would you mind having a quick look at P9701? I'd like to run it on production.

Nov 21 2019, 12:33 AM · observability, Operations
colewhite created P9701 cleanup_ores_metrics.sh.
Nov 21 2019, 12:31 AM · Operations, observability
colewhite triaged T238807: Clean up ORES metrics as Low priority.
Nov 21 2019, 12:30 AM · observability, Operations
colewhite created T238807: Clean up ORES metrics.
Nov 21 2019, 12:30 AM · observability, Operations

Nov 14 2019

colewhite added a comment to T230030: Prometheus failing to ingest some mtail samples.

I'm seeing this same issue with the new mailman metrics.

Nov 14 2019, 7:55 PM · observability

Nov 8 2019

colewhite created T237706: Phatality deployments invoke oom-killer on logstash::collector nodes.
Nov 8 2019, 1:27 AM · Operations, observability

Nov 7 2019

colewhite added a comment to T234565: Standardize the logging format.

@Krinkle T180051 IMHO implies a different solution. That task, as well as speeding up Kibana, would be accomplished with the work intended here. The last comment from @Eevans lines up with the intent of this task.

Nov 7 2019, 11:39 PM · Wikimedia-Logstash, observability, Operations

Nov 6 2019

colewhite renamed T205870: Fully migrate producers off statsd from Fully migrate >= 30% of producers off statsd to Fully migrate producers off statsd.
Nov 6 2019, 4:37 PM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a subtask for T205870: Fully migrate producers off statsd: T233448: Review prometheus ORES rules for completeness.
Nov 6 2019, 12:15 AM · Performance-Team (Radar), Patch-For-Review, observability, Operations
colewhite added a parent task for T233448: Review prometheus ORES rules for completeness: T205870: Fully migrate producers off statsd.
Nov 6 2019, 12:15 AM · Patch-For-Review, ORES, Scoring-platform-team
colewhite added a comment to T233448: Review prometheus ORES rules for completeness.

If the statsd-exporter sidecar approach is appropriate for ORES, there are quite a few metrics with unclear type and meaning. I've constructed a tree to assist us in defining them.

Nov 6 2019, 12:07 AM · Patch-For-Review, ORES, Scoring-platform-team

Oct 30 2019

colewhite triaged T236954: Hieradata yaml style checking as Low priority.
Oct 30 2019, 8:31 PM · Patch-For-Review, Puppet, Operations, User-jbond