lmata (Leo Mata)
Manager SRE Observability

User Details

User Since
May 14 2020, 7:26 PM (23 w, 3 d)
Availability
Available
IRC Nick
lmata
LDAP User
LMata
MediaWiki User
LMata (WMF)

Recent Activity

Tue, Oct 20

lmata updated lmata.
Tue, Oct 20, 4:16 AM
lmata updated lmata.
Tue, Oct 20, 4:16 AM

Mon, Oct 19

lmata moved T263103: Compress graphite carbon-cache log files from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:36 PM · Patch-For-Review, observability
lmata moved T263027: Missing 'notify' for some Icinga configuration files from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:35 PM · Operations, observability
lmata added a comment to T184086: Add prometheus exporter to Gerrit.

Moving to Radar, but this will probably be closed eventually as the GitLab move progresses.

Mon, Oct 19, 3:34 PM · observability, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Gerrit, Operations
lmata moved T184086: Add prometheus exporter to Gerrit from Inbox to Radar on the observability board.
Mon, Oct 19, 3:33 PM · observability, Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, Gerrit, Operations
lmata moved T182759: Add Prometheus exporter to Jenkins instances from Inbox to Radar on the observability board.
Mon, Oct 19, 3:32 PM · observability, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, Continuous-Integration-Infrastructure, User-fgiunchedi, Goal, Operations
lmata moved T209709: Feature: enable prometheus-nginx-exporter for nginx metrics from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:32 PM · observability
lmata moved T263624: Grafana error: "parse error at char 1: unexpected character: '\ufeff'" when copy-pasting metric names from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:31 PM · Operations, observability
lmata moved T263720: Notification spam from "last puppet run" upon re-enabling puppet from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:30 PM · Operations, Puppet, observability
lmata moved T148976: Strongswan Icinga check: do not report issues about depooled hosts from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:28 PM · Patch-For-Review, serviceops, observability, Operations
lmata moved T264016: Host page did not auto-resolve in VO from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:27 PM · observability
lmata moved T264300: active/active links monitoring from Inbox to Backlog on the observability board.
Mon, Oct 19, 3:27 PM · observability, Operations, netops
lmata moved T265632: Alert on Thanos sidecar not uploading blocks from Inbox to In progress on the observability board.
Mon, Oct 19, 3:21 PM · observability
lmata updated subscribers of T265649: PuppetDB grafana graphs not matching logs.
Mon, Oct 19, 3:20 PM · User-fgiunchedi, User-jbond, Operations, observability, Puppet
lmata moved T265712: Sync users and permissions from LDAP to Grafana from Inbox to In progress on the observability board.
Mon, Oct 19, 3:18 PM · User-fgiunchedi, observability, Operations
lmata added a comment to T97861: Provide centralized logging (logstash) for Toolforge.

Hello, is there something for us (o11y) here, or should we just stay in the loop for potential collaboration? Subscribing and moving to Radar for now.

Mon, Oct 19, 3:13 PM · observability, cloud-services-team (Kanban), Toolforge
lmata moved T97861: Provide centralized logging (logstash) for Toolforge from Inbox to Radar on the observability board.
Mon, Oct 19, 3:13 PM · observability, cloud-services-team (Kanban), Toolforge
lmata moved T97861: Provide centralized logging (logstash) for Toolforge from Radar to Inbox on the observability board.
Mon, Oct 19, 3:12 PM · observability, cloud-services-team (Kanban), Toolforge
lmata moved T97861: Provide centralized logging (logstash) for Toolforge from Inbox to Radar on the observability board.
Mon, Oct 19, 3:11 PM · observability, cloud-services-team (Kanban), Toolforge
lmata moved T203191: prometheus-varnish-exporter@frontend.service: Unit entered failed state - invalid character 'C' from Inbox to Radar on the observability board.
Mon, Oct 19, 3:11 PM · observability, Traffic, Operations

Tue, Oct 13

mmodell awarded T264667: Define distributed RPC/Request TRACING strategy and tooling recommendation a Love token.
Tue, Oct 13, 3:46 PM · observability

Mon, Oct 5

lmata renamed T264667: Define distributed RPC/Request TRACING strategy and tooling recommendation from Define & implement TRACING strategy and tooling to Define TRACING strategy and tooling recommendation.
Mon, Oct 5, 9:04 PM · observability
lmata triaged T264667: Define distributed RPC/Request TRACING strategy and tooling recommendation as Medium priority.
Mon, Oct 5, 9:04 PM · observability
lmata created T264667: Define distributed RPC/Request TRACING strategy and tooling recommendation.
Mon, Oct 5, 8:46 PM · observability
lmata added a comment to T706: Requests for addition to the #acl*Project-Admins group (in comments).

I do not see myself on this list for acl*Project-Admins.
Is there anything additional I should do on my end? Maybe I have missed some steps. Please advise, thanks!

Mon, Oct 5, 4:51 AM · Project-Admins

Wed, Sep 30

lmata moved T263747: Upgrade Grafana to 7.2 from Inbox to Backlog on the observability board.
Wed, Sep 30, 12:04 AM · observability

Sep 24 2020

lmata added a comment to T706: Requests for addition to the #acl*Project-Admins group (in comments).

Thanks! I'll be careful :-)

Sep 24 2020, 5:06 PM · Project-Admins
lmata added a comment to T706: Requests for addition to the #acl*Project-Admins group (in comments).

Hi, I need to be able to add milestones and tags for the SRE Observability work streams on the observability workboard and project.

Sep 24 2020, 5:02 PM · Project-Admins
lmata closed T263771: request for project creation rights for user lmata as Invalid.

Noted, closing this one and moving to T706.

Sep 24 2020, 5:01 PM · Phabricator
lmata created T263771: request for project creation rights for user lmata.
Sep 24 2020, 4:42 PM · Phabricator

Sep 21 2020

lmata moved T175636: prometheus -> grafana stats for per-numa-node meminfo from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:29 PM · Patch-For-Review, observability, Traffic, Operations
lmata moved T169860: Investigate/setup prometheus blackbox_exporter from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:29 PM · observability, User-fgiunchedi, Patch-For-Review, Prometheus-metrics-monitoring
lmata moved T219919: Move citoid logging to new logging pipeline from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:29 PM · observability, Citoid, Platform Team Legacy (Watching / External), Services (watching), service-runner, Wikimedia-Logstash, Operations
lmata moved T222377: Move kartotherian/tilerator logging to new logging pipeline from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · Product-Infrastructure-Team-Backlog (Kanban), observability, Platform Team Legacy (Watching / External), Services (watching), Maps, service-runner, Wikimedia-Logstash, Operations
lmata moved T192948: Upgrade prometheus-jmx-exporter on all services using it from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · Analytics-Radar, Platform Team Legacy (Watching / External), User-Elukey, observability, Puppet, Services (watching), Cassandra
lmata moved T245603: Move termbox to the logging pipeline from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · Wikidata-Termbox, observability, Wikimedia-Logstash
lmata moved T245604: Move wikifeeds to the logging pipeline from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · Wikifeeds, observability, Wikimedia-Logstash, Operations
lmata moved T233089: Export zuul metrics to Prometheus from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · Patch-For-Review, Continuous-Integration-Infrastructure, Release-Engineering-Team (CI & Testing services), Release-Engineering-Team-TODO, observability, Operations
lmata moved T140282: Create a grafana dashboard for logstash*.eqiad.wmnet based on search dashboards from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · observability, Wikimedia-Logstash
lmata moved T240685: MediaWiki Prometheus support from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · Patch-For-Review, serviceops, Operations, MediaWiki-General, observability
lmata moved T255124: Invalid apache configuration on profile::prometheus::ops hosts from Externally blocked to Radar on the observability board.
Sep 21 2020, 8:28 PM · User-jbond, Patch-For-Review, Operations, observability
lmata moved T210137: Handle unknown stats in rsyslog_exporter from Up next to Backlog on the observability board.
Sep 21 2020, 8:27 PM · User-jijiki, serviceops, observability
lmata moved T207292: Review prometheus_nodes params from Up next to Backlog on the observability board.
Sep 21 2020, 8:27 PM · User-fgiunchedi, observability, Operations
lmata moved T141324: Look into shoving gerrit logs into logstash from Up next to Backlog on the observability board.
Sep 21 2020, 8:27 PM · Release-Engineering-Team (Development services), Release-Engineering-Team-TODO, observability, Technical-Debt, Wikimedia-Logstash, Gerrit
lmata moved T166107: Cleanup old logstash logs (application and JVM GC) from Up next to Backlog on the observability board.
Sep 21 2020, 8:27 PM · observability, Wikimedia-Logstash
lmata moved T180051: Reduce the number of fields declared in elasticsearch by logstash from Up next to Backlog on the observability board.
Sep 21 2020, 8:27 PM · Patch-For-Review, observability, Platform Team Legacy (Watching / External), Services (watching), Operations, Wikimedia-Logstash
lmata moved T189333: Changing Kibana filters is ridiculously slow from Up next to Backlog on the observability board.
Sep 21 2020, 8:27 PM · Developer Productivity, User-fgiunchedi, observability, Traffic, Operations, User-Addshore, Wikimedia-Logstash
lmata moved T176335: logs sent to logstash are lost when the elasticsearch cirrus cluster is unavailable from Up next to Backlog on the observability board.
Sep 21 2020, 8:27 PM · observability, Platform Team Legacy (Watching / External), Services (watching), Operations, Elasticsearch, Wikimedia-Logstash
lmata moved T247962: Migrate role::prometheus to Buster from Up next to Backlog on the observability board.
Sep 21 2020, 8:26 PM · observability
lmata moved T256418: Evaluate alternative to Logstash StatsD outputs from Up next to Backlog on the observability board.
Sep 21 2020, 8:26 PM · Patch-For-Review, Wikimedia-Logstash, observability
lmata moved T256954: Port Prometheus dashboards to Thanos from Up next to Backlog on the observability board.
Sep 21 2020, 8:26 PM · User-fgiunchedi, observability, Operations
lmata moved T251644: Icinga refresh hardware selection (2020) from Up next to Backlog on the observability board.
Sep 21 2020, 8:26 PM · observability, Operations
lmata moved T257024: Buster elasticsearch-curator version not compatible with ELK7 from Up next to Backlog on the observability board.
Sep 21 2020, 8:26 PM · observability, Operations, Wikimedia-Logstash
lmata moved T261225: Set strict CSP rule on Kibana logstash.wikimedia.org from Up next to Backlog on the observability board.
Sep 21 2020, 8:26 PM · Security, observability, Wikimedia-Logstash
lmata moved T261281: Improve performance of Thanos (+ Prometheus) from Up next to Backlog on the observability board.
Sep 21 2020, 8:26 PM · Patch-For-Review, User-fgiunchedi, observability
lmata added a member for observability: colewhite.
Sep 21 2020, 2:25 PM

Sep 14 2020

lmata moved T262429: illegal_argument_exception from Inbox to Radar on the observability board.
Sep 14 2020, 3:24 PM · Push-Notification-Service, Operations, serviceops, observability, Product-Infrastructure-Team-Backlog
lmata moved T229397: Puppet: get row/rack info from Netbox from Inbox to Radar on the observability board.
Sep 14 2020, 3:23 PM · observability, User-crusnov, User-jbond, Patch-For-Review, Puppet, Operations
lmata added a comment to T262579: Prometheus/MariaDB counts a 'SELECT ... FOR UPDATE' query as an UPDATE query.

Is there any specific action you'd like us to take regarding the exporter?

Sep 14 2020, 3:22 PM · observability, DBA, Operations
lmata moved T262626: Remove http.client_ip from EventGate default schema (again) from Inbox to Radar on the observability board.
Sep 14 2020, 3:20 PM · Analytics-Kanban, Product-Analytics, Patch-For-Review, Product-Infrastructure-Data, observability, Privacy Engineering, Analytics, Event-Platform
lmata added a comment to T262675: Store Kubernetes events for more than one hour.

Hello @JMeybohm, do you have some guidance on the priority of this task? Is it something needed in the next few weeks, or is it more of a nice-to-have? We have some thoughts and would like to accommodate this request in our planning. Also, let us know how you'd like us to support you on this task.

Sep 14 2020, 3:18 PM · Patch-For-Review, observability, Prod-Kubernetes, Kubernetes, serviceops

Sep 9 2020

lmata moved T262202: Create a separate 'mwdebug' cluster from Inbox to Radar on the observability board.

Sounds good, will move this to Radar. Let me know when/if we can be of assistance :-)

Sep 9 2020, 3:29 PM · Patch-For-Review, Analytics-Radar, Release-Engineering-Team, observability, serviceops, User-jijiki

Sep 8 2020

lmata added a comment to T262202: Create a separate 'mwdebug' cluster.

@jijiki should we add this to our backlog or is this tagged mainly for our viewing benefit? A quick team conversation has determined that we too agree this is a good thing. Let me know if/how we can assist.

Sep 8 2020, 3:23 PM · Patch-For-Review, Analytics-Radar, Release-Engineering-Team, observability, serviceops, User-jijiki
lmata closed T262170: Grafana has failed to load its application files as Declined.

Closing for now, please reopen if this happens again. Thank you!

Sep 8 2020, 3:19 PM · observability

Aug 27 2020

lmata removed a project from T161528: incident 20170323-wikibase did not trigger Icinga paging: observability.

Untagging the Observability project for now as there doesn't seem to be an action item for the team. Please add back if there is anything we missed.

Aug 27 2020, 2:33 PM · serviceops, Icinga, Operations
lmata moved T168403: Aggregate prometheus functions yielding different results in grafana vs. prometheus console from Inbox to Backlog on the observability board.
Aug 27 2020, 2:29 PM · Prometheus-metrics-monitoring, observability, Operations
lmata moved T184714: Puppet fail to properly refresh Icinga from Inbox to Backlog on the observability board.
Aug 27 2020, 2:26 PM · observability, Operations
lmata moved T253733: Icinga stopped sending emails from Inbox to Backlog on the observability board.
Aug 27 2020, 2:26 PM · Icinga, observability
lmata moved T259711: Determine requirements for logging native app client errors to Logstash from Inbox to Radar on the observability board.
Aug 27 2020, 2:25 PM · Product-Infrastructure-Data, observability, Better Use Of Data
lmata moved T259020: varnishmtail silently stops working if varnishncsa crashes from Inbox to Backlog on the observability board.
Aug 27 2020, 2:23 PM · observability, Patch-For-Review, Operations, Traffic
lmata moved T260086: Performance team services health dashboard from Inbox to Radar on the observability board.
Aug 27 2020, 2:22 PM · observability, Performance-Team
lmata moved T260154: De-noise "Ensure local MW versions match expected deployment" alerts from Inbox to Backlog on the observability board.
Aug 27 2020, 2:21 PM · observability
lmata moved T260240: UNIX group 'bird' missing on bird package installation from Inbox to Backlog on the observability board.
Aug 27 2020, 2:21 PM · observability, Cloud-VPS, Operations
lmata moved T260521: icinga-exporter failing on alert hosts from Inbox to In progress on the observability board.
Aug 27 2020, 2:20 PM · Patch-For-Review, observability
lmata moved T260533: Add alert[12]001 to network ACLs from Inbox to Radar on the observability board.
Aug 27 2020, 2:19 PM · fundraising-tech-ops, Operations, netops, observability
lmata moved T260686: check_mariadb_dump failing on alert[12]* hosts from Inbox to Radar on the observability board.
Aug 27 2020, 2:16 PM · DBA, observability
lmata moved T261225: Set strict CSP rule on Kibana logstash.wikimedia.org from Inbox to Up next on the observability board.
Aug 27 2020, 2:15 PM · Security, observability, Wikimedia-Logstash
lmata moved T261274: Figure out switchover steps for mwlog hosts from Inbox to Backlog on the observability board.
Aug 27 2020, 2:15 PM · serviceops, observability, Operations
lmata moved T261281: Improve performance of Thanos (+ Prometheus) from Inbox to Up next on the observability board.
Aug 27 2020, 2:13 PM · Patch-For-Review, User-fgiunchedi, observability
lmata moved T261342: ensure alert[12]001 are prepared for meta monitoring from Inbox to In progress on the observability board.
Aug 27 2020, 2:12 PM · observability

Aug 10 2020

lmata moved T259780: rsyslog occasional segfault on centrallog hosts from Inbox to Backlog on the observability board.
Aug 10 2020, 3:36 PM · User-fgiunchedi, observability, Operations
lmata moved T260053: Reduce Prometheus retention/coverage once Thanos has collected enough data from Inbox to Backlog on the observability board.
Aug 10 2020, 3:35 PM · observability

Jul 6 2020

lmata added a watcher for Operations: lmata.
Jul 6 2020, 4:36 PM
lmata added a member for Operations: lmata.
Jul 6 2020, 4:34 PM
lmata added a member for observability: lmata.
Jul 6 2020, 4:34 PM

Jul 2 2020

lmata added a watcher for ORES: lmata.
Jul 2 2020, 3:40 PM
lmata added a watcher for observability: lmata.
Jul 2 2020, 3:40 PM

Jun 22 2020

lmata added a comment to T254818: Requesting access to PROD for lmata (SRE).

Thank you!

Jun 22 2020, 12:34 PM · Operations, SRE-Access-Requests

Jun 9 2020

lmata added a comment to T254818: Requesting access to PROD for lmata (SRE).

@jbond, will generate that new key today, thank you.

Jun 9 2020, 1:00 PM · Operations, SRE-Access-Requests

Jun 8 2020

lmata created T254818: Requesting access to PROD for lmata (SRE).
Jun 8 2020, 8:16 PM · Operations, SRE-Access-Requests

May 21 2020

lmata added a comment to T253277: Add LMata to wmf ldap group.

Thank you!

May 21 2020, 2:38 PM · Operations, SRE-Access-Requests, LDAP-Access-Requests
lmata updated the task description for T253277: Add LMata to wmf ldap group.
May 21 2020, 1:46 AM · Operations, SRE-Access-Requests, LDAP-Access-Requests
lmata created T253277: Add LMata to wmf ldap group.
May 21 2020, 1:45 AM · Operations, SRE-Access-Requests, LDAP-Access-Requests