
RLazarus (Reuven Lazarus) (rzl)
User

User Details

User Since
Oct 15 2019, 4:02 PM (239 w, 6 d)
Availability
Available
IRC Nick
rzl
LDAP User
RLazarus
MediaWiki User
RLazarus (WMF)

Recent Activity

Wed, May 1

RLazarus awarded T358636: etcdmirror does not recover from a cleared waitIndex a Barnstar token.
Wed, May 1, 4:12 PM · serviceops

Wed, Apr 24

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Thanks. At present the controller monitors all namespaces, but ignores pods other than those in mw-script. So if I were estimating memory usage I'd base it on the total number of pod events in the cluster, not just in the namespace.

Wed, Apr 24, 5:42 PM · MW-on-K8s, serviceops

Apr 17 2024

RLazarus added a comment to T362717: scap should optionally display helmfile diffs for review.

That sounds reasonable! Note for the future that helm diff has a --suppress-output-line-regex which does exactly what you'd like it to do, but it's not available in the version we're currently running.

Apr 17 2024, 10:42 PM · serviceops, Release-Engineering-Team, Scap
RLazarus closed T57857: Unit tests for apache config/rewrites as Resolved.
Apr 17 2024, 7:54 PM · Wikimedia-Apache-configuration

Apr 16 2024

RLazarus added a comment to T362717: scap should optionally display helmfile diffs for review.

If we're really worried about that race condition, is it plausible to do this?

Apr 16 2024, 9:56 PM · serviceops, Release-Engineering-Team, Scap

Apr 4 2024

RLazarus triaged T361860: Old "Email this user" email is repeatedly resent as High priority.

Clinic duty SRE here -- I/F, can you start investigating this at the MTA end? Triaging this to High in case it's widespread, but feel free to decrease if it turns out it's not.

Apr 4 2024, 6:19 PM · Mail, Infrastructure-Foundations, MediaWiki-Email, SRE
RLazarus closed T361798: Grant Access to <LDAP/wmf> for <ospingou> as Resolved.
rzl@mwmaint1002:~$ ldapsearch -x cn=wmf | grep ospingou
member: uid=ospingou,ou=people,dc=wikimedia,dc=org
Apr 4 2024, 5:27 PM · Patch-For-Review, SRE, LDAP-Access-Requests
RLazarus added a member for WMF-NDA: Ospingou.
Apr 4 2024, 5:19 PM
RLazarus claimed T361742: Requesting access to shell access to analytics client servers for AndyRussG.
Apr 4 2024, 1:12 AM · Patch-For-Review, SRE, SRE-Access-Requests
RLazarus updated the task description for T361742: Requesting access to shell access to analytics client servers for AndyRussG.
Apr 4 2024, 1:12 AM · Patch-For-Review, SRE, SRE-Access-Requests

Apr 3 2024

RLazarus claimed T361665: Grant Access to wmf for AndyRussG.
Apr 3 2024, 6:15 PM · Patch-For-Review, SRE, LDAP-Access-Requests
RLazarus updated subscribers of T361665: Grant Access to wmf for AndyRussG.

@AndyRussG Welcome back!

Apr 3 2024, 6:15 PM · Patch-For-Review, SRE, LDAP-Access-Requests

Apr 1 2024

RLazarus closed T361420: [CVE-2024-3094] SSH backdoor vulnerability in liblzma in Debian Sid as Resolved.

Yep, we're following closely -- but we don't use Debian unstable, so we're not directly affected. Thanks for checking!

Apr 1 2024, 8:06 PM · Vuln-VulnComponent, SecTeam-Processed, SRE, Security
RLazarus updated subscribers of T361266: Offboard Michael Grosse (WMDE) from WMF systems.

Clinic duty SRE here, thanks @karapayneWMDE for the ticket. I merged https://gerrit.wikimedia.org/r/1015995 (thanks @Dzahn!) and followed up with

Apr 1 2024, 7:34 PM · SRE, LDAP-Access-Requests
RLazarus updated the task description for T361266: Offboard Michael Grosse (WMDE) from WMF systems.
Apr 1 2024, 7:31 PM · SRE, LDAP-Access-Requests

Mar 27 2024

RLazarus added a comment to T360867: httpbb appserver test breaks deployment of the week due to a timeout parsing page.

The reason we're catching it now, and not with the regular httpbb checks being done on the cumin nodes, is that the scap httpbb checks are being run without --retry_on_timeout.

Mar 27 2024, 3:35 PM · Patch-For-Review, serviceops, Release-Engineering-Team, Deployments

Mar 22 2024

RLazarus added a comment to T360652: imagecatalog_record.service fails due to read-only sqlite database.

Curious: As @Clement_Goubert and I discussed, both the directory (via puppet file) and the database file (via puppet exec of imagecatalog init) have the right user, imagecatalog. There's nothing in Puppet (like a recurse) to ensure ownership on the database file, but it still ought to come out correct, as far as I can tell. Claime reports the file was owned by mwbuilder, which runs the release tools, but I think imagecatalog init should still have run first as the imagecatalog user.

Mar 22 2024, 12:36 AM · Datacenter-Switchover, serviceops

Mar 11 2024

RLazarus added a comment to T357547: ☂️ Northward Datacentre Switchover (March 2024).

FYI: I redid a dry run and live test for 01-stop-maintenance.py after https://gerrit.wikimedia.org/r/1008583 and it's good to go.

Mar 11 2024, 9:40 PM · Patch-For-Review, Datacenter-Switchover, Data-Persistence, SRE Observability (FY2023/2024-Q3), collaboration-services, observability, serviceops, DC-Ops, Traffic
RLazarus removed a subtask for T357547: ☂️ Northward Datacentre Switchover (March 2024): T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s.
Mar 11 2024, 9:36 PM · Patch-For-Review, Datacenter-Switchover, Data-Persistence, SRE Observability (FY2023/2024-Q3), collaboration-services, observability, serviceops, DC-Ops, Traffic
RLazarus removed a parent task for T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s: T357547: ☂️ Northward Datacentre Switchover (March 2024).
Mar 11 2024, 9:36 PM · Datacenter-Switchover, serviceops, MW-on-K8s
RLazarus added a comment to T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s.

This is good to go for the March 2024 switchover, so removing it as a subtask.

Mar 11 2024, 9:35 PM · Datacenter-Switchover, serviceops, MW-on-K8s
RLazarus updated the task description for T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s.
Mar 11 2024, 9:35 PM · Datacenter-Switchover, serviceops, MW-on-K8s

Mar 7 2024

RLazarus added a comment to T359597: db2124 depooled with index corruption.

I don't see anything obviously hardware-broken in logs. I notice it was just repooled yesterday after maintenance for T352010, but nothing jumps out as an obvious cause. Over to the DBAs from here, enjoy. :)

Mar 7 2024, 10:34 PM · DBA
RLazarus triaged T359597: db2124 depooled with index corruption as High priority.
Mar 7 2024, 10:21 PM · DBA
RLazarus created T359597: db2124 depooled with index corruption.
Mar 7 2024, 10:15 PM · DBA
RLazarus edited projects for T359583: Provide a way to get sampled POST body logs, added: Sustainability (Incident Followup); removed Sustainability.
Mar 7 2024, 8:17 PM · MW-Interfaces-Team, Sustainability (Incident Followup), Observability-Logging
RLazarus added projects to T359583: Provide a way to get sampled POST body logs: Sustainability, MediaWiki-Engineering, Observability-Logging.
Mar 7 2024, 8:17 PM · MW-Interfaces-Team, Sustainability (Incident Followup), Observability-Logging

Mar 5 2024

RLazarus added a comment to T359129: Requesting GitLab account activation for jasmine_.

Thanks @bd808!

Mar 5 2024, 4:34 PM · User-bd808, GitLab (Account Approval), Release-Engineering-Team
RLazarus added a comment to T359129: Requesting GitLab account activation for jasmine_.

That sounds like a perfect solution, except that I'm not in Trusted-Contributors. Which is admittedly pretty funny.

Mar 5 2024, 4:28 PM · User-bd808, GitLab (Account Approval), Release-Engineering-Team
RLazarus added a comment to T359129: Requesting GitLab account activation for jasmine_.

Confirming @jasmine_ is an intern on my team. If she needs any vouching, please consider her vouched! :)

Mar 5 2024, 1:49 AM · User-bd808, GitLab (Account Approval), Release-Engineering-Team
RLazarus renamed T359129: Requesting GitLab account activation for jasmine_ from Requesting GitLab account activation for USER[S] to Requesting GitLab account activation for jasmine_.
Mar 5 2024, 1:49 AM · User-bd808, GitLab (Account Approval), Release-Engineering-Team
RLazarus added a subtask for T357547: ☂️ Northward Datacentre Switchover (March 2024): T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s.
Mar 5 2024, 1:46 AM · Patch-For-Review, Datacenter-Switchover, Data-Persistence, SRE Observability (FY2023/2024-Q3), collaboration-services, observability, serviceops, DC-Ops, Traffic
RLazarus added a parent task for T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s: T357547: ☂️ Northward Datacentre Switchover (March 2024).
Mar 5 2024, 1:46 AM · Datacenter-Switchover, serviceops, MW-on-K8s
RLazarus created T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s.
Mar 5 2024, 1:46 AM · Datacenter-Switchover, serviceops, MW-on-K8s
RLazarus created T359127: MW image version for maintenance scripts.
Mar 5 2024, 12:58 AM · MW-on-K8s, serviceops

Mar 4 2024

RLazarus closed T358361: Update privileges or remove os-installers group as Resolved.
Mar 4 2024, 8:15 PM · Infrastructure-Foundations
RLazarus renamed T358936: Kubernetes apiserver probe failures on restart from Kubernetes apiserver probe failures on resatrt to Kubernetes apiserver probe failures on restart.
Mar 4 2024, 5:08 PM · Prod-Kubernetes, serviceops, SRE

Mar 2 2024

RLazarus triaged T358936: Kubernetes apiserver probe failures on restart as Medium priority.
Mar 2 2024, 12:25 AM · Prod-Kubernetes, serviceops, SRE

Mar 1 2024

RLazarus added a comment to T358825: Fix requestctl naming collision on "sites".

Will do, thanks for the pointer.

Mar 1 2024, 3:52 PM · Traffic, conftool
RLazarus triaged T358825: Fix requestctl naming collision on "sites" as Medium priority.
Mar 1 2024, 12:46 AM · Traffic, conftool

Feb 29 2024

RLazarus claimed T357595: Investigate restricting match pattern on /wiki RewriteRule.
Feb 29 2024, 8:31 PM · Patch-For-Review, Wikimedia-Apache-configuration, serviceops

Feb 28 2024

RLazarus reopened T358361: Update privileges or remove os-installers group as "In Progress".
Feb 28 2024, 6:42 PM · Infrastructure-Foundations
RLazarus closed T358361: Update privileges or remove os-installers group as Resolved.
Feb 28 2024, 4:45 PM · Infrastructure-Foundations

Feb 23 2024

RLazarus triaged T358361: Update privileges or remove os-installers group as Low priority.
Feb 23 2024, 5:11 PM · Infrastructure-Foundations

Feb 22 2024

RLazarus closed T358020: Not receiving posts or moderation messages as Resolved.

Restarted mailman3 at 00:43, icinga alerts are cleared, and the graph in T358020#9565952 is trending down again. Thanks @JJMC89 for the report and thanks @Legoktm for the ping.

Feb 22 2024, 12:48 AM · Wikimedia-Incident, SRE, Wikimedia-Mailing-lists

Feb 21 2024

bd808 awarded T345868: Rename the shellbox service to shellbox-score a Like token.
Feb 21 2024, 9:57 PM · Shellbox, serviceops

Feb 14 2024

RLazarus added a comment to T357595: Investigate restricting match pattern on /wiki RewriteRule.

the intention was probably for this to match something a bit more restrictive (e.g., matching ^/wiki(/.*)?$)

Feb 14 2024, 11:56 PM · Patch-For-Review, Wikimedia-Apache-configuration, serviceops
RLazarus edited projects for T357436: Request donatewiki redirect, added: Wikimedia-Apache-configuration; removed Domains.
Feb 14 2024, 7:32 PM · fundraising-tech-ops, Wikimedia-Apache-configuration, serviceops, Fundraising-Backlog, SRE
RLazarus added a comment to T357436: Request donatewiki redirect.

Sorry yeah, I was using the term broadly. The goal is to edit the Apache config, but that hieradata file is how you'd do it. :)

Feb 14 2024, 7:12 PM · fundraising-tech-ops, Wikimedia-Apache-configuration, serviceops, Fundraising-Backlog, SRE

Feb 13 2024

RLazarus added a project to T357436: Request donatewiki redirect: serviceops.

Hi from Service Ops SRE!

Feb 13 2024, 7:38 PM · fundraising-tech-ops, Wikimedia-Apache-configuration, serviceops, Fundraising-Backlog, SRE

Feb 12 2024

RLazarus added a comment to T341553: Allow running one-off scripts manually.

Surfacing @JMeybohm's reasonable concern from https://gerrit.wikimedia.org/r/c/988851/comments/3827b6cd_15427748:

Feb 12 2024, 1:35 AM · MW-on-K8s, serviceops

Jan 25 2024

RLazarus awarded T355912: Grant Access to ops for swfrench a Party Time token.
Jan 25 2024, 11:31 PM · SRE, LDAP-Access-Requests
RLazarus updated subscribers of T355606: Requesting analytics-privatedata-users access for amastilovic.

This week's clinic duty SRE is @Arnoldokoth.

Jan 25 2024, 6:49 PM · Patch-For-Review, SRE, SRE-Access-Requests

Jan 24 2024

RLazarus updated the task description for T355834: Requesting access to (general SRE production SSH access) for swfrench.
Jan 24 2024, 11:43 PM · SRE, SRE-Access-Requests

Jan 23 2024

RLazarus added a member for acl*sre-team: Scott_French.
Jan 23 2024, 12:30 AM
RLazarus added a member for WMF-NDA: Scott_French.
Jan 23 2024, 12:19 AM

Jan 9 2024

Ladsgroup awarded T299989: Pairing tool for new SREs using sudo under supervision a Love token.
Jan 9 2024, 4:51 PM · User-MoritzMuehlenhoff, SRE-tools, Infrastructure-Foundations, SRE
RLazarus updated subscribers of T341553: Allow running one-off scripts manually.

@Joe @JMeybohm That's a lot of code review at once, across two tasks -- I posted it all for context, but no expectation you'll have time to look at all of it immediately. (It's all tested together from my homedir on deploy2002, and works.) Here's the suggested reviewing order:

Jan 9 2024, 7:42 AM · MW-on-K8s, serviceops

Dec 19 2023

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Okay, let me know if https://gerrit.wikimedia.org/r/983963 plus the most recent iteration of https://gitlab.wikimedia.org/repos/sre/k8s-controller-sidecars/-/merge_requests/1 is what you had in mind...

Dec 19 2023, 3:28 AM · MW-on-K8s, serviceops

Dec 18 2023

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Oh, I misunderstood what you meant by "enable the controller on a per namespace level" above! I thought deploying one instance per namespace was what you had in mind.

Dec 18 2023, 4:58 PM · MW-on-K8s, serviceops

Dec 14 2023

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Yeah, as foretold:

Dec 14 2023, 10:38 PM · MW-on-K8s, serviceops

Dec 11 2023

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

I reckon this technique will be useful for all charts that need a Job object.

Dec 11 2023, 5:54 PM · MW-on-K8s, serviceops
RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Super helpful explanation, thank you! https://gerrit.wikimedia.org/r/981703 should do the above, and https://gerrit.wikimedia.org/r/981704 adds the binding for the mw-script namespace. I'll deploy those like wikitech:Kubernetes/Add_a_new_service#Deploy_changes_to_helmfile.d/admin_ng and that will let me finish experimenting with the helm charts for both the sidecar controller and the actual jobs.

Dec 11 2023, 2:46 AM · MW-on-K8s, serviceops

Dec 7 2023

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

@JMeybohm Can you help with an RBAC issue?

Dec 7 2023, 9:51 PM · MW-on-K8s, serviceops

Oct 26 2023

RLazarus updated the task description for T348284: Handle sidecar containers in one-off Kubernetes jobs.
Oct 26 2023, 6:49 PM · MW-on-K8s, serviceops
RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Yep, sounds like a similar fit.

Oct 26 2023, 6:49 PM · MW-on-K8s, serviceops

Oct 16 2023

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Doh. Okay, that's a good reason not to go that route. :) I'll give the "primary container" label a try, thanks.

Oct 16 2023, 3:26 PM · MW-on-K8s, serviceops

Oct 14 2023

RLazarus added a comment to T348284: Handle sidecar containers in one-off Kubernetes jobs.

Both good points, thanks.

Oct 14 2023, 9:58 PM · MW-on-K8s, serviceops

Oct 6 2023

RLazarus updated the task description for T348284: Handle sidecar containers in one-off Kubernetes jobs.
Oct 6 2023, 3:50 PM · MW-on-K8s, serviceops

Oct 5 2023

RLazarus triaged T348284: Handle sidecar containers in one-off Kubernetes jobs as Medium priority.
Oct 5 2023, 10:37 PM · MW-on-K8s, serviceops
RLazarus created T348284: Handle sidecar containers in one-off Kubernetes jobs.
Oct 5 2023, 10:37 PM · MW-on-K8s, serviceops

Sep 21 2023

RLazarus closed T345959: Requesting access to analytics-privatedata-users for ahoelzl as Resolved.

Done:

  • Added you to the wmf LDAP group.
  • Added you to the WMF-NDA Phabricator project.
  • Created your shell user ahoelzl and added it to the analytics-privatedata-users POSIX group.
  • Created your Kerberos principal.
Sep 21 2023, 7:31 PM · SRE, SRE-Access-Requests
RLazarus added a member for WMF-NDA: Ahoelzl.
Sep 21 2023, 7:09 PM
RLazarus updated the task description for T345959: Requesting access to analytics-privatedata-users for ahoelzl.
Sep 21 2023, 7:02 PM · SRE, SRE-Access-Requests
RLazarus added a comment to T345959: Requesting access to analytics-privatedata-users for ahoelzl.

It seems like this fell through the cracks between last week's SRE clinic duty (mine) and this week's. Let me finish it up for you, thanks for your patience.

Sep 21 2023, 7:02 PM · SRE, SRE-Access-Requests

Sep 20 2023

RLazarus added a comment to T346144: Hardcode the SLO time windows in Grafana dashboards generated via Grizzly.

This sounds right to me -- thanks @elukey for getting it rolling. Early on, we had talked about autogenerating links for different calendar quarters and adding them to the text panel on top, but my recollection is we decided to spend that energy on Pyrra instead.

Sep 20 2023, 11:52 PM · SRE Observability (FY2023/2024-Q1), serviceops, observability

Sep 18 2023

RLazarus claimed T289022: Investigate and restore K.A.Z httpbb test.
Sep 18 2023, 6:45 PM · Wikimedia-Apache-configuration, serviceops, SRE
RLazarus placed T289022: Investigate and restore K.A.Z httpbb test up for grabs.
Sep 18 2023, 6:45 PM · Wikimedia-Apache-configuration, serviceops, SRE
RLazarus claimed T289202: Run httpbb periodically.
Sep 18 2023, 6:45 PM · serviceops, SRE
RLazarus placed T289202: Run httpbb periodically up for grabs.
Sep 18 2023, 6:45 PM · serviceops, SRE

Sep 16 2023

RLazarus added a project to T346371: Delete MediaWiki.*.growthexperiments.taskcount.link_recommendation.* from Graphite: Observability-Metrics.
Sep 16 2023, 7:22 PM · Observability-Metrics, SRE, Growth-Team, Grafana
RLazarus added a comment to T328746: Require 2FA for members of acl*sre-team.

I don't have edit access to acl*security.

Sep 16 2023, 7:16 PM · SecTeam-Processed, SRE, Vuln-MissingAuthz, Phabricator, Security, Security-Team
RLazarus changed the status of T342535: Requesting access to analytics_privatedata_users, deployment_members for Mabualruz from Stalled to Open.

@thcipriani Sorry for the back-and-forth, but just because it isn't 100% explicit from reading this task -- did you want @Mabualruz to get deployer training before being added to the group? Or do we have your approval to add him, so that he can do the training hands-on?

Sep 16 2023, 7:10 PM · SRE, SRE-Access-Requests
RLazarus changed the status of T342535: Requesting access to analytics_privatedata_users, deployment_members for Mabualruz, a subtask of T345186: Deployment training request for mabualruz, from Stalled to Open.
Sep 16 2023, 7:10 PM · Release-Engineering-Team (Deployment Training Requests)
RLazarus updated the task description for T345959: Requesting access to analytics-privatedata-users for ahoelzl.
Sep 16 2023, 6:48 PM · SRE, SRE-Access-Requests
RLazarus added a comment to T345959: Requesting access to analytics-privatedata-users for ahoelzl.

@RLazarus Here is the new production (non-shared) public key:
[...]

Sep 16 2023, 6:48 PM · SRE, SRE-Access-Requests

Sep 15 2023

RLazarus updated subscribers of T346151: Lift Wing alerting.

We have some plans for SLO-based alerting in the pipeline, but nothing implemented yet.

Sep 15 2023, 6:37 PM · Observability-Alerting, Machine-Learning-Team

Sep 14 2023

RLazarus claimed T341553: Allow running one-off scripts manually.
Sep 14 2023, 12:40 AM · MW-on-K8s, serviceops

Sep 12 2023

RLazarus added a comment to T305863: Denial of Service due to repeated hits from a particular IP.

No, I tagged it private when we asked for PII, so that it would already be private when that stuff was posted. Since it never appeared, I'm fine with opening it up.

Sep 12 2023, 8:45 PM · SecTeam-Processed, Security-Team, Security, Traffic, SRE

Sep 11 2023

RLazarus updated the task description for T345959: Requesting access to analytics-privatedata-users for ahoelzl.
Sep 11 2023, 10:03 PM · SRE, SRE-Access-Requests
RLazarus added a comment to T345959: Requesting access to analytics-privatedata-users for ahoelzl.

Hi @Ahoelzl, welcome to the Foundation! SRE here, I'll be able to set you up with production access.

Sep 11 2023, 10:03 PM · SRE, SRE-Access-Requests
RLazarus assigned T345726: Requesting Creation of a new POSIX group and system user for the Analytics WMDE team. to joanna_borun.

Hi @joanna_borun -- does this need Infrastructure Foundations approval?

Sep 11 2023, 9:37 PM · Data-Platform-SRE, SRE, SRE-Access-Requests
RLazarus updated subscribers of T345868: Rename the shellbox service to shellbox-score.

I propose using a _shellbox_common_ directory, like the _aqs2-common_ and _mediawiki-common_ directories we have in helmfile.d/services, and symlinking from there.

Sep 11 2023, 8:45 PM · Shellbox, serviceops

Sep 7 2023

RLazarus triaged T345868: Rename the shellbox service to shellbox-score as Low priority.
Sep 7 2023, 4:14 PM · Shellbox, serviceops

Aug 21 2023

RLazarus added a comment to T343377: Grant slightly broader access to Klaxon.

If it's just a matter of managing an LDAP group, then that's perfectly within scope of the IDM. It's one of the features that already have a first-attempt implementation.

Aug 21 2023, 6:21 PM · Sustainability (Incident Followup), Incident Tooling, SRE-OnFire, SRE

Aug 18 2023

RLazarus added a comment to T343377: Grant slightly broader access to Klaxon.

Email doesn't seem like a great way to communicate for page-worthy incidents. Would it be possible to insist on IRC and have them input their handle as part of the Klaxon flow, since we already recommend they chat with us in #wikimedia-sre?

Aug 18 2023, 9:11 PM · Sustainability (Incident Followup), Incident Tooling, SRE-OnFire, SRE
RLazarus updated subscribers of T343377: Grant slightly broader access to Klaxon.

One issue that I raised, but perhaps was not captured anywhere, is adding some guidance to the documentation on how the folks being paged can communicate with the person who used Klaxon. For instance, should I assume the person using Klaxon is on IRC and there is a channel we can chat in about the incident?

Aug 18 2023, 8:38 PM · Sustainability (Incident Followup), Incident Tooling, SRE-OnFire, SRE
RLazarus added a comment to T343377: Grant slightly broader access to Klaxon.

Only two blockers were raised at the August 7 SRE meeting:

Aug 18 2023, 7:04 PM · Sustainability (Incident Followup), Incident Tooling, SRE-OnFire, SRE

Aug 2 2023

RLazarus created T343377: Grant slightly broader access to Klaxon.
Aug 2 2023, 8:11 PM · Sustainability (Incident Followup), Incident Tooling, SRE-OnFire, SRE

Jul 25 2023

RLazarus updated subscribers of T341122: Implement daily data update routine.

Those numbers don't immediately raise alarm bells for me -- "storage" doesn't mean anything persistent, only ephemeral data that can disappear when the script exits, right? As long as that's the case (and assuming you're using ~1 CPU), you should be fine. I'm tagging in @akosiaris to confirm the resource request is sensible.

Jul 25 2023, 8:48 PM · Trust and Safety Product Sprint (Sprint Bodhrán), Patch-For-Review, Anti-Harassment (AHaT Sprint 32 - Baseball Cap), iPoid-Service