MLechvien-WMF (Matthieu Lec'hvien)
User

User Details

User Since
Nov 10 2025, 2:20 PM (10 w, 4 d)
Availability
Available
LDAP User
Unknown
MediaWiki User
MLechvien-WMF [ Global Accounts ]

Recent Activity

Today

MLechvien-WMF closed T415352: Add serviceops new to the monitored channels on IRC bot/wikibugs as Resolved.
Fri, Jan 23, 6:46 PM · Wikibugs, ServiceOps new
MLechvien-WMF added a comment to T415352: Add serviceops new to the monitored channels on IRC bot/wikibugs.

I modified the alternate tag of serviceops new and I believe this should now be reflected by IRC bot/wikibugs to the -operations channel

Fri, Jan 23, 6:45 PM · Wikibugs, ServiceOps new
MLechvien-WMF added a hashtag to ServiceOps new: #sre-serviceops-new.
Fri, Jan 23, 6:30 PM
MLechvien-WMF added a project to T385404: Deploy LilyPond 2.24 with Cairo support to shellbox containers: ServiceOps-Upgrades-Hardware.

We don't have capacity this quarter, we will reassess as part of the general Debian upgrades of next quarter

Fri, Jan 23, 6:16 PM · ServiceOps-Upgrades-Hardware, ServiceOps-Services-Oids, Shellbox, ServiceOps new, Upstream, Wikimedia-SVG-rendering, MediaWiki-extensions-Score
MLechvien-WMF edited projects for T303744: Keep track of teams responsible for namespaces inside kubernetes, added: Serviceops-easywins; removed ServiceOps-good-first-task.
Fri, Jan 23, 6:06 PM · Serviceops-easywins, ServiceOps new, Prod-Kubernetes
MLechvien-WMF created Serviceops-easywins.
Fri, Jan 23, 6:03 PM
MLechvien-WMF moved T414112: Deploy instance of hoarde as linked-artifacts(?) in k8s from Inbox to Radar on the ServiceOps new board.
Fri, Jan 23, 5:55 PM · ServiceOps-Services-Oids, ServiceOps new, User-Eevans, Patch-For-Review, Data-Persistence
MLechvien-WMF added a comment to T414112: Deploy instance of hoarde as linked-artifacts(?) in k8s.

We're at capacity this quarter, but we'll keep an eye on this, please tag us if we can help with design questions or issues during deployment.

Fri, Jan 23, 5:54 PM · ServiceOps-Services-Oids, ServiceOps new, User-Eevans, Patch-For-Review, Data-Persistence
MLechvien-WMF moved T303744: Keep track of teams responsible for namespaces inside kubernetes from Inbox to Backlog on the ServiceOps new board.
Fri, Jan 23, 1:51 PM · Serviceops-easywins, ServiceOps new, Prod-Kubernetes
MLechvien-WMF raised the priority of T303744: Keep track of teams responsible for namespaces inside kubernetes from Low to Medium.

Raising the priority as this got mentioned in various places. We don't have the capacity to take that on this quarter so moving it to backlog.

Fri, Jan 23, 1:50 PM · Serviceops-easywins, ServiceOps new, Prod-Kubernetes
MLechvien-WMF removed projects from T253058: DRY kafka broker declaration in helmfiles: ServiceOps-good-first-task, ServiceOps new.

Removing our tag, please add it back if anything is needed from our end

Fri, Jan 23, 1:42 PM · ServiceOps-Datastores, Data-Engineering, Data-Platform-SRE, SRE, Event-Platform
MLechvien-WMF added a comment to T412941: Proposal: scap deploy-service.

Release-Engineering-Team could you please provide inputs on the Scap specific works in the description?

Fri, Jan 23, 1:37 PM · Epic, ServiceOps new, Scap, Release-Engineering-Team
MLechvien-WMF removed a project from T414665: Proof of Concept: SquareOne CDN Dashboards: ServiceOps new.

Removing serviceops tag until the parent story gets scoped and we decide who is collaborating on it (to be discussed over coming weeks)

Fri, Jan 23, 1:08 PM · Incident Tooling, Traffic
MLechvien-WMF edited projects for T414663: SquareOne Dashboards: Guided Incident Response, added: SRE Observability; removed observability.
Fri, Jan 23, 12:55 PM · SRE Observability, Epic, ServiceOps new
MLechvien-WMF moved T414663: SquareOne Dashboards: Guided Incident Response from Inbox to Needs Info / Blocked on the ServiceOps new board.
Fri, Jan 23, 12:53 PM · SRE Observability, Epic, ServiceOps new
MLechvien-WMF closed T414967: mw-jobrunner curl errors when talking to other services as Declined.

Declining this task, given our limited capacity it's better to wait until the ongoing work retires the culprit job

Fri, Jan 23, 12:52 PM · Wikimedia-production-error, ServiceOps new
MLechvien-WMF triaged T415352: Add serviceops new to the monitored channels on IRC bot/wikibugs as Medium priority.
Fri, Jan 23, 11:42 AM · Wikibugs, ServiceOps new
MLechvien-WMF created T415352: Add serviceops new to the monitored channels on IRC bot/wikibugs.
Fri, Jan 23, 11:41 AM · Wikibugs, ServiceOps new
MLechvien-WMF changed the subtype of T414967: mw-jobrunner curl errors when talking to other services from "Task" to "Production Error".

Volume seems very high, I'm surprised this does not fire a more visible alert.

Fri, Jan 23, 10:20 AM · Wikimedia-production-error, ServiceOps new
MLechvien-WMF moved T410296: Significant increase in wikifeeds latency and mobileapps error rate since 2025/11/13 from Inbox to In Progress on the ServiceOps new board.
Fri, Jan 23, 10:02 AM · Wikimedia-production-error, ServiceOps new, Wikipedia-Android-App-Backlog, Content-Transform-Team, Wikifeeds
MLechvien-WMF triaged T410296: Significant increase in wikifeeds latency and mobileapps error rate since 2025/11/13 as Medium priority.
Fri, Jan 23, 10:01 AM · Wikimedia-production-error, ServiceOps new, Wikipedia-Android-App-Backlog, Content-Transform-Team, Wikifeeds

Yesterday

MLechvien-WMF closed T406212: charlie wiped cluster redeployment use-case, a subtask of T405703: Update wikikube eqiad to kubernetes 1.31, as Resolved.
Thu, Jan 22, 5:50 PM · Discovery-Search (2025.09.26 - 2025.10.17), Data-Platform-SRE (2025.09.26 - 2025.10.17), Patch-For-Review, collaboration-services, Kubernetes, Prod-Kubernetes, serviceops
MLechvien-WMF closed T406212: charlie wiped cluster redeployment use-case as Resolved.
Thu, Jan 22, 5:50 PM · ServiceOps new, Kubernetes, Prod-Kubernetes
MLechvien-WMF added a comment to T402435: Jobs sometimes disappear without a trace (except "Exec error in cpjobqueue" / "Error: socket hang up" from change-propagation service).

Serviceops backlog triaging here, @Tgr could you confirm if we can close this task?

Thu, Jan 22, 5:48 PM · ServiceOps new, ServiceOps-Mediawiki, MW-Interfaces-Team, MediaWiki-Platform-Team (Radar), WMF-JobQueue
MLechvien-WMF moved T415029: Library restart detection is very slow in Kubernetes workers from Inbox to Backlog on the ServiceOps new board.
Thu, Jan 22, 5:41 PM · Infrastructure-Foundations, ServiceOps new
MLechvien-WMF moved T388799: php-wmerrors rsyslog rule selects on php7 only from Inbox to Backlog on the ServiceOps new board.
Thu, Jan 22, 5:40 PM · ServiceOps-Mediawiki, ServiceOps new
MLechvien-WMF moved T371988: mc-misc100[12] implementation tracking from Inbox to Needs Info / Blocked on the ServiceOps new board.
Thu, Jan 22, 5:40 PM · ServiceOps-Upgrades-Hardware, ServiceOps new
MLechvien-WMF updated subscribers of T371988: mc-misc100[12] implementation tracking.

@jijiki same for this one can you clarify the rationale and urgency?

Thu, Jan 22, 5:40 PM · ServiceOps-Upgrades-Hardware, ServiceOps new
MLechvien-WMF moved T372802: mc-misc200[12] implementation tracking from Inbox to Needs Info / Blocked on the ServiceOps new board.
Thu, Jan 22, 5:39 PM · ServiceOps-Upgrades-Hardware, ServiceOps new
MLechvien-WMF added a comment to T372802: mc-misc200[12] implementation tracking.

@jijiki can you clarify the rationale for this? We should assess if it's needed for this quarter or not

Thu, Jan 22, 5:39 PM · ServiceOps-Upgrades-Hardware, ServiceOps new
MLechvien-WMF assigned T414427: Increase capacity for Mercurius webvideoTranscode job (1080p) processing to Raine.

Assigning to Raine to take a look

Thu, Jan 22, 5:37 PM · ServiceOps new, SRE, TimedMediaHandler-Transcode
MLechvien-WMF moved T414427: Increase capacity for Mercurius webvideoTranscode job (1080p) processing from Inbox to Scheduled (this Q) on the ServiceOps new board.
Thu, Jan 22, 5:36 PM · ServiceOps new, SRE, TimedMediaHandler-Transcode
MLechvien-WMF raised the priority of T414427: Increase capacity for Mercurius webvideoTranscode job (1080p) processing from Medium to High.
Thu, Jan 22, 5:35 PM · ServiceOps new, SRE, TimedMediaHandler-Transcode
MLechvien-WMF moved T353511: Migrate memcached servers to PKI from Inbox to Backlog on the ServiceOps new board.
Thu, Jan 22, 5:31 PM · ServiceOps-Datastores, ServiceOps new
MLechvien-WMF moved T390517: Remove recommendation-api from the REST API offerings from Inbox to Backlog on the ServiceOps new board.
Thu, Jan 22, 5:26 PM · ServiceOps-SharedInfra, ServiceOps new, API Platform (RESTBase Deprecation Roadmap)
MLechvien-WMF reassigned T390517: Remove recommendation-api from the REST API offerings from akosiaris to Clement_Goubert.
Thu, Jan 22, 5:26 PM · ServiceOps-SharedInfra, ServiceOps new, API Platform (RESTBase Deprecation Roadmap)
MLechvien-WMF moved T390861: wikikube-ctrl200[4-5] implementation tracking from Inbox to Scheduled (this Q) on the ServiceOps new board.
Thu, Jan 22, 5:23 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review
MLechvien-WMF raised the priority of T390861: wikikube-ctrl200[4-5] implementation tracking from Medium to High.
Thu, Jan 22, 5:21 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review
MLechvien-WMF moved T411780: hcaptcha-proxy: create logstash dashboard from Inbox to Scheduled (this Q) on the ServiceOps new board.
Thu, Jan 22, 5:20 PM · ServiceOps-Services-Oids, ServiceOps new
MLechvien-WMF raised the priority of T353511: Migrate memcached servers to PKI from Medium to High.
Thu, Jan 22, 5:19 PM · ServiceOps-Datastores, ServiceOps new
MLechvien-WMF added a comment to T353511: Migrate memcached servers to PKI.

@jijiki are you able to take that in your scheduled work for the quarter? If not let's move it to backlog for this quarter

Thu, Jan 22, 5:19 PM · ServiceOps-Datastores, ServiceOps new
MLechvien-WMF moved T356885: Update app.job module in deployment-charts from Inbox to Backlog on the ServiceOps new board.
Thu, Jan 22, 5:16 PM · Prod-Kubernetes, ServiceOps new
MLechvien-WMF moved T397618: Fix thumbor discovery records and make swift use them from Inbox to Needs Info / Blocked on the ServiceOps new board.

Actually @hnowlan I see CR submitted, did you complete that?

Thu, Jan 22, 3:43 PM · Thumbor, SRE-swift-storage, Data-Persistence, ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review, Kubernetes, Prod-Kubernetes
MLechvien-WMF edited projects for T397618: Fix thumbor discovery records and make swift use them, added: ServiceOps new, ServiceOps-Upgrades-Hardware; removed serviceops.
Thu, Jan 22, 3:41 PM · Thumbor, SRE-swift-storage, Data-Persistence, ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review, Kubernetes, Prod-Kubernetes
MLechvien-WMF added a comment to T397618: Fix thumbor discovery records and make swift use them.

@JMeybohm @Clement_Goubert this sounds like something we may need to do before next Kubernetes upgrade (or at least surfacing it so you know about that special case).

Thu, Jan 22, 3:41 PM · Thumbor, SRE-swift-storage, Data-Persistence, ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review, Kubernetes, Prod-Kubernetes
MLechvien-WMF removed a project from T356885: Update app.job module in deployment-charts: serviceops.
Thu, Jan 22, 3:37 PM · Prod-Kubernetes, ServiceOps new
MLechvien-WMF added a comment to T356885: Update app.job module in deployment-charts.

@jijiki does this need to be scheduled this quarter and why? I'm inclined to move it to Backlog until next quarter

Thu, Jan 22, 3:37 PM · Prod-Kubernetes, ServiceOps new
MLechvien-WMF moved T388969: MW deployments shouldn't need a hard-coded kubernetesVersion from Inbox to Scheduled (this Q) on the ServiceOps new board.
Thu, Jan 22, 3:30 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review, Kubernetes, Prod-Kubernetes
MLechvien-WMF triaged T385798: Reconsider Thumbor SSIM tests as High priority.
Thu, Jan 22, 3:24 PM · ServiceOps-Services-Oids, ServiceOps-good-first-task, ServiceOps new, Thumbor
MLechvien-WMF edited projects for T388969: MW deployments shouldn't need a hard-coded kubernetesVersion, added: ServiceOps new, ServiceOps-Upgrades-Hardware; removed serviceops.
Thu, Jan 22, 2:17 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review, Kubernetes, Prod-Kubernetes
MLechvien-WMF raised the priority of T388969: MW deployments shouldn't need a hard-coded kubernetesVersion from Medium to High.
Thu, Jan 22, 2:13 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, Patch-For-Review, Kubernetes, Prod-Kubernetes
MLechvien-WMF added a project to T400130: Central REST gateway for APIs: ServiceOps-SharedInfra.
Thu, Jan 22, 1:59 PM · ServiceOps-SharedInfra, ServiceOps new, MW-Interfaces-Team (MWI-Roadmap), Epic, OKR-Work
MLechvien-WMF added a project to T399291: Epic: API Rate Limiting Architecture: ServiceOps-SharedInfra.
Thu, Jan 22, 1:58 PM · ServiceOps-SharedInfra, ServiceOps new, MediaWiki-Platform-Team (Radar), Traffic, Epic, OKR-Work, MW-Interfaces-Team, FY2025-26 KR 5.1
MLechvien-WMF moved T400871: Reimage sretest2009 as a wikikube worker and assess performance from Inbox to Scheduled (this Q) on the ServiceOps new board.
Thu, Jan 22, 1:57 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, SRE, DC-Ops
MLechvien-WMF edited projects for T400871: Reimage sretest2009 as a wikikube worker and assess performance, added: ServiceOps new, ServiceOps-Upgrades-Hardware; removed serviceops.
Thu, Jan 22, 1:57 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, SRE, DC-Ops
MLechvien-WMF added a comment to T400871: Reimage sretest2009 as a wikikube worker and assess performance.

@jasmine_ are you doing this task? Please ask others if you don't find the capacity

Thu, Jan 22, 1:56 PM · ServiceOps-Upgrades-Hardware, ServiceOps new, SRE, DC-Ops
MLechvien-WMF moved T400130: Central REST gateway for APIs from Inbox to Radar on the ServiceOps new board.
Thu, Jan 22, 1:54 PM · ServiceOps-SharedInfra, ServiceOps new, MW-Interfaces-Team (MWI-Roadmap), Epic, OKR-Work
MLechvien-WMF edited projects for T400130: Central REST gateway for APIs, added: ServiceOps new; removed serviceops.
Thu, Jan 22, 1:54 PM · ServiceOps-SharedInfra, ServiceOps new, MW-Interfaces-Team (MWI-Roadmap), Epic, OKR-Work
MLechvien-WMF moved T411256: Draft hCaptcha SLOs, document SLIs from Inbox to Scheduled (this Q) on the ServiceOps new board.
Thu, Jan 22, 1:00 PM · ServiceOps-Services-Oids, ServiceOps new
MLechvien-WMF added a comment to T411780: hcaptcha-proxy: create logstash dashboard.

@Raine are you actively working on that, should this be something we do now or is it future works?

Thu, Jan 22, 12:59 PM · ServiceOps-Services-Oids, ServiceOps new
MLechvien-WMF moved T387007: Reproducible blocking error using the basic upload form, no upload possible from Inbox to Radar on the ServiceOps new board.
Thu, Jan 22, 12:01 PM · ServiceOps-Mediawiki, ServiceOps new, MediaWiki-Uploading, SRE
MLechvien-WMF removed a project from T400263: ☂️ [FY2025-26][Hypothesis] WE6.2.1 Production Readiness Checklist: serviceops.
Thu, Jan 22, 11:27 AM · Epic, ServiceOps new
MLechvien-WMF moved T349376: EtcdConfig using stale data: lost lock in /srv/mediawiki/php-1.42.0-wmf.1/includes/config/EtcdConfig.php on line 218 from Inbox to Needs Info / Blocked on the ServiceOps new board.
Thu, Jan 22, 11:27 AM · ServiceOps-Mediawiki, ServiceOps new, MediaWiki-Engineering
MLechvien-WMF edited projects for T349376: EtcdConfig using stale data: lost lock in /srv/mediawiki/php-1.42.0-wmf.1/includes/config/EtcdConfig.php on line 218, added: ServiceOps new, ServiceOps-Mediawiki; removed serviceops.

@Clement_Goubert could you please help to triage this task? If it's still a concern and we're not realistically putting capacity on that now I'd put it in backlog for this quarter.

Thu, Jan 22, 11:26 AM · ServiceOps-Mediawiki, ServiceOps new, MediaWiki-Engineering
MLechvien-WMF added a project to T415169: Transcode jobs failing with Wikimedia\Rdbms\DBTransactionError: Transaction round stage must be 'cursory' (not 'within-commit'): ServiceOps new.
Thu, Jan 22, 9:13 AM · Patch-For-Review, Reader Growth Team, TimedMediaHandler, MW-Interfaces-Team, Wikimedia-production-error
MLechvien-WMF added a comment to T414486: Upgrade AUX clusters to kubernetes 1.31.

Great, thanks Luca

Thu, Jan 22, 8:58 AM · Infrastructure-Foundations, Kubernetes, Prod-Kubernetes

Wed, Jan 21

MLechvien-WMF moved T388390: Ensure the correct helm version is used for each cluster from Needs Info / Blocked to Scheduled (this Q) on the ServiceOps new board.

Ok good, moving it to Scheduled then

Wed, Jan 21, 7:00 PM · ServiceOps-SharedInfra, ServiceOps new, Patch-For-Review, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF edited projects for T387007: Reproducible blocking error using the basic upload form, no upload possible, added: ServiceOps new, ServiceOps-Mediawiki; removed serviceops.

Apologies for the late follow up. @Grand-Duc Do you still experience the issue here?

Wed, Jan 21, 8:55 AM · ServiceOps-Mediawiki, ServiceOps new, MediaWiki-Uploading, SRE
MLechvien-WMF added a comment to T398611: Migrate all memcached* clusters to nftables.

@Scott_French @jijiki could you update the description with more details about the rationale (why now)? We may consider raising the priority of this.

Wed, Jan 21, 8:36 AM · ServiceOps-Datastores, ServiceOps new

Tue, Jan 20

MLechvien-WMF moved T249663: write some recording rules for queries used in the appserver RED k8s dashboard from Inbox to Radar on the ServiceOps new board.
Tue, Jan 20, 1:47 PM · SRE Observability (FY2025/2026-Q3), Prod-Kubernetes, ServiceOps new, SRE
MLechvien-WMF moved T364400: map the /api/ prefix to /w/rest.php from Inbox to Needs Info / Blocked on the ServiceOps new board.
Tue, Jan 20, 1:04 PM · ServiceOps-SharedInfra, ServiceOps new, Traffic, MW-Interfaces-Team
MLechvien-WMF edited projects for T364400: map the /api/ prefix to /w/rest.php, added: ServiceOps new, ServiceOps-SharedInfra; removed serviceops.

@BPirkle do you have an update on the plans for this?

Tue, Jan 20, 1:04 PM · ServiceOps-SharedInfra, ServiceOps new, Traffic, MW-Interfaces-Team
MLechvien-WMF moved T362954: Fix rendering issue in modules.app.job when cronjobs are enabled and private values are defined from Inbox to Needs Info / Blocked on the ServiceOps new board.
Tue, Jan 20, 12:53 PM · ServiceOps new, Kubernetes
MLechvien-WMF edited projects for T362954: Fix rendering issue in modules.app.job when cronjobs are enabled and private values are defined, added: ServiceOps new; removed serviceops.
Tue, Jan 20, 12:53 PM · ServiceOps new, Kubernetes
MLechvien-WMF assigned T362954: Fix rendering issue in modules.app.job when cronjobs are enabled and private values are defined to jijiki.

@jijiki is this issue still present?

Tue, Jan 20, 12:52 PM · ServiceOps new, Kubernetes
MLechvien-WMF moved T273507: PodSecurityPolicies will be deprecated with Kubernetes 1.21 from Inbox to In Progress on the ServiceOps new board.
Tue, Jan 20, 12:42 PM · ServiceOps new, Patch-For-Review, Prod-Kubernetes
MLechvien-WMF edited projects for T273507: PodSecurityPolicies will be deprecated with Kubernetes 1.21, added: ServiceOps new; removed serviceops.
Tue, Jan 20, 12:42 PM · ServiceOps new, Patch-For-Review, Prod-Kubernetes
MLechvien-WMF moved T396807: Reroute /api/rest_v1 documentation to REST Sandbox from Inbox to In Progress on the ServiceOps new board.
Tue, Jan 20, 12:34 PM · ServiceOps-SharedInfra, ServiceOps new, MW-Interfaces-Team (MWI-Sprint-25 (2026-01-13 to 2026-01-27)), Patch-For-Review, RESTBase Sunsetting, Essential-Work
MLechvien-WMF edited projects for T396807: Reroute /api/rest_v1 documentation to REST Sandbox, added: ServiceOps new, ServiceOps-SharedInfra; removed serviceops.
Tue, Jan 20, 12:33 PM · ServiceOps-SharedInfra, ServiceOps new, MW-Interfaces-Team (MWI-Sprint-25 (2026-01-13 to 2026-01-27)), Patch-For-Review, RESTBase Sunsetting, Essential-Work
MLechvien-WMF moved T388390: Ensure the correct helm version is used for each cluster from Inbox to Needs Info / Blocked on the ServiceOps new board.
Tue, Jan 20, 12:27 PM · ServiceOps-SharedInfra, ServiceOps new, Patch-For-Review, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF edited projects for T388390: Ensure the correct helm version is used for each cluster, added: ServiceOps new, ServiceOps-SharedInfra; removed serviceops.
Tue, Jan 20, 12:26 PM · ServiceOps-SharedInfra, ServiceOps new, Patch-For-Review, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF added a comment to T388390: Ensure the correct helm version is used for each cluster.

@Raine is this still stuck? Please add it to ServiceOps team meeting discussions if inputs are needed.

Tue, Jan 20, 12:26 PM · ServiceOps-SharedInfra, ServiceOps new, Patch-For-Review, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF moved T372943: In the aftermath of T370304: Brainstorming of short- and medium-term observability / quality-of-life production changes from Inbox to Radar on the ServiceOps new board.
Tue, Jan 20, 12:09 PM · ServiceOps new, SRE Observability, Sustainability (Incident Followup), MediaWiki-Platform-Team (Radar), DBA
MLechvien-WMF edited projects for T372943: In the aftermath of T370304: Brainstorming of short- and medium-term observability / quality-of-life production changes, added: ServiceOps new; removed serviceops.
Tue, Jan 20, 12:09 PM · ServiceOps new, SRE Observability, Sustainability (Incident Followup), MediaWiki-Platform-Team (Radar), DBA
MLechvien-WMF moved T380723: Update knative-serving+net-istio to v1.12.x on ML clusters from Inbox to Radar on the ServiceOps new board.
Tue, Jan 20, 11:29 AM · ServiceOps new, Essential-Work, Patch-For-Review, Machine-Learning-Team, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF edited projects for T380723: Update knative-serving+net-istio to v1.12.x on ML clusters, added: ServiceOps new; removed serviceops.
Tue, Jan 20, 11:29 AM · ServiceOps new, Essential-Work, Patch-For-Review, Machine-Learning-Team, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF moved T380722: Update kserve to v0.15.2* on ML clusters from Inbox to Radar on the ServiceOps new board.
Tue, Jan 20, 11:26 AM · ServiceOps new, Essential-Work, Machine-Learning-Team, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF edited projects for T380722: Update kserve to v0.15.2* on ML clusters, added: ServiceOps new; removed serviceops.
Tue, Jan 20, 11:26 AM · ServiceOps new, Essential-Work, Machine-Learning-Team, Data-Platform-SRE, Kubernetes, Prod-Kubernetes
MLechvien-WMF moved T408925: Create a cookbook for memcached management from Inbox to Backlog on the ServiceOps new board.

@jijiki I'm assuming this is not critical for recurring operations or switchover, but please change priority if you disagree.

Tue, Jan 20, 11:23 AM · ServiceOps-Datastores, ServiceOps new, Patch-For-Review
MLechvien-WMF triaged T408925: Create a cookbook for memcached management as Low priority.

@jijiki in which situation would that cookbook be useful?

Tue, Jan 20, 11:04 AM · ServiceOps-Datastores, ServiceOps new, Patch-For-Review
MLechvien-WMF edited projects for T330997: Support locking cookbooks run except for switchover related cookbooks, added: ServiceOps new; removed serviceops.

@Blake could you move this on the board if you plan to do it this quarter?

Tue, Jan 20, 10:56 AM · ServiceOps new, SRE-tools, Infrastructure-Foundations, Datacenter-Switchover, SRE
MLechvien-WMF closed T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s as Resolved.

@jasmine_ please reopen if needed

Tue, Jan 20, 10:43 AM · Patch-For-Review, Datacenter-Switchover, serviceops, MW-on-K8s
MLechvien-WMF closed T359130: Update DC switchover cookbooks to handle maintenance scripts on k8s, a subtask of T341560: Migrate mwmaint server functionality to mw-on-k8s, as Resolved.
Tue, Jan 20, 10:43 AM · serviceops, MW-on-K8s
MLechvien-WMF added a comment to T375014: Support listing pooled / active authdns hosts (rather than all).

Hi @Volans can I confirm the status of this task?

Tue, Jan 20, 10:24 AM · Patch-For-Review, Infrastructure-Foundations, SRE-tools, Spicerack
MLechvien-WMF merged T375285: sre.discovery.datacenter should handle depooled dnsbox hosts into T375014: Support listing pooled / active authdns hosts (rather than all).
Tue, Jan 20, 10:22 AM · Patch-For-Review, Infrastructure-Foundations, SRE-tools, Spicerack
MLechvien-WMF merged task T375285: sre.discovery.datacenter should handle depooled dnsbox hosts into T375014: Support listing pooled / active authdns hosts (rather than all).
Tue, Jan 20, 10:21 AM · ServiceOps new, Patch-For-Review, Datacenter-Switchover
MLechvien-WMF added a comment to T375285: sre.discovery.datacenter should handle depooled dnsbox hosts.

Makes sense, then I'll close this one as duplicate of https://phabricator.wikimedia.org/T375014 and follow up on the other ticket. @Scott_French or @Blake feel free to reopen if you disagree

Tue, Jan 20, 9:57 AM · ServiceOps new, Patch-For-Review, Datacenter-Switchover
MLechvien-WMF moved T400100: FY 25/26 WE 5.4.2: Known bots / clients from Inbox to Radar on the ServiceOps new board.
Tue, Jan 20, 9:06 AM · Epic, ServiceOps new, SRE

Mon, Jan 19

MLechvien-WMF moved T399291: Epic: API Rate Limiting Architecture from Inbox to Radar on the ServiceOps new board.
Mon, Jan 19, 5:03 PM · ServiceOps-SharedInfra, ServiceOps new, MediaWiki-Platform-Team (Radar), Traffic, Epic, OKR-Work, MW-Interfaces-Team, FY2025-26 KR 5.1
MLechvien-WMF edited projects for T399291: Epic: API Rate Limiting Architecture, added: ServiceOps new; removed serviceops.
Mon, Jan 19, 5:03 PM · ServiceOps-SharedInfra, ServiceOps new, MediaWiki-Platform-Team (Radar), Traffic, Epic, OKR-Work, MW-Interfaces-Team, FY2025-26 KR 5.1