User Details
- User Since: Feb 7 2022, 1:09 PM (213 w, 1 d)
- Roles: Administrator
- Availability: Available
- LDAP User: Jaime Nuche
- MediaWiki User: JNuche (WMF)
Today
We know from T405224 that we can recover the entire cluster from the K3s data volumes. Additionally, the folks over at Cloud have told us in the past that it's not possible to automate/schedule the creation of OpenStack snapshots natively.
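Since native scheduling isn't available, one workaround is a cron job driving the openstack CLI from outside the cloud. A minimal sketch, assuming python-openstackclient is installed, credentials come from an RC file, and using a hypothetical volume name (k3s-data) and retention count:

# run daily from cron, e.g.: 0 3 * * * /usr/local/bin/snapshot-k3s.sh
source ~/openrc.sh
# create a point-in-time snapshot of the (hypothetical) K3s data volume;
# --force allows snapshotting a volume that is attached and in use
openstack volume snapshot create --volume k3s-data --force "k3s-data-$(date +%F)"
# keep only the 7 newest snapshots (names sort chronologically by date suffix)
openstack volume snapshot list --volume k3s-data -f value -c Name \
  | sort | head -n -7 \
  | xargs -r -n1 openstack volume snapshot delete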
Yesterday
As usual, we will announce a maintenance window for the migration.
Mon, Mar 9
Thu, Mar 5
Operation times in production have improved significantly and are back to the performance levels we had at the end of November 2025.
Wed, Mar 4
Tue, Feb 24
Mon, Feb 23
Similarly to what happened with secrets, recreated envs have been leaving behind a significant number of replica sets:
kubectl -n cat-env get rs | grep 750c4d946d
wiki-750c4d946d-3895-mediawiki-66bccd4d77   0   0   0   15d
wiki-750c4d946d-3895-mediawiki-8669b898df   0   0   0   15d
wiki-750c4d946d-3895-mediawiki-668466c6b8   0   0   0   14d
wiki-750c4d946d-3895-mediawiki-c556c45c5    0   0   0   14d
wiki-750c4d946d-3895-mediawiki-76c5fb6d87   0   0   0   14d
wiki-750c4d946d-3895-mediawiki-6968d99c4d   0   0   0   14d
wiki-750c4d946d-3895-mediawiki-6b5b86ff6f   0   0   0   14d
wiki-750c4d946d-3895-mediawiki-68cd949987   0   0   0   14d
wiki-750c4d946d-3895-mediawiki-75749d974b   0   0   0   14d
wiki-750c4d946d-3895-mediawiki-5d46bb4dc5   0   0   0   14d
wiki-750c4d946d-3895-mediawiki-68c5994db9   1   1   1   11d
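How many old ReplicaSets a Deployment keeps around is controlled by spec.revisionHistoryLimit, which Kubernetes defaults to 10. A sketch of a one-off cleanup plus the durable fix; it assumes jq is available, and treating desired=0 as safe to delete is our judgment call, not a Kubernetes guarantee:

# one-off: delete ReplicaSets that are fully scaled down (0 desired replicas)
kubectl -n cat-env get rs -o json \
  | jq -r '.items[] | select(.spec.replicas == 0) | .metadata.name' \
  | xargs -r kubectl -n cat-env delete rs
# durable fix: have the chart set a lower retention on its Deployments, e.g.
#   spec:
#     revisionHistoryLimit: 3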
Fri, Feb 20
Some of the helm commands had started slowing to a crawl; as it turns out, we had 912 secrets in the cat-env namespace. Most of those originated from helm release history revisions. Many of those revisions could safely be removed, and after doing that helm commands became noticeably more responsive again. Env creation times have also gone back down to the levels we had a few months ago:
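For context, with its default Secret storage backend Helm writes one Secret per release revision, labelled owner=helm. A sketch of how to count them and cap retention going forward; the release and chart names are placeholders:

# count the release-history secrets Helm keeps in the namespace
kubectl -n cat-env get secrets -l owner=helm --no-headers | wc -l
# cap how many revisions future upgrades retain (--history-max is a real
# helm flag; <release> and <chart> are placeholders)
helm upgrade -n cat-env <release> <chart> --history-max 5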
Thu, Feb 19
I found out that:
- I can build old branches such as REL1_39 using the most recent images, which use PHP 8.3 (I did see REL1_42 fail, but that seemed to be an unrelated issue with an extension)
- The three container images used by the main pod in the mediawiki chart all get a latest tag
- I could successfully recover env https://568a59f193.catalyst.wmcloud.org/wiki/Main_Page by creating a branched version of the chart the env was built with, modifying it to use the latest images, and updating the env to use that modified chart (see the sketch below)
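A sketch of that recovery flow; the chart location, git revision, release name, and namespace are all assumptions rather than the real Catalyst layout:

# branch the chart from the revision the env was originally built with
git checkout -b recover-568a59f193 <revision-the-env-was-built-with>
# edit the chart so its image references point at the latest-tagged images,
# then roll the env onto the modified chart (names are placeholders)
helm upgrade -n cat-env wiki-568a59f193 ./charts/mediawiki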
Wed, Feb 18
The original issue is fixed, but when trying to rebuild 568a59f193 I ran into this:
Loading composer repositories with package information
Updating dependencies
Your requirements could not be resolved to an installable set of packages.
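Composer normally prints the conflicting constraints right after that line; when it doesn't make the culprit obvious, composer prohibits (alias why-not) shows which requirements block a given package version. Package and version here are placeholders:

# show which installed requirements prevent installing the given version
composer prohibits <vendor>/<package> <version>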
Yet another interesting situation. Two patches were pushed in quick succession: https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikiLambda/+/1240009
Tue, Feb 17
Relevant data points:
Another data point. Today the Jenkins WikiLambda CI left a bunch of envs behind:
mw-ext-wl-ci-1238784-24724-4109-py-evaluator-547786dd5c-snpqv   1/1   Running   0   85m
mw-ext-wl-ci-1238784-24724-4109-js-evaluator-97fff64f4-trkbd    1/1   Running   0   85m
mw-ext-wl-ci-1238784-24724-4109-artifact-warehouse              1/1   Running   0   85m
mw-ext-wl-ci-1238784-24724-4109-mariadb-5b4685c7b9-t6qgz        1/1   Running   0   85m
mw-ext-wl-ci-1238784-24724-4109-mediawiki-88494f449-jszph       4/4   Running   0   85m
mw-ext-wl-ci-1239152-82482-4110-js-evaluator-7dd6465b6f-bshbn   1/1   Running   0   79m
mw-ext-wl-ci-1239152-82482-4110-py-evaluator-c756459cd-5njgb    1/1   Running   0   79m
mw-ext-wl-ci-1239344-35683-4111-js-evaluator-7cc5cc6dc4-qfsfd   1/1   Running   0   79m
mw-ext-wl-ci-1239344-35683-4111-py-evaluator-54797b96f7-q75zx   1/1   Running   0   79m
mw-ext-wl-ci-1239152-82482-4110-artifact-warehouse              1/1   Running   0   79m
mw-ext-wl-ci-1239344-35683-4111-artifact-warehouse              1/1   Running   0   79m
mw-ext-wl-ci-1239152-82482-4110-mariadb-689dbdc869-t6dm7        1/1   Running   0   79m
mw-ext-wl-ci-1239344-35683-4111-mariadb-f95cb96cc-tp55n         1/1   Running   0   79m
mw-ext-wl-ci-1239345-10059-4112-py-evaluator-65b68bb7c5-hw8xd   1/1   Running   0   78m
mw-ext-wl-ci-1239345-10059-4112-js-evaluator-69c4755b7b-lqnnw   1/1   Running   0   78m
mw-ext-wl-ci-1239345-10059-4112-artifact-warehouse              1/1   Running   0   78m
mw-ext-wl-ci-1239345-10059-4112-mariadb-68bd8bc87c-6bjzp        1/1   Running   0   78m
mw-ext-wl-ci-1239152-59560-4113-py-evaluator-587bc78bc8-gksvs   1/1   Running   0   78m
mw-ext-wl-ci-1239152-59560-4113-artifact-warehouse              1/1   Running   0   78m
mw-ext-wl-ci-1239152-59560-4113-mariadb-68f895bf85-469ls        1/1   Running   0   78m
mw-ext-wl-ci-1239152-59560-4113-js-evaluator-67689db7c7-wpxgn   1/1   Running   0   78m
mw-ext-wl-ci-1239344-62262-4114-js-evaluator-55cc55dff8-kmd6x   1/1   Running   0   77m
mw-ext-wl-ci-1239344-62262-4114-artifact-warehouse              1/1   Running   0   77m
mw-ext-wl-ci-1239344-62262-4114-mariadb-65c79bbb57-k24gs        1/1   Running   0   77m
mw-ext-wl-ci-1239344-62262-4114-py-evaluator-75c4f5cff7-b8778   1/1   Running   0   77m
mw-ext-wl-ci-1239344-35683-4111-mediawiki-6cf876bb58-4vmtz      4/4   Running   0   79m
mw-ext-wl-ci-1239344-62262-4114-mediawiki-5d54bcdcc9-gqmpt      4/4   Running   0   77m
mw-ext-wl-ci-1239152-82482-4110-mediawiki-5fb7ddb8cb-b9jhs      4/4   Running   0   79m
mw-ext-wl-ci-1239345-10059-4112-mediawiki-5c64d87fb7-9x9j6      4/4   Running   0   78m
mw-ext-wl-ci-1239152-59560-4113-mediawiki-668b75b6c5-444qf      4/4   Running   0   78m
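A sketch of how one might spot and clear such leftovers by their name prefix; that these envs are Helm releases, and that they live in this namespace, are both assumptions:

# list leftover WikiLambda CI releases by name prefix
helm list -n cat-env --filter '^mw-ext-wl-ci-' -q
# uninstall them once confirmed stale
helm list -n cat-env --filter '^mw-ext-wl-ci-' -q \
  | xargs -r -n1 helm uninstall -n cat-env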
Mon, Feb 16
thank you @Raymond_Ndibe
@Raymond_Ndibe We use catalyst-dev to test changes to our infrastructure before rolling those changes out to the production project catalyst, so we generally need the same resources there.
Thu, Feb 12
Seems like this is still happening as of 1.46.0-wmf.15.
I caught one in the wild, this time from gitlab: https://gitlab.wikimedia.org/repos/test-platform/catalyst/catalyst-ci-client/-/pipelines/164402
@Urbanecm_WMF thank you for taking care of this
Wed, Feb 11
Noise from Nostalgia is gone. Thank you to @Tgr for his patch.
Unsurprisingly, Nostalgia is very much unsupported. I'm pinging the most "recent" (we're talking months here) committers to the repo in case they think they can help: @Reedy @matmarex @Umherirrender
Errors shot up after deploying the train to group1. 3K+ in 15m:
Tue, Feb 10
@Jdforrester-WMF Much obliged again good sir
@Jdforrester-WMF Thank you so much!
Two more separate warnings are also generated per request. For the request ID mentioned above, we can see the following in the logs:
Mon, Feb 9
Unfortunately, it seems the bot-wrangler-traefik-plugin is not compatible with our current K3s production version, v1.28.7+k3s1.
Feb 6 2026
Feb 5 2026
Jenkins was finally upgraded to a new version as part of this security advisory ticket: T412694. No issues were reported.