Yesterday

andrea.denisse awarded T243288: Retire the Tor relay a Heartbreak token.
Tue, May 14, 2:45 PM · Tor, SRE
MoritzMuehlenhoff added a comment to T360356: Request access to servers Dcops group.

But isn't it simpler to just grep the output of a single cookbook, as opposed to grepping the output of multiple tools?

Tue, May 14, 10:32 AM · SRE, Infrastructure-Foundations
MoritzMuehlenhoff added a comment to T364823: Upgrade r/w LDAP servers to Bullseye.

serpens has been migrated to Bullseye, seaborgium to follow in a few days.

Tue, May 14, 9:26 AM · LDAP, SRE, Infrastructure-Foundations
MoritzMuehlenhoff created T364824: Check home/HDFS leftovers of bdgreenlee.
Tue, May 14, 8:24 AM · Data-Platform-SRE
MoritzMuehlenhoff renamed T364823: Upgrade r/w LDAP servers to Bullseye from Upgrade r/w LDAp servers to Bullseye to Upgrade r/w LDAP servers to Bullseye.
Tue, May 14, 8:13 AM · LDAP, SRE, Infrastructure-Foundations
MoritzMuehlenhoff created T364823: Upgrade r/w LDAP servers to Bullseye.
Tue, May 14, 8:12 AM · LDAP, SRE, Infrastructure-Foundations

Mon, May 13

MoritzMuehlenhoff added a comment to T363209: Q4:rack/setup/install kafka-main200[6789] & kafka-main2010.

I think the installation failed because there is no entry in site.pp yet.

Mon, May 13, 3:17 PM · SRE, ops-codfw, serviceops, DC-Ops
MoritzMuehlenhoff added a comment to T363209: Q4:rack/setup/install kafka-main200[6789] & kafka-main2010.

All insetup roles default to Puppet 7 these days (as does the kafka-main role itself), so these should be installed with Puppet 7.

Mon, May 13, 3:16 PM · SRE, ops-codfw, serviceops, DC-Ops
MoritzMuehlenhoff triaged T364746: Site: eqiad 3 VM request for staging-eqiad kube-apiserver as Medium priority.

LGTM

Mon, May 13, 2:28 PM · SRE, Infrastructure-Foundations, vm-requests, serviceops, Prod-Kubernetes, Kubernetes
MoritzMuehlenhoff triaged T364740: Site: codfw 2 VM request for staging-codfw kube-apiserver as Medium priority.

LGTM

Mon, May 13, 2:05 PM · SRE, Infrastructure-Foundations, vm-requests, Prod-Kubernetes, Kubernetes
MoritzMuehlenhoff triaged T364622: Review/cleanup content of /srv/private/modules/secret/secrets/ssl in the private repo as High priority.
Mon, May 13, 2:04 PM · Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T362681: Provide nodejs20 base images for production.

I kicked off a build of the node20 image, it should hopefully appear in the registry soon.

Not there yet; did the build get stuck/break?

OK, after the weekend it's now shown up, and works great. Thank you!

Mon, May 13, 1:53 PM · serviceops
MoritzMuehlenhoff added a comment to T360779: Phase out cergen for Fundraising services.

@Dwisehaupt @Jgreen The kafka cert issued by the PKI is now getting deployed to /etc/fr-tech-kafka-client on cumin1002/cumin2002. Could you please sync it to fr-tech and test/deploy instead of the old cergen-issued cert? When this has been confirmed to work fine, we can add a systemd timer which sends a notification if the key renewal is forthcoming.
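
The expiry check behind such a notification could look roughly like the sketch below. It is only an illustration: the function names, the 14-day threshold and the example dates are assumptions, not the timer or tooling actually planned in this task. `ssl.cert_time_to_seconds` comes from the Python standard library and parses the `notAfter` format printed by OpenSSL.

```python
import ssl

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: float) -> float:
    """Return the days between `now` (epoch seconds) and the cert's notAfter time."""
    return (ssl.cert_time_to_seconds(not_after) - now) / SECONDS_PER_DAY


def renewal_forthcoming(not_after: str, now: float, threshold_days: int = 14) -> bool:
    """True when the certificate expires within `threshold_days` (hypothetical default)."""
    return days_until_expiry(not_after, now) <= threshold_days


# Hypothetical dates, purely for illustration.
expiry = "Jun  1 00:00:00 2024 GMT"
one_week_before = ssl.cert_time_to_seconds("May 25 00:00:00 2024 GMT")
one_month_before = ssl.cert_time_to_seconds("May  1 00:00:00 2024 GMT")

print(renewal_forthcoming(expiry, one_week_before))   # inside the 14-day window
print(renewal_forthcoming(expiry, one_month_before))  # still outside the window
```

A timer-driven job would pass the current time as `now` and send the notification whenever `renewal_forthcoming` returns true.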

Mon, May 13, 11:59 AM · Patch-For-Review, Fundraising-Backlog
MoritzMuehlenhoff updated the task description for T360596: Figure out a plan to move forward with regarding Redis License changes.
Mon, May 13, 11:53 AM · GitLab (Infrastructure), Patch-For-Review, User-aborrero, serviceops, MediaWiki-Platform-Team (Radar), collaboration-services, Release-Engineering-Team (Radar), Quarry, Toolforge, Software-Licensing, Infrastructure-Foundations, netbox, Platform Team Initiatives (API Gateway), ChangeProp, MediaWiki-File-management, SRE
MoritzMuehlenhoff added a comment to T360596: Figure out a plan to move forward with regarding Redis License changes.

Redict is now packaged in Debian: https://tracker.debian.org/pkg/redict

Mon, May 13, 11:52 AM · GitLab (Infrastructure), Patch-For-Review, User-aborrero, serviceops, MediaWiki-Platform-Team (Radar), collaboration-services, Release-Engineering-Team (Radar), Quarry, Toolforge, Software-Licensing, Infrastructure-Foundations, netbox, Platform Team Initiatives (API Gateway), ChangeProp, MediaWiki-File-management, SRE
MoritzMuehlenhoff renamed T364622: Review/cleanup content of /srv/private/modules/secret/secrets/ssl in the private repo from Review/cleanup content of /srv/private/modules/secret/secrets in the private repo to Review/cleanup content of /srv/private/modules/secret/secrets/ssl in the private repo.
Mon, May 13, 11:43 AM · Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Mon, May 13, 7:23 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE

Fri, May 10

MoritzMuehlenhoff created T364622: Review/cleanup content of /srv/private/modules/secret/secrets/ssl in the private repo.
Fri, May 10, 1:47 PM · Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff closed T364016: Q4:magru VM tracking task as Resolved.
Fri, May 10, 10:12 AM · Traffic, Infrastructure-Foundations
MoritzMuehlenhoff updated the task description for T357750: Phase out cergen.
Fri, May 10, 5:54 AM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T360414: Phase out cergen for Observability services.

Nice work!

Fri, May 10, 5:51 AM · Patch-For-Review, SRE Observability (FY2023/2024-Q4), observability, SRE

Thu, May 9

Dzahn awarded T360414: Phase out cergen for Observability services a Barnstar token.
Thu, May 9, 3:11 PM · Patch-For-Review, SRE Observability (FY2023/2024-Q4), observability, SRE

Wed, May 8

MoritzMuehlenhoff added a comment to T362681: Provide nodejs20 base images for production.

I kicked off a build of the node20 image, it should hopefully appear in the registry soon.

Wed, May 8, 2:59 PM · serviceops
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Wed, May 8, 1:55 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff closed T279509: git-fat replacement/removal as Resolved.

Awesome, it is great to see the repositories have been migrated. I am reopening this one, though, to uninstall the git-fat Debian package: it is based on Python 2, and that is blocking the migration out of Buster/Bullseye. We still have a bunch of references to it in Puppet.

Ideally git-fat would be purged and then python2 removed; several Puppet roles have:

Wed, May 8, 7:29 AM · Release-Engineering-Team (Yakisfaction)
MoritzMuehlenhoff closed T364373: Remove git-fat from Puppet as Resolved.

The remaining traces of git-fat have been removed from Puppet and git-fat uninstalled fleet-wide.

Wed, May 8, 7:23 AM · Release-Engineering-Team (Yakisfaction)
MoritzMuehlenhoff closed T364373: Remove git-fat from Puppet, a subtask of T279509: git-fat replacement/removal, as Resolved.
Wed, May 8, 7:23 AM · Release-Engineering-Team (Yakisfaction)

Tue, May 7

MoritzMuehlenhoff updated the task description for T357144: Integrate Bullseye 11.9 point update.
Tue, May 7, 3:07 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Tue, May 7, 10:45 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T279509: git-fat replacement/removal.

Awesome, it is great to see the repositories have been migrated. I am reopening this one, though, to uninstall the git-fat Debian package: it is based on Python 2, and that is blocking the migration out of Buster/Bullseye. We still have a bunch of references to it in Puppet.

Ideally git-fat would be purged and then python2 removed; several Puppet roles have:

# python 2 is required for git-fat
profile::base::remove_python2_on_bullseye: false

There are also all the bits in Archiva which were made to support git-fat, but I know Archiva is no longer actively maintained and will be entirely phased out from our infrastructure (e.g. T358612). So I guess the supporting bits will be removed as Archiva is removed.

Tue, May 7, 8:54 AM · Release-Engineering-Team (Yakisfaction)
MoritzMuehlenhoff created T364373: Remove git-fat from Puppet.
Tue, May 7, 8:52 AM · Release-Engineering-Team (Yakisfaction)

Mon, May 6

MoritzMuehlenhoff added a comment to T364342: Switch Gerrit from Java 11 to Java 17.

gerrit1003 is on Bullseye, not Buster. Bullseye also provides OpenJDK 17 in parallel to 11, so you can switch to 17 without any OS change, with just a simple Hiera change in profile::java.

Mon, May 6, 8:40 PM · Release-Engineering-Team, Gerrit, collaboration-services
MoritzMuehlenhoff added a comment to T364302: Start the Mitre CNA Partner Process for the Wikimedia Foundation.

+1 on becoming a CNA for MediaWiki core and extensions.

Mon, May 6, 10:25 AM · Security-Team
MoritzMuehlenhoff renamed T357760: CVE-2024-34506: Denial of service vector via GET request to Special:MovePage on pages with thousands of subpages from CVE-2024-: Denial of service vector via GET request to Special:MovePage on pages with thousands of subpages to CVE-2024-34506: Denial of service vector via GET request to Special:MovePage on pages with thousands of subpages.
Mon, May 6, 10:22 AM · MW-1.42-notes (1.42.0-wmf.26; 2024-04-09), MW-1.41-notes, MW-1.40-notes, MW-1.39-notes, SecTeam-Processed, Patch-For-Review, MediaWiki-Page-rename, Vuln-DoS, Security, Security-Team
MoritzMuehlenhoff renamed T355538: CVE-2024-34507: XSS in edit summary parser from CVE-2024-: XSS in edit summary parser to CVE-2024-34507: XSS in edit summary parser.
Mon, May 6, 10:22 AM · MW-1.42-notes (1.42.0-wmf.25; 2024-04-02), SecTeam-Processed, Patch-For-Review, Vuln-XSS, Security, Security-Team
MoritzMuehlenhoff updated the task description for T364016: Q4:magru VM tracking task.
Mon, May 6, 9:56 AM · Traffic, Infrastructure-Foundations

Fri, May 3

MoritzMuehlenhoff closed T363978: Set up Ganeti clusters in magru as Resolved.

The two clusters (magru01 and magru02) are set up and initial VMs have already been created.

Fri, May 3, 11:44 AM · Ganeti, Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T364016: Q4:magru VM tracking task.
Fri, May 3, 11:39 AM · Traffic, Infrastructure-Foundations

Thu, May 2

MoritzMuehlenhoff updated the task description for T364016: Q4:magru VM tracking task.
Thu, May 2, 3:37 PM · Traffic, Infrastructure-Foundations
MoritzMuehlenhoff updated the task description for T364016: Q4:magru VM tracking task.
Thu, May 2, 3:33 PM · Traffic, Infrastructure-Foundations
MoritzMuehlenhoff triaged T364016: Q4:magru VM tracking task as High priority.
Thu, May 2, 3:31 PM · Traffic, Infrastructure-Foundations
MoritzMuehlenhoff updated the task description for T357750: Phase out cergen.
Thu, May 2, 12:58 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Thu, May 2, 12:23 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T363996: Sessionstore's discovery TLS cert will expire before end of May 2024.

This certificate doesn't show up anywhere in certificate.manifests.d for cergen, though?

Thu, May 2, 11:54 AM · serviceops, Data-Persistence
MoritzMuehlenhoff claimed T363978: Set up Ganeti clusters in magru.
Thu, May 2, 9:18 AM · Ganeti, Infrastructure-Foundations, SRE
MoritzMuehlenhoff created T363978: Set up Ganeti clusters in magru.
Thu, May 2, 9:18 AM · Ganeti, Infrastructure-Foundations, SRE

Tue, Apr 30

MoritzMuehlenhoff updated the task description for T362730: Q4:rack/setup/install magru misc servers.
Tue, Apr 30, 2:55 PM · Traffic, netops, ops-magru, Infrastructure-Foundations, DC-Ops
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Tue, Apr 30, 1:55 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T362730: Q4:rack/setup/install magru misc servers.
Tue, Apr 30, 1:53 PM · Traffic, netops, ops-magru, Infrastructure-Foundations, DC-Ops
MoritzMuehlenhoff updated the task description for T362730: Q4:rack/setup/install magru misc servers.
Tue, Apr 30, 12:51 PM · Traffic, netops, ops-magru, Infrastructure-Foundations, DC-Ops
MoritzMuehlenhoff added a comment to T361286: Fatal error detected on elastic2088.

I've set the server back to "Active" in Netbox.

Tue, Apr 30, 11:04 AM · SRE, ops-codfw, Data-Platform-SRE

Fri, Apr 26

MoritzMuehlenhoff added a comment to T363559: eqiad: 1 VMs requested for ceph cluster administration (cephadm).

Looks good. Best to create it in group D

Fri, Apr 26, 10:47 AM · SRE, vm-requests
MoritzMuehlenhoff added a comment to T363452: Striker/Horizon are running in Blubber built containers with a runtime UID that does not exist on the host machine.

Those are running in containers, and the user does exist in the container:

Fri, Apr 26, 7:58 AM · Horizon, Striker, cloud-services-team

Thu, Apr 25

MoritzMuehlenhoff added a comment to T361087: backup1005 crashed.

Booting failed (PXE):

PXELINUX 6.03 lwIP 20150819 Copyright (C) 1994-2014 H. Peter Anvin et al


Debian 12 (bookworm) amd64 (Wikimedia edition)

                                              boot: 
Loading debian-installer/amd64/linux... ok
Loading debian-installer/amd64/initrd.gz...
Boot failed: press a key to retry, or wait for reset...

Hmm. Not sure if we've seen this problem before. DHCP clearly worked as did the debian image download, but Linux failed to load for some reason.

@jcrespo the only difference was selecting bullseye rather than bookworm on the second attempt?

Yes. Check with @MoritzMuehlenhoff; he did something to fix something, but I'm not sure what, or if it applies here.

Thu, Apr 25, 3:36 PM · SRE, ops-eqiad, DC-Ops, Data-Persistence-Backup, media-backups
MoritzMuehlenhoff updated the task description for T357750: Phase out cergen.
Thu, Apr 25, 3:22 PM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T363399: Q4:rack/setup/install parsoidtest1001.

Will parsoidtest1001 be installed with Bullseye? scandium is currently running buster, but all the mediawiki manifests are compatible with bullseye (cloudweb already runs it), and so is the component/php74.

Thu, Apr 25, 3:11 PM · Patch-For-Review, SRE, serviceops, ops-eqiad, DC-Ops
MoritzMuehlenhoff added a comment to T362981: Migrate mw-on-k8s base image from buster to bullseye.

All the custom PHP extensions are already fully rebuilt for the component/php74 for bullseye! And being kept updated, e.g. the recent PHP security fixes were backported to both component/icu67 (buster) and component/php74 (bullseye).

In fact we're already using them on the cloudweb hosts (and the snapshot hosts using bullseye). So from that perspective the migration could happen any time, as long as we're okay with the remaining percentage of baremetal traffic running on Buster until it's fully shrunk to zero.

Hmm. Locally I get:

$ docker-pkg -c config.yaml --info build images/php --select '*/php7.4-cli:*'

2024-04-19 09:13:25 [docker-pkg-build] INFO - E: Unable to locate package php7.4-excimer
E: Couldn't find any package by glob 'php7.4-excimer'
E: Couldn't find any package by regex 'php7.4-excimer'
Thu, Apr 25, 3:01 PM · Patch-For-Review, serviceops, MW-on-K8s
MoritzMuehlenhoff added a comment to T291916: Tracking task for Bullseye migrations in production.

Ok, fair enough about the tracking task. But don't we still need some kind of task that someone can take to do the actual upgrade work? So all the subtasks without the tracking parent task?

Thu, Apr 25, 2:15 PM · Epic, Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T291916: Tracking task for Bullseye migrations in production.

@Muehlenhoff Where does deploy* (deployment_server role both prod and wmcs) fit in? Since we are still on buster there. But want bullseye deployment_servers in cloud VPS projects and production hasn't upgraded the role yet. A legit subtask for here?

Thu, Apr 25, 12:42 PM · Epic, Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Thu, Apr 25, 12:36 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff created T363452: Striker/Horizon are running in Blubber built containers with a runtime UID that does not exist on the host machine.
Thu, Apr 25, 9:43 AM · Horizon, Striker, cloud-services-team
MoritzMuehlenhoff added a comment to T284145: Clean up OTRS/Znuny addresses handles by gsuite.

@jbond my utmost apologies for not replying to this earlier! These errors can be ignored, they will never go away AFAIK. These are the result of email addresses that used to be processed through OTRS/Znuny LTS, but were usurped by gsuite handling. When this occurs the old emails were invalidated and disabled from OTRS/Znuny LTS, but they cannot be deleted and so they will forever log errors unless/until the ability to delete unused emails is added as functionality.

Thu, Apr 25, 7:55 AM · collaboration-services, Infrastructure-Foundations, Mail, Znuny, User-jbond

Wed, Apr 24

MoritzMuehlenhoff added a comment to T362746: Upgrade s4 to MariaDB 10.6.
Wed, Apr 24, 1:46 PM · DBA
MoritzMuehlenhoff added a comment to T363310: Site: codfw 1 VM request for staging-codfw kube-apiserver.

Looks good. We can't disable DRBD on instance creation currently, simply add it as usual and then you can use the sre.ganeti.changedisk cookbook to switch to plain disks.

Wed, Apr 24, 11:30 AM · Patch-For-Review, vm-requests, Infrastructure-Foundations, SRE, serviceops, Prod-Kubernetes, Kubernetes

Tue, Apr 23

MoritzMuehlenhoff added a comment to T363125: sustainability of wikitech.wikimedia.org.

I started out thinking that but I don't think it's so easy to switch to RO ldap. The ldap mw extension will still expect to be able to change things (email address, for example)

Tue, Apr 23, 1:38 PM · wikitech.wikimedia.org, Security, Epic, cloud-services-team
MoritzMuehlenhoff added a comment to T360439: Phase out cergen for Search Platform services.

I'll check to see if there is any code ready to deploy cfssl based certificates for nginx.

Tue, Apr 23, 12:50 PM · Data-Platform-SRE (2024.04.15 - 2024.05.05), SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Tue, Apr 23, 10:36 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T359767: Wrong time zone for Kazakhstan (defined by Debian tzdata package).

@MuratKaribay Can you please retry? I just tested changing the timezone to Almaty and Kostanai and it gave me a five hour offset.

Tue, Apr 23, 9:36 AM · Infrastructure-Foundations
MoritzMuehlenhoff renamed T362628: Find a way to stage updated OS packages on wikikube from Find a way to stage updated PHP packages on wikikube to Find a way to stage updated OS packages on wikikube.
Tue, Apr 23, 8:36 AM · Release-Engineering-Team, serviceops, MW-on-K8s, Scap
MoritzMuehlenhoff added a comment to T363125: sustainability of wikitech.wikimedia.org.

B: Fishbowl wiki hosted on wikikube, accounts in ldap. This option could be a final state OR a temporary state on the way to the SUL option.
Con (long-term): Allowing r/w ldap access from wikitech/wikikube may continue to introduce surprising edge cases for product maintenance.

I don't think Wikitech will require r/w access after T359544: Disable SSH key management on Wikitech is done.

Tue, Apr 23, 7:55 AM · wikitech.wikimedia.org, Security, Epic, cloud-services-team
MoritzMuehlenhoff created T363128: Check home/HDFS leftovers of dbad2021.
Tue, Apr 23, 6:54 AM · Data-Platform-SRE (2024.05.06 - 2024.05.26)
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Tue, Apr 23, 6:42 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE

Fri, Apr 19

MoritzMuehlenhoff added a comment to T362981: Migrate mw-on-k8s base image from buster to bullseye.

All the custom PHP extensions are already fully rebuilt for the component/php74 for bullseye! And being kept updated, e.g. the recent PHP security fixes were backported to both component/icu67 (buster) and component/php74 (bullseye).

Fri, Apr 19, 1:11 PM · Patch-For-Review, serviceops, MW-on-K8s
MoritzMuehlenhoff claimed T362681: Provide nodejs20 base images for production.

That's not a problem. We should just use the nodesource packages for this; we've been doing the same for "intermediate LTSes" before (e.g. node 16 or node 14) not covered by an in-tree Debian nodejs version. I'll work on this next week.

Fri, Apr 19, 12:07 PM · serviceops
MoritzMuehlenhoff created P61011 Example output of Puppet generator to create /var/lib/ganeti/known_hosts.
Fri, Apr 19, 11:00 AM

Thu, Apr 18

MoritzMuehlenhoff added a comment to T362107: eqiad: 3x VM request for new opensearch cluster.

Looks good to me.

Thu, Apr 18, 3:32 PM · Data-Platform-SRE, vm-requests, Infrastructure-Foundations, SRE
MoritzMuehlenhoff closed T357133: Integrate Bookworm 12.5 point update as Resolved.

This is complete

Thu, Apr 18, 2:32 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T357133: Integrate Bookworm 12.5 point update.
Thu, Apr 18, 2:31 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T357133: Integrate Bookworm 12.5 point update.
Thu, Apr 18, 2:31 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T357133: Integrate Bookworm 12.5 point update.
Thu, Apr 18, 2:23 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Thu, Apr 18, 1:50 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Thu, Apr 18, 12:31 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff claimed T227650: Migrate web services using LDAP authentication towards the readonly LDAP replicas.
Thu, Apr 18, 11:17 AM · User-jbond, LDAP, SRE
MoritzMuehlenhoff placed T227650: Migrate web services using LDAP authentication towards the readonly LDAP replicas up for grabs.
Thu, Apr 18, 11:17 AM · User-jbond, LDAP, SRE
MoritzMuehlenhoff added a comment to T362628: Find a way to stage updated OS packages on wikikube.
  • If a new image is found to be okay, have some script/option/tool to promote the current staging image as the new main production image

Do we want this promotion to be forced, or should we rather rely on the Monday update promoting it automatically?

Thu, Apr 18, 11:16 AM · Release-Engineering-Team, serviceops, MW-on-K8s, Scap
MoritzMuehlenhoff added a comment to T362852: Jenkins core security advisory 2024-04-17.

I am fine skipping this update.

Thu, Apr 18, 10:57 AM · Security, Release-Engineering-Team, Continuous-Integration-Infrastructure, Jenkins
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Thu, Apr 18, 10:10 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Thu, Apr 18, 9:40 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T362852: Jenkins core security advisory 2024-04-17.

Do you still want me to import the latest LTS release (like for non-security fixes) or shall we skip this update entirely?

Thu, Apr 18, 9:02 AM · Security, Release-Engineering-Team, Continuous-Integration-Infrastructure, Jenkins
MoritzMuehlenhoff closed T359767: Wrong time zone for Kazakhstan (defined by Debian tzdata package) as Resolved.

Updated packages have been released and installed on all Wikimedia production systems. Cloud VPS instances are automatically updated via a nightly cron job, so should generally also be updated by now.

Thu, Apr 18, 7:25 AM · Infrastructure-Foundations

Wed, Apr 17

MoritzMuehlenhoff added a comment to T350179: Reimage cookbook on new eqiad hosts stuck at PXE booting.

@Papaul deserves a lot of love for fixing this persistent issue. The 21.x firmware (specifically, Network_Firmware_YK81Y_WN64_21.60.22.11_03) worked on the first attempt when reimaging cp1114. I think we can consider this closed given we have observed the fix on two hosts now.

Thanks, 1-800-Call-Papaul!

Wed, Apr 17, 3:25 PM · SRE, Traffic, SRE-swift-storage, ops-codfw, DC-Ops, ops-eqiad
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Wed, Apr 17, 2:44 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T362628: Find a way to stage updated OS packages on wikikube.

This isn't just limited to updating PHP, but also extends to the full OS stack underneath (libs used by PHP etc). When those currently get updated in Debian (via a security update or a point release), for baremetal I check the context of how it's used, keep an eye on regressions and necessary restarts, and usually roll out some canaries first. Typically there are always a few updates in flight at any point in time.

Wed, Apr 17, 2:02 PM · Release-Engineering-Team, serviceops, MW-on-K8s, Scap
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Wed, Apr 17, 1:44 PM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Wed, Apr 17, 11:52 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Wed, Apr 17, 10:04 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T349619: Migrate roles to puppet7.
Wed, Apr 17, 8:46 AM · Patch-For-Review, Data-Platform-SRE (2024.05.06 - 2024.05.26), serviceops, collaboration-services, SRE-tools, Puppet-Core, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T362678: Package request: install elixir and erlang-otp to the analytics clients.

It's worth noting that the stat hosts are on Bullseye/Debian 11, which provides the following versions:

  • Erlang 23.2.6
  • Elixir 1.10.3
  • cmake 3.18.4
Wed, Apr 17, 7:22 AM · Data-Platform-SRE, Data-Engineering
MoritzMuehlenhoff updated the task description for T357750: Phase out cergen.
Wed, Apr 17, 7:16 AM · Patch-For-Review, Puppet-Infrastructure, Puppet (Puppet 7.0), Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T362518: Deprecate buster-backports.

This has also broken building CI images. Will have to migrate them to bullseye immediately, I suppose.

Wed, Apr 17, 7:01 AM · Infrastructure-Foundations, Release-Engineering-Team, serviceops