
Today

MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Wed, Oct 20, 7:54 AM · Infrastructure-Foundations, SRE

Yesterday

MoritzMuehlenhoff added a comment to T271736: Migrate WMF Production from PHP 7.2 to PHP 7.4.

There's no reason for T263437 to be a sub task? It's unrelated work and only needed when we move to a new OS (with a new ICU), but not when we merely migrate to a new PHP release.

Tue, Oct 19, 5:37 PM · serviceops
MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Tue, Oct 19, 12:22 PM · Infrastructure-Foundations, SRE

Mon, Oct 18

MoritzMuehlenhoff updated subscribers of T293676: Check home/HDFS leftovers of tonina.

Point of contact for any data which might possibly need to be retained is @WMDE-leszek

Mon, Oct 18, 6:08 PM · Analytics
MoritzMuehlenhoff added a comment to T293621: Offboard Tonina Zhelyazkova from WMF systems.

Created https://phabricator.wikimedia.org/T293676 for the Data Engineering team to review data in HDFS/stat homes.

Mon, Oct 18, 6:05 PM · SRE, Gerrit-Privilege-Requests, SRE-Access-Requests, LDAP-Access-Requests
MoritzMuehlenhoff created T293676: Check home/HDFS leftovers of tonina.
Mon, Oct 18, 6:05 PM · Analytics
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 18, 9:25 AM · Infrastructure-Foundations, SRE

Fri, Oct 15

MoritzMuehlenhoff added a comment to T268233: Thanos and Grafana lose the session after an hour.

Thanks, Timo. This quarter we're drafting our plans for 2FA requirements (and the implied tradeoffs between convenience and security). We can/will also revisit session times as part of that.

Fri, Oct 15, 10:02 AM · Observability-Metrics, Infrastructure-Foundations, User-jbond, CAS-SSO, SRE

Thu, Oct 14

MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Thu, Oct 14, 4:36 PM · Infrastructure-Foundations, SRE

Wed, Oct 13

MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Wed, Oct 13, 11:59 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff created T293186: Update to CAS 6.4.
Wed, Oct 13, 8:28 AM · SRE, Infrastructure-Foundations, CAS-SSO

Tue, Oct 12

MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Tue, Oct 12, 5:11 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Tue, Oct 12, 5:09 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Tue, Oct 12, 3:21 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Tue, Oct 12, 3:06 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Tue, Oct 12, 1:52 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Tue, Oct 12, 1:50 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Tue, Oct 12, 1:31 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Tue, Oct 12, 12:20 PM · Infrastructure-Foundations, SRE

Mon, Oct 11

MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 2:36 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 2:34 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 2:32 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 2:12 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 1:52 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a project to T292942: Migrate OpenLDAP to MDB backend: LDAP.
Mon, Oct 11, 9:32 AM · LDAP, Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 9:20 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Mon, Oct 11, 8:43 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Mon, Oct 11, 8:43 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 8:39 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292844: Integrate Bullseye 11.1 point update.
Mon, Oct 11, 8:38 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Mon, Oct 11, 8:22 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff updated the task description for T292838: Integrate Buster 10.11 point update.
Mon, Oct 11, 8:21 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T292289: Toolforge mono version on stretch grid doesn't trust latest LE certs.

In the latest upload of ca-certificates to Debian unstable, the old X3 cert has now been removed:
https://tracker.debian.org/news/1265630/accepted-ca-certificates-20211004-source-into-unstable/

Mon, Oct 11, 8:17 AM · Cloud-VPS, Toolforge, cloud-services-team (Kanban)
MoritzMuehlenhoff created T292942: Migrate OpenLDAP to MDB backend.
Mon, Oct 11, 8:00 AM · LDAP, Infrastructure-Foundations, SRE
MoritzMuehlenhoff closed T164456: Migrate to nginx-light as Resolved.

Everything that doesn't need features from -extras or -full has been migrated to -light.

Mon, Oct 11, 7:26 AM · Patch-For-Review, User-jbond, User-MoritzMuehlenhoff, User-ArielGlenn, SRE

Fri, Oct 8

MoritzMuehlenhoff triaged T292844: Integrate Bullseye 11.1 point update as Medium priority.
Fri, Oct 8, 1:41 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff created T292844: Integrate Bullseye 11.1 point update.
Fri, Oct 8, 1:41 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff triaged T292838: Integrate Buster 10.11 point update as Medium priority.
Fri, Oct 8, 1:03 PM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff created T292838: Integrate Buster 10.11 point update.
Fri, Oct 8, 1:02 PM · Infrastructure-Foundations, SRE

Tue, Oct 5

MoritzMuehlenhoff added a comment to T292503: Rebuild Routinator (rpki) VMs with larger disk.

I've added routinator to apt.wikimedia.org at "thirdparty/routinator" for bullseye-wikimedia and adapted the Puppet code, so that when these VMs get reinstalled with Bullseye, the thirdparty component is picked.

Tue, Oct 5, 4:35 PM · SRE, Infrastructure-Foundations, netops
MoritzMuehlenhoff added a comment to T283165: OpenSSL < 1.1.0 compatibility issues with new LE issuance chain.

With T291458 done, I've already rebuilt bullseye (which was not affected) and buster main images (with libgnutls30 3.6.7-4+deb10u7), so I think the base layers are done.

I'll delete docker-registry.wikimedia.org/wikimedia/mediawiki-services-graphoid:2019-06-10-060747-production as graphoid is no longer around.

I'll also rebuild

  • docker-registry.wikimedia.org/nodejs-slim:0.0.2-20210912
  • docker-registry.wikimedia.org/ruby:0.0.2-20210912
Tue, Oct 5, 4:30 PM · Patch-For-Review, Infrastructure-Foundations, SRE, Traffic
MoritzMuehlenhoff added a comment to T292503: Rebuild Routinator (rpki) VMs with larger disk.

@ayounsi Riccardo suggested maybe using a separate disk/partition for the routinator data? That was partly to just do a quick dirty job and not rebuild, but we've reason to rebuild anyway so let's do that.

Do you think it would still make sense to have a separate disk/partition for the Routinator data?

Tue, Oct 5, 10:18 AM · SRE, Infrastructure-Foundations, netops
MoritzMuehlenhoff added a comment to T292503: Rebuild Routinator (rpki) VMs with larger disk.

https://packages.nlnetlabs.nl/ also provides the routinator debs for bullseye (plus it's a static Rust binary anyway), so if we're recreating the VMs anyway, let's also switch to Bullseye?

Tue, Oct 5, 9:51 AM · SRE, Infrastructure-Foundations, netops
MoritzMuehlenhoff added a comment to T291387: Ensure Cloud Services platforms will accept new LE issuance chain.

Hi there, just wanted to share that I worked around this issue in the py2 web situation by switching to PyOpenSSL, which brings along a newer version of OpenSSL. The changes were pretty minimal and can be seen here: https://github.com/hatnote/montage/commit/1be5d09ff5b80e2a57eb71802096fc1fcb98e60f

More papertrail available here.

A technical detail which may be of some help: The Python on the Jessie image we were using was linking against OpenSSL 1.0.0, even though 1.0.2 was available, but openssl-dev appears to have been removed from the Wikimedia apt repo, so it was nontrivial to rebuild against the newer SSL.
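To confirm which OpenSSL a given Python interpreter is actually linked against, the standard library's `ssl` module exposes the version directly. The following is only a minimal sketch; the helper name `linked_openssl_at_least` and the 1.0.2 threshold are illustrative assumptions, not something taken from the linked commit.

```python
import ssl


def linked_openssl_at_least(minimum):
    """Return True if the OpenSSL linked into this interpreter is >= minimum.

    `minimum` is a (major, minor, fix) tuple, e.g. (1, 0, 2).
    Hypothetical helper, for illustration only.
    """
    # OPENSSL_VERSION_INFO is a 5-tuple; the first three fields are
    # major, minor and fix.
    return ssl.OPENSSL_VERSION_INFO[:3] >= tuple(minimum)


if __name__ == "__main__":
    # Human-readable version string of the linked library.
    print(ssl.OPENSSL_VERSION)
    # Assumed threshold: 1.0.2, the newer version mentioned in the comment.
    print(linked_openssl_at_least((1, 0, 2)))
```

Running this inside the affected image would show whether the interpreter picked up the older library even though a newer one is installed.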

Tue, Oct 5, 7:35 AM · PAWS, Cloud-VPS, Toolforge, cloud-services-team (Kanban)

Mon, Oct 4

MoritzMuehlenhoff added a comment to T291387: Ensure Cloud Services platforms will accept new LE issuance chain.

"may be affected" I should have said on buster.

Mon, Oct 4, 7:21 AM · PAWS, Cloud-VPS, Toolforge, cloud-services-team (Kanban)

Fri, Oct 1

herron awarded T286911: Upgrade MXes to Bullseye a Party Time token.
Fri, Oct 1, 7:13 PM · SRE, Patch-For-Review, Infrastructure-Foundations, Mail
MoritzMuehlenhoff added a comment to T292289: Toolforge mono version on stretch grid doesn't trust latest LE certs.

Related is https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=995432 with a comment by our own @BBlack. From chat seen on IRC I think @BBlack is working on some Puppet code to remove the DST Root CA X3 cert from the system trust store for WMF servers.

Fri, Oct 1, 3:14 PM · Cloud-VPS, Toolforge, cloud-services-team (Kanban)

Thu, Sep 30

MoritzMuehlenhoff closed T286911: Upgrade MXes to Bullseye as Resolved.

mx1001/mx2001 have been reimaged to Bullseye (reusing the VM/IP for potential IP reputation issues).

Thu, Sep 30, 12:21 PM · SRE, Patch-For-Review, Infrastructure-Foundations, Mail
MoritzMuehlenhoff added a comment to T287567: Znuny/OTRS security issues (CVE-2021-36092, CVE-2021-36091, CVE-2021-21443, CVE-2021-21440).

Znuny 6.0.36 fixes the following issues: (https://www.znuny.org/en/releases/znuny-6-0-36)

Thu, Sep 30, 12:03 PM · Znuny, Security

Tue, Sep 28

MoritzMuehlenhoff closed T275873: Prepare our base system layer for Debian 11/bullseye as Resolved.

Bullseye preparations have completed and it's in active use, closing. For future migration tracking, T291916 can be used.

Tue, Sep 28, 10:59 AM · Patch-For-Review, SRE
MoritzMuehlenhoff triaged T291916: Tracking task for Bullseye migrations in production as Medium priority.
Tue, Sep 28, 10:59 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a parent task for T288804: Move the Data Engineering infrastructure to Debian Bullseye: T291916: Tracking task for Bullseye migrations in production.
Tue, Sep 28, 10:58 AM · Analytics-Clusters
MoritzMuehlenhoff added a subtask for T291916: Tracking task for Bullseye migrations in production: T288804: Move the Data Engineering infrastructure to Debian Bullseye.
Tue, Sep 28, 10:58 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a parent task for T289135: Upgrade Cirrus Elasticsearch clusters to Debian Bullseye: T291916: Tracking task for Bullseye migrations in production.
Tue, Sep 28, 10:57 AM · Discovery-Search, SRE
MoritzMuehlenhoff added a subtask for T291916: Tracking task for Bullseye migrations in production: T289135: Upgrade Cirrus Elasticsearch clusters to Debian Bullseye.
Tue, Sep 28, 10:57 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff created T291916: Tracking task for Bullseye migrations in production.
Tue, Sep 28, 10:56 AM · Infrastructure-Foundations, SRE
MoritzMuehlenhoff added a comment to T289624: Q1: (Need By: TBD) rack/setup/install centrallog2002.codfw.wmnet.

@MoritzMuehlenhoff I don't know if you saw my comment on Sep 10th, but I am having an issue installing Bullseye. I am getting the error below:

Failed to load ldlinux.c32
Boot failed: press a key to retry, or wait for reset...

Thanks

Tue, Sep 28, 8:41 AM · SRE, SRE Observability (FY2021/2022-Q1), ops-codfw, DC-Ops

Mon, Sep 27

herron awarded T288028: Remove the "Long running screen/tmux" Icinga check a Party Time token.
Mon, Sep 27, 1:54 PM · Observability-Alerting, Patch-For-Review, SRE
lmata awarded T288028: Remove the "Long running screen/tmux" Icinga check a Like token.
Mon, Sep 27, 1:37 PM · Observability-Alerting, Patch-For-Review, SRE
MoritzMuehlenhoff closed T288028: Remove the "Long running screen/tmux" Icinga check as Resolved.

Check is now gone.

Mon, Sep 27, 10:29 AM · Observability-Alerting, Patch-For-Review, SRE
MoritzMuehlenhoff added a comment to T268985: Improve user experience for Kerberos by creating automatic token renewal service.

Looking at /var/log/installer it seems stat1005 was installed in 2019 with Stretch and then later on dist-upgraded to buster (something we rarely do since we prefer reimages, but it happens). Installing usrmerge in this case (and let's check whether other stat* hosts have the same issue) sounds good to me.

Mon, Sep 27, 10:27 AM · Patch-For-Review, Analytics-Kanban, User-MoritzMuehlenhoff, Analytics-Clusters

Fri, Sep 24

MoritzMuehlenhoff added a comment to T286911: Upgrade MXes to Bullseye.

The two VMs (mx1002/mx2002) which were used to test the Bullseye setup have been taken down.

Fri, Sep 24, 10:50 AM · SRE, Patch-For-Review, Infrastructure-Foundations, Mail

Thu, Sep 23

MoritzMuehlenhoff added a comment to T286911: Upgrade MXes to Bullseye.

Both mx1001 and mx2001 are now running Bullseye. There's a little cleanup/followup work, but the core of the work is completed.

Thu, Sep 23, 2:37 PM · SRE, Patch-For-Review, Infrastructure-Foundations, Mail
MoritzMuehlenhoff added a comment to T290982: Support expired tile deduplication.

I can easily rebuild/upload a fixed package for apt.wikimedia.org, though. Just let me know.

Thu, Sep 23, 12:52 PM · Product-Infrastructure-Team-Backlog (Kanban), Maps

Tue, Sep 21

MoritzMuehlenhoff added a comment to T291387: Ensure Cloud Services platforms will accept new LE issuance chain.

The expected version numbers are
openssl1.0: 1.0.2u-1~deb9u5
gnutls28: 3.5.8-5+deb9u6

Tue, Sep 21, 8:14 AM · PAWS, Cloud-VPS, Toolforge, cloud-services-team (Kanban)
MoritzMuehlenhoff added a comment to T291425: Rebuild CI images affected by OpenSSL compat issue with new Let's Encrypt issuance chain.

The expected version numbers are
openssl1.0: 1.0.2u-1~deb9u5
gnutls28: 3.5.8-5+deb9u6

Tue, Sep 21, 8:14 AM · Patch-For-Review, Release-Engineering-Team (Done by Wed 06 Oct), Continuous-Integration-Config
MoritzMuehlenhoff created T291458: Rebuild production Stretch images with GNUTLS/OpenSSL updates for LE issue chain update.
Tue, Sep 21, 8:13 AM · Patch-For-Review, serviceops, SRE

Mon, Sep 20

MoritzMuehlenhoff added a comment to T283714: Python 3's eventlet.green getaddrinfo timeout in Bullseye.

I was able to get a working python3-eventlet package by integrating the upstream PR. The easy solution for now, IMHO, is to upload the package internally for Bullseye.

Mon, Sep 20, 7:17 PM · Patch-For-Review, SRE, User-fgiunchedi, SRE-swift-storage
MoritzMuehlenhoff added a comment to T290982: Support expired tile deduplication.

Script is now deployed on the masters

Mon, Sep 20, 3:37 PM · Product-Infrastructure-Team-Backlog (Kanban), Maps
MoritzMuehlenhoff updated the task description for T290982: Support expired tile deduplication.
Mon, Sep 20, 3:37 PM · Product-Infrastructure-Team-Backlog (Kanban), Maps
MoritzMuehlenhoff added a comment to T283165: OpenSSL < 1.1.0 compatibility issues with new LE issuance chain.

For production:

  • OpenSSL in Buster and Bullseye is not affected (only ship OpenSSL 1.1)
  • OpenSSL updates for openssl 1.0.2 in Stretch have been rolled out
  • GNUTLS in Bullseye is not affected
  • GNUTLS in Buster was already fixed in Buster 10.10 (rolled out via T285206)
  • GNUTLS updates for Stretch have been rolled out
Mon, Sep 20, 2:28 PM · Patch-For-Review, Infrastructure-Foundations, SRE, Traffic

Sep 20 2021

MoritzMuehlenhoff added a comment to T290982: Support expired tile deduplication.

Should be fixed here after this change.

Sep 20 2021, 8:00 AM · Product-Infrastructure-Team-Backlog (Kanban), Maps
MoritzMuehlenhoff created T291353: Check home/HDFS leftovers of mholloway-shell.
Sep 20 2021, 6:36 AM · Analytics-Kanban, Analytics

Sep 17 2021

MoritzMuehlenhoff added a comment to T290982: Support expired tile deduplication.

@Jgiannelos One of the tests fails with Python 3.7 (the Python version in Buster):

Sep 17 2021, 12:40 PM · Product-Infrastructure-Team-Backlog (Kanban), Maps
MoritzMuehlenhoff added a comment to T291052: Deploy PHP patch for DOM replaceChild/removeChild performance.

Ack, I'll upload to apt.wikimedia.org on Monday.

Sep 17 2021, 12:06 PM · Patch-For-Review, SRE, serviceops

Sep 16 2021

MoritzMuehlenhoff added a comment to T290982: Support expired tile deduplication.

The approach of the CLI looks good to me. We should now see how to backport the script to debian buster to use on the maps clusters.

@MoritzMuehlenhoff do you have any thoughts regarding the debian packaging backport? How can we proceed with this?

Sep 16 2021, 3:37 PM · Product-Infrastructure-Team-Backlog (Kanban), Maps
MoritzMuehlenhoff added a project to T286911: Upgrade MXes to Bullseye: SRE.
Sep 16 2021, 3:17 PM · SRE, Patch-For-Review, Infrastructure-Foundations, Mail
MoritzMuehlenhoff added a comment to T286911: Upgrade MXes to Bullseye.

Status update: mx2001 is reimaged to Bullseye and working fine so far. The smart hosts config on our servers has been switched to prefer mx2001 over mx1001 and the MX records of a handful of lesser used domains now point to mx2001.
If there are no further issues, the remaining DNS records will be updated on Monday, and following that mx1001 will be reimaged some time mid next week.

Sep 16 2021, 3:17 PM · SRE, Patch-For-Review, Infrastructure-Foundations, Mail
MoritzMuehlenhoff added a comment to T291052: Deploy PHP patch for DOM replaceChild/removeChild performance.

scandium has been upgraded. If tests are fine, I'd upload to apt.wikimedia.org

Sep 16 2021, 8:05 AM · Patch-For-Review, SRE, serviceops
MoritzMuehlenhoff added a comment to T199911: Systemd session creation fails under I/O load.

Since the Thanos hosts run Buster and a more recent kernel/glibc/systemd, I disabled the cleanup cron job on these hosts, so that we can check whether this got fixed. If Buster is still affected we can add the cron job back.

Sep 16 2021, 7:55 AM · Infrastructure-Foundations, SRE, SRE-tools
MoritzMuehlenhoff added a comment to T290766: Requesting access to analytics-privatedata-users for Michael Raish (Design Strategy).

Hi @cmooney , actually I just checked again (80 minutes later) and I actually do have the access I need now. Maybe it took a while for everything to fall into place?

Sep 16 2021, 7:40 AM · SRE, SRE-Access-Requests

Sep 15 2021

MoritzMuehlenhoff added a comment to T290766: Requesting access to analytics-privatedata-users for Michael Raish (Design Strategy).

Hi @cmooney, thanks for noticing that. Yes, the 'mraish' account was set up when I was still contracting, and I set up the 'Mikeraish' account when I converted and linked it to my wmf email. It would be great to remove the original 'mraish' account and add access to 'Mikeraish' as you suggested. I just signed in to the old account looking for a way to delete it, but I wasn't able to find one. Should this deletion ideally come from your end or from mine?

Sep 15 2021, 3:20 PM · SRE, SRE-Access-Requests
MoritzMuehlenhoff added a project to T291052: Deploy PHP patch for DOM replaceChild/removeChild performance: SRE.
Sep 15 2021, 2:42 PM · Patch-For-Review, SRE, serviceops
MoritzMuehlenhoff added a comment to T291052: Deploy PHP patch for DOM replaceChild/removeChild performance.

Sure thing, I'll upgrade scandium tomorrow morning then.

Sep 15 2021, 2:41 PM · Patch-For-Review, SRE, serviceops
MoritzMuehlenhoff updated subscribers of T291052: Deploy PHP patch for DOM replaceChild/removeChild performance.

I've made an updated PHP 7.2 package with a 7.2 backport of https://github.com/php/php-src/commit/781e6b4d214012e9b9c0cf96a239cdf9f948da91

Sep 15 2021, 1:32 PM · Patch-For-Review, SRE, serviceops
MoritzMuehlenhoff added a comment to T290984: error while resolving custom fact "lldp_neighbors" on ms-be105[1-9], ms-be205[1-6] and relforge100[3-4].

That page mentions that at least firmware version NVM 6.01 (for the NIC) and a current driver version are required. According to ethtool, the X710 in ms-be1051 has firmware 6.8, which should be ok. But it doesn't show the LLDP disable option when I run the ethtool "--show-priv-flags" command:

Sep 15 2021, 10:15 AM · Infrastructure-Foundations, Puppet
MoritzMuehlenhoff added a comment to T290984: error while resolving custom fact "lldp_neighbors" on ms-be105[1-9], ms-be205[1-6] and relforge100[3-4].
  • Decide on a way to have this done at boot-time for affected hosts.
    • That also involves working out how to deal with this via automation, a difficulty is identifying hosts using the affected Intel NIC, and the PCI ID of the affected interface on each (which is part of the path the command gets echoed to).
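As a rough illustration of the identification step, the sketch below walks sysfs and maps each network interface bound to a given kernel driver to its PCI address. It assumes the standard Linux layout in which `/sys/class/net/<iface>/device` links to the PCI device and `device/driver` links to the driver directory; the function name and the choice of `i40e` as the driver to match are assumptions for the example, not the method settled on in the task.

```python
import os
from pathlib import Path
from typing import Dict


def find_nics_by_driver(driver: str, sysfs_net: str = "/sys/class/net") -> Dict[str, str]:
    """Map interface name -> PCI address for interfaces bound to `driver`.

    Hypothetical helper: `sysfs_net` is a parameter only so the lookup
    can be pointed at a test directory.
    """
    result: Dict[str, str] = {}
    root = Path(sysfs_net)
    if not root.is_dir():
        return result
    for iface in sorted(root.iterdir()):
        device = iface / "device"
        driver_link = device / "driver"
        if not driver_link.exists():
            # Virtual interfaces (lo, bridges, ...) have no backing device.
            continue
        if os.path.basename(os.path.realpath(driver_link)) != driver:
            continue
        # The resolved device directory is named after the PCI address.
        result[iface.name] = os.path.basename(os.path.realpath(device))
    return result


if __name__ == "__main__":
    # Assumption: the affected Intel NICs are bound to the i40e driver.
    for name, pci in find_nics_by_driver("i40e").items():
        print(name, pci)
```

The resulting mapping could feed a custom fact or a boot-time unit, so that the per-interface path does not have to be hard-coded per host.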
Sep 15 2021, 10:12 AM · Infrastructure-Foundations, Puppet
MoritzMuehlenhoff created T291060: Check home/HDFS leftovers of kaywong.
Sep 15 2021, 7:55 AM · Analytics-Kanban, Analytics

Sep 14 2021

MoritzMuehlenhoff updated the task description for T210704: Migrate node-based services in production to node10.
Sep 14 2021, 12:39 PM · Patch-For-Review, Platform Team Initiatives (Containerise Services), serviceops, SRE
MoritzMuehlenhoff updated the task description for T210704: Migrate node-based services in production to node10.
Sep 14 2021, 12:39 PM · Patch-For-Review, Platform Team Initiatives (Containerise Services), serviceops, SRE

Sep 13 2021

MoritzMuehlenhoff added a comment to T286911: Upgrade MXes to Bullseye.

mx2001 is now filtered on the routers. In case there are any issues, this can be reverted by merging https://gerrit.wikimedia.org/r/720783 and running 'homer "cr*" merge' on cumin2002.

Sep 13 2021, 4:09 PM · SRE, Patch-For-Review, Infrastructure-Foundations, Mail
MoritzMuehlenhoff created P17267 (An Untitled Masterwork).
Sep 13 2021, 3:27 PM
MoritzMuehlenhoff added a comment to T210704: Migrate node-based services in production to node10.

Not sure why restbase is ticked off, though? The restbase hosts in production currently run nodejs 6.11 still.

Sep 13 2021, 2:12 PM · Patch-For-Review, Platform Team Initiatives (Containerise Services), serviceops, SRE
MoritzMuehlenhoff added a comment to T289779: Create a new ldap group for sre users without root access.

@MoritzMuehlenhoff I created the new sre-admins LDAP group manually as I couldn't see a Puppet way. Pinging in case I missed something.

Sep 13 2021, 7:12 AM · User-jbond, LDAP, Infrastructure-Foundations, Security

Sep 10 2021

MoritzMuehlenhoff added a comment to T277739: rsyslog-kubernetes missing in buster-wikimedia.

Bullseye is out and there is no rsyslog-kubernetes in it. Maybe we could start working with upstream to have it in unstable first and possibly in backports?

Sep 10 2021, 1:26 PM · SRE, observability
MoritzMuehlenhoff added a comment to T283165: OpenSSL < 1.1.0 compatibility issues with new LE issuance chain.

As mentioned in the issue description, Debian backported the fix for OpenSSL, as can be seen in a current Debian jessie container:

root@69310d82543d:~# cat /etc/debian_version 
8.11
root@69310d82543d:~# openssl version
OpenSSL 1.0.1t  3 May 2016
root@69310d82543d:~# openssl verify -CAfile rsa-2048.chain.crt rsa-2048.crt 
rsa-2048.crt: OK
root@69310d82543d:~# openssl x509 -dates -noout -in rsa-2048.crt 
notBefore=May 10 13:15:07 2021 GMT
notAfter=Aug  8 13:15:07 2021 GMT
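For scripted checks, the `notBefore`/`notAfter` lines printed by `openssl x509 -dates -noout` can be parsed into timestamps. This is a minimal sketch using the two lines from the transcript above as sample input; the function name `parse_openssl_dates` is an assumption for illustration.

```python
from datetime import datetime, timezone

# Sample input copied from the transcript above.
SAMPLE = (
    "notBefore=May 10 13:15:07 2021 GMT\n"
    "notAfter=Aug  8 13:15:07 2021 GMT\n"
)


def parse_openssl_dates(output):
    """Parse notBefore/notAfter lines into timezone-aware datetimes (UTC)."""
    dates = {}
    for line in output.splitlines():
        key, sep, value = line.partition("=")
        if sep and key in ("notBefore", "notAfter"):
            # OpenSSL pads single-digit days with an extra space;
            # collapse runs of whitespace before parsing.
            normalised = " ".join(value.split())
            parsed = datetime.strptime(normalised, "%b %d %H:%M:%S %Y GMT")
            dates[key] = parsed.replace(tzinfo=timezone.utc)
    return dates


if __name__ == "__main__":
    dates = parse_openssl_dates(SAMPLE)
    lifetime = dates["notAfter"] - dates["notBefore"]
    print("validity period in days:", lifetime.days)
    print("expired now:", datetime.now(timezone.utc) > dates["notAfter"])
```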
Sep 10 2021, 11:36 AM · Patch-For-Review, Infrastructure-Foundations, SRE, Traffic
MoritzMuehlenhoff added projects to T283165: OpenSSL < 1.1.0 compatibility issues with new LE issuance chain: SRE, Infrastructure-Foundations.
Sep 10 2021, 11:33 AM · Patch-For-Review, Infrastructure-Foundations, SRE, Traffic
MoritzMuehlenhoff added a comment to T276589: migrate services from cumin2001 to cumin2002.

If it's not too much trouble, it would be nice if cumin2001 could have a MOTD pointing you to cumin2002. If you accidentally log into cumin2001 you'll end up trying to run cookbooks that haven't been updated since May :/

Sep 10 2021, 9:43 AM · Patch-For-Review, SRE
MoritzMuehlenhoff created T290715: Check home/HDFS leftovers of jmads.
Sep 10 2021, 7:45 AM · Analytics-Kanban, Analytics

Sep 9 2021

MoritzMuehlenhoff added a comment to T286905: Add logout.d script for Gerrit.

Adding this functionality goes a little beyond the scope of the logout.d scripts, I think. Right now running these scripts is fully idempotent and every logout action really only logs out, while this would actually modify account state.

Sep 9 2021, 2:55 PM · Release-Engineering-Team (Radar), Patch-For-Review, Gerrit, Infrastructure-Foundations, User-jbond, CAS-SSO, SRE