MoritzMuehlenhoff (Moritz Mühlenhoff)
User

User Details

User Since
Apr 1 2015, 4:33 PM (202 w, 5 d)
Availability
Available
LDAP User
Moritz Mühlenhoff
MediaWiki User
MMuhlenhoff (WMF) [ Global Accounts ]

Recent Activity

Today

MoritzMuehlenhoff updated the task description for T216384: Integrate Stretch 9.8 point update.
Tue, Feb 19, 12:16 PM · Operations
MoritzMuehlenhoff updated subscribers of T216493: Proton fails with Chromium 72.
Tue, Feb 19, 11:42 AM · Proton, Operations
MoritzMuehlenhoff added a comment to T213366: [2 hrs] Decide on handling system updates for Proton.

@MoritzMuehlenhoff thanks, that's good to know. How would the process look when the new Chromium release is not compatible with the Puppeteer version used in Proton? Would there be some time frame within which Proton is expected to be updated, or is having an outdated Chromium version for a longer time not really an issue?

What would be the process for upgrading when the new Chromium is not compatible with the old Puppeteer version and the new Puppeteer version is not compatible with the old Chromium (so the Proton code deployment and the Chromium package update would have to be more or less simultaneous)?

Tue, Feb 19, 11:41 AM · Reading-Infrastructure-Team-Backlog (Kanban), Security-Team, Operations, Proton
MoritzMuehlenhoff triaged T216493: Proton fails with Chromium 72 as High priority.
Tue, Feb 19, 11:39 AM · Proton, Operations
MoritzMuehlenhoff added a comment to T216493: Proton fails with Chromium 72.

To exclude Firejail as a source of error, I disabled puppet on deployment-chromium01, removed Firejail from the service unit and restarted proton.service. Same effect, Proton still fails:

Tue, Feb 19, 11:08 AM · Proton, Operations
MoritzMuehlenhoff created T216493: Proton fails with Chromium 72.
Tue, Feb 19, 10:48 AM · Proton, Operations
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

What do you think about opening a GH issue to ROCm first to (hopefully) get some feedback?

Tue, Feb 19, 9:06 AM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management
MoritzMuehlenhoff updated subscribers of T216425: Volunteer NDA for AWight.

@RStallman-legalteam from Legal handles those, I'm adding her to the task.

Tue, Feb 19, 8:27 AM · WMF-NDA-Requests
MoritzMuehlenhoff added a comment to T216273: New cronspam from db clusters.

Sounds good, on db2085 there's been no further occurrence after the reboot.

Tue, Feb 19, 8:03 AM · Operations

Yesterday

MoritzMuehlenhoff updated the task description for T216384: Integrate Stretch 9.8 point update.
Mon, Feb 18, 2:44 PM · Operations
MoritzMuehlenhoff added a comment to T209260: Integrate Stretch 9.6 point update.

These updates have been fully deployed:

Mon, Feb 18, 12:48 PM · Operations
MoritzMuehlenhoff closed T199670: Integrate Stretch 9.5 point release as Resolved.

These updates have been fully deployed:

Mon, Feb 18, 11:44 AM · Operations
MoritzMuehlenhoff updated the task description for T216384: Integrate Stretch 9.8 point update.
Mon, Feb 18, 11:40 AM · Operations
MoritzMuehlenhoff updated the task description for T216384: Integrate Stretch 9.8 point update.
Mon, Feb 18, 11:17 AM · Operations
MoritzMuehlenhoff updated the task description for T216384: Integrate Stretch 9.8 point update.
Mon, Feb 18, 11:11 AM · Operations
MoritzMuehlenhoff updated the task description for T216384: Integrate Stretch 9.8 point update.
Mon, Feb 18, 11:03 AM · Operations
MoritzMuehlenhoff added a comment to T216273: New cronspam from db clusters.

db2085 has been rebooted - let's see if that stops the amount of emails.

Mon, Feb 18, 10:08 AM · Operations
MoritzMuehlenhoff added a comment to T216384: Integrate Stretch 9.8 point update.

These packages are not used in our production infrastructure:

  • arc
  • astroml-addons
  • chkrootkit
  • compactheader
  • courier
  • debian-edu-config
  • debian-installer
  • debian-installer-netboot-images
  • debian-security-support
  • egg
  • espeakup
  • freerdp
  • ganeti-os-noop
  • gnulib
  • graphite-api
  • grokmirror
  • gvrng
  • ibus
  • icinga2
  • isort
  • jdupes
  • kmodpy
  • libb2
  • libgpod
  • linux-igd
  • lttng-modules
  • mistral
  • monkeysign
  • mpqc
  • nvidia-graphics-drivers
  • nvidia-modprobe
  • nvidia-persistenced
  • nvidia-settings
  • nvidia-xconfig
  • openni2
  • openvpn
  • parsedatetime
  • photocollage
  • postfix
  • postgrey
  • pylint-django
  • python-arpy
  • python-certbot
  • python-certbot-apache
  • python-certbot-nginx
  • python-hypothesis
  • pyzo
  • r-cran-readxl
  • rtkit
  • sl-modem
  • sogo-connector
  • sox
  • ssh-agent-filter
  • supercollider
  • sympa
  • uglifyjs
Mon, Feb 18, 9:21 AM · Operations
MoritzMuehlenhoff updated the task description for T216384: Integrate Stretch 9.8 point update.
Mon, Feb 18, 9:21 AM · Operations
MoritzMuehlenhoff created T216384: Integrate Stretch 9.8 point update.
Mon, Feb 18, 8:28 AM · Operations
MoritzMuehlenhoff created T216380: Remove trusty-specific hacks from logstash_checker.py.
Mon, Feb 18, 8:09 AM · Release-Engineering-Team (Kanban), Operations, Scap

Fri, Feb 15

MoritzMuehlenhoff closed T216076: PHP does not find mysqli on tools-sgebastion-07 as Resolved.

I'm puzzled as to why reprepro doesn't pick that up; passing --noskipold didn't help either. For some reason the source package got imported, but none of the binary packages.

Fri, Feb 15, 9:33 AM · Patch-For-Review, Toolforge

Thu, Feb 14

MoritzMuehlenhoff closed T216076: PHP does not find mysqli on tools-sgebastion-07 as Resolved.

Should be fixed now? JFTR, the command to run afterwards is

Thu, Feb 14, 5:15 PM · Patch-For-Review, Toolforge
MoritzMuehlenhoff added a comment to T214840: db2085/db1106 don't boot with 4.9.0-8-amd64.

JFTR, the next Stretch update (this weekend) will update the kernel to 4.9.144-2, so that can be piggybacked.

Thu, Feb 14, 4:25 PM · ops-codfw, Patch-For-Review, Operations, DBA
MoritzMuehlenhoff added a comment to T214840: db2085/db1106 don't boot with 4.9.0-8-amd64.

Are there other servers of that batch besides db1106 and db2085?

Thu, Feb 14, 3:46 PM · ops-codfw, Patch-For-Review, Operations, DBA
MoritzMuehlenhoff added a comment to T216076: PHP does not find mysqli on tools-sgebastion-07.

I've imported pcre and libzip and that fixes the php72 component. There's still a Toolforge-specific issue, though: xdebug now depends on php-common, so it also needs to be imported to the thirdparty/php72 repo. Can someone from WMCS take this? Fix is similar to my patch from https://gerrit.wikimedia.org/r/c/operations/puppet/+/490579/, needs to also cover php-defaults.

Thu, Feb 14, 2:32 PM · Patch-For-Review, Toolforge
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

https://rocm.github.io/ROCmInstall.html#supported-gpus should serve as a useful enough base to select a new GPU I guess (we'll need to figure out what stat1005 supports, though)

Thu, Feb 14, 11:43 AM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.
  1. the null pointer is due to some code for the Hawaii GPU cards (like ours), so not supported by upstream. In this case, I'd propose to buy another supported AMD GPU card and restart from there (we need to verify some hw things, like the CPU supporting PCIe atomics and a PCIe x16 bus being available on the target host; hopefully stat1005 will be ok).
Thu, Feb 14, 11:36 AM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management

Wed, Feb 13

MoritzMuehlenhoff added a comment to T216043: Sort out which RAID packages are still needed.

Yeah, I know, that one correctly pulls in a number of debs which are actually in puppet, but there's a number of additional ones which need a closer look (e.g. hpacucli or hpssa, which aren't used anywhere).

Wed, Feb 13, 2:45 PM · Operations
MoritzMuehlenhoff created T216043: Sort out which RAID packages are still needed.
Wed, Feb 13, 2:41 PM · Operations
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

Installed 4.20 from experimental but it seems that the kfd driver is not shipped:

elukey@stat1005:~$ find /lib/modules/ -type f -name '*.ko' | grep kfd
/lib/modules/4.19.0-2-amd64/kernel/drivers/gpu/drm/amd/amdkfd/amdkfd.ko
elukey@stat1005:~$ uname -a
Linux stat1005 4.20.0-trunk-amd64 #1 SMP Debian 4.20-1~exp1 (2018-12-24) x86_64 GNU/Linux

So testing is currently on hold until we find a way to circumvent this.
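
The mismatch in the output above (amdkfd.ko is present only under 4.19.0-2-amd64 while 4.20.0-trunk-amd64 is running) could also be checked with a small script. This is a minimal Python sketch, assuming the usual /lib/modules/<release>/ layout; the helper name find_module is hypothetical, and drivers that are built into the kernel or shipped compressed are not detected:

```python
import os
import platform


def find_module(name, release=None, modules_root="/lib/modules"):
    """Return paths of <name>.ko below modules_root/<release>.

    `release` defaults to the running kernel (what `uname -r` prints).
    Hypothetical helper; built-in or compressed modules are not detected.
    """
    release = release or platform.release()
    base = os.path.join(modules_root, release)
    matches = []
    # os.walk yields nothing if `base` does not exist, so the result is [].
    for dirpath, _dirnames, filenames in os.walk(base):
        if name + ".ko" in filenames:
            matches.append(os.path.join(dirpath, name + ".ko"))
    return sorted(matches)


if __name__ == "__main__":
    hits = find_module("amdkfd")
    print(hits if hits else "amdkfd.ko not found for " + platform.release())
```

An empty result for the running release, combined with a hit for an older release, matches the situation described in the comment.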

Wed, Feb 13, 11:40 AM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

Maybe try 4.20-1 from experimental to narrow the kernel oops down?

Wed, Feb 13, 10:41 AM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

Tensorflow is also finding its way into Debian, BTW (currently only in experimental): https://packages.qa.debian.org/t/tensorflow.html

Wed, Feb 13, 7:55 AM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management

Tue, Feb 12

MoritzMuehlenhoff added a comment to T188122: Specific PDF file only displays completely white page previews.

JFTR, the recent Ghostscript update to 9.26 also switched the JPEG2000 library from Jasper to OpenJPEG (current Ghostscript releases no longer support Jasper), so that might have also had an effect.

Tue, Feb 12, 2:15 PM · Multimedia, MediaWiki-extensions-PdfHandler, MediaWiki-File-management, Commons
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

we should ask them to also publish them.

Tue, Feb 12, 12:49 PM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

There are no source packages for the debs. Given that they otherwise seem pretty focused on FLOSS (e.g. https://rocm.github.io/ROCmInstall.html#closed-source-components), that's probably just an oversight and we should ask them to also publish them.

Tue, Feb 12, 12:39 PM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management
MoritzMuehlenhoff added a comment to T148843: GPU upgrade for stats machine.

TODOS:

  • stat1005 will be reimaged to Debian Stretch when the SRE team is ready (work is currently in progress to import Buster in production).
  • Luca will ask the SRE team to create a special POSIX group to allow Erik to be root on stat1005 and experiment with the host when he has time/patience.
Tue, Feb 12, 11:47 AM · Patch-For-Review, User-Elukey, Operations, Analytics, Research-management
MoritzMuehlenhoff closed T215384: Allow Erik Bernhardson to have root access on stat1005 for GPU testing as Resolved.

stat1005 is now running Debian buster and I've enabled Erik's access.

Tue, Feb 12, 11:46 AM · Patch-For-Review, Analytics, Operations, SRE-Access-Requests
MoritzMuehlenhoff added a comment to T186070: Thumbnails for PDF getting disfigured.

FYI, Ghostscript on the Thumbor servers got upgraded to 9.26, worth retesting.

Tue, Feb 12, 11:43 AM · media-storage, Thumbor, MediaWiki-extensions-PdfHandler
MoritzMuehlenhoff added a comment to T188122: Specific PDF file only displays completely white page previews.

FYI, the Ghostscript version on our Thumbor servers got upgraded to 9.26, this might be worth re-testing.

Tue, Feb 12, 11:39 AM · Multimedia, MediaWiki-extensions-PdfHandler, MediaWiki-File-management, Commons
MoritzMuehlenhoff closed T110849: Upgrade Ghostscript to 9.15 or later as Resolved.

Closing this old bug, we're now using ghostscript 9.26 everywhere. If there's any specific other Ghostscript-related issue, please open a new task.

Tue, Feb 12, 11:34 AM · Operations, Wikisource, Wikimedia-General-or-Unknown
MoritzMuehlenhoff closed T110849: Upgrade Ghostscript to 9.15 or later, a subtask of T110821: PDF file entirely rendered as a set of blank pages, as Resolved.
Tue, Feb 12, 11:34 AM · Wikisource, MediaWiki-extensions-PdfHandler
MoritzMuehlenhoff closed T110849: Upgrade Ghostscript to 9.15 or later, a subtask of T50178: Error creating thumbnail: "Warning: File has insufficient data for an image.", as Resolved.
Tue, Feb 12, 11:34 AM · MediaWiki-extensions-PdfHandler

Mon, Feb 11

MoritzMuehlenhoff added a comment to T213708: Upgrade production prometheus-node-exporter to >= 0.16.

We currently pin prometheus-node-exporter to 0.17.0+ds-2 on the selected hosts and for buster, but yesterday 0.17.0+ds-3 migrated to testing/buster. I could change the puppet code to pick -3 on buster, but I'd say we also upgrade the components for jessie and stretch to -3 and bump it in general? https://packages.qa.debian.org/p/prometheus-node-exporter/news/20190131T180815Z.html lists a number of fixes, and at least the TMPDIR change seems relevant for those as well.

Mon, Feb 11, 2:21 PM · Patch-For-Review, Goal, monitoring, Operations
MoritzMuehlenhoff created T215775: Check home leftovers of ISI researchers.
Mon, Feb 11, 10:28 AM · Research, Analytics
MoritzMuehlenhoff created T215758: Free up disk space on labmon1001.
Mon, Feb 11, 8:20 AM · Cloud-Services
MoritzMuehlenhoff reassigned T215569: mw1299 is down (jobrunner-canary, now up but depooled) from RobH to Cmjohnson.
Mon, Feb 11, 8:14 AM · ops-eqiad, Operations

Fri, Feb 8

MoritzMuehlenhoff added a comment to T213527: Prepare our base system layer for Debian buster.

Still some rough edges to sort out, but bare metal installations are working now:

Fri, Feb 8, 3:50 PM · Patch-For-Review, Operations
MoritzMuehlenhoff added a comment to T209029: cloudelastic1004: SMART/disk error.

The debian installer completes, but I can't log in because apparently the first puppet run isn't completed and I can't use any login methods (ssh or direct console access).

Fri, Feb 8, 2:34 PM · Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
MoritzMuehlenhoff added a comment to T214720: db1114 crashed.

The server went down at 12:16, with a number of memory errors logged in SEL:

Fri, Feb 8, 12:34 PM · Patch-For-Review, DBA, Operations, ops-eqiad
MoritzMuehlenhoff added a comment to T187987: Serve >= 50% of production Prometheus systems with Prometheus v2.

As discussed on IRC: Let's upgrade to 2.7.1 next week as that fixes a security issue (CVE-2019-3826) in the internal UI (not exposed in production, but in https://beta-prometheus.wmflabs.org/). Change is already pending in Salsa: https://salsa.debian.org/go-team/packages/prometheus/commit/1cd743bc0012935842adb5941258c9ed8bff85fe

Fri, Feb 8, 11:18 AM · Patch-For-Review, monitoring, Operations
MoritzMuehlenhoff created T215593: confd: Superfluous golang dependency.
Fri, Feb 8, 9:31 AM · Operations

Wed, Feb 6

MoritzMuehlenhoff added a comment to T212418: Memory error on restbase1016.

All of the instances have joined the ring (thnx @fgiunchedi!) and the latest version of RESTBase is in place, so we are good. There is one problem, now, though: I can't seem to be able to pool the node back. Let's try and see what this is about before resolving the ticket.

Indeed neither can I:

restbase1016:~$ pool-restbase 
restbase1016:~$ echo $?
2

Though I'm not going to have time to investigate further this week, any help is welcome

@Joe could you take a look ^ and enlighten us?

Wed, Feb 6, 11:24 AM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations

Tue, Feb 5

MoritzMuehlenhoff updated subscribers of T116011: ferm: Log dropped packets.

I have created a simple module for configuring ulogd2, available in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/486513/. The question is what do we want to log and how do we want to log it. With the default configuration the module will produce the following:

Tue, Feb 5, 2:33 PM · Patch-For-Review, Operations
MoritzMuehlenhoff added a comment to T215171: Archival of home directories on servers with very large homes.

The archival mechanism doesn't seem very robust either; e.g. for user "banyek" the home is still around on e.g. cumin2001 or puppetmaster1001.

Tue, Feb 5, 1:52 PM · Operations
MoritzMuehlenhoff added a comment to T200210: Decom graphite2002.

Please don't proceed with decom for now; I'm using graphite2002 for some buster tests.

Tue, Feb 5, 12:05 PM · decommission, monitoring, Operations, ops-codfw
MoritzMuehlenhoff added a comment to T199321: Return graphite200[12] to spares pool.

Please don't proceed with decom for now; Filippo uses graphite2001 for prometheus 2 tests and I'm using graphite2002 for some buster tests.

Tue, Feb 5, 12:05 PM · decommission, User-fgiunchedi, Operations
MoritzMuehlenhoff updated the task description for T213546: Prepare puppet for Debian buster.
Tue, Feb 5, 10:48 AM · Patch-For-Review, Packaging, Puppet, Operations

Mon, Feb 4

MoritzMuehlenhoff created T215171: Archival of home directories on servers with very large homes.
Mon, Feb 4, 3:27 PM · Operations
MoritzMuehlenhoff added a project to T215012: cloudvirt1015: apparent hardware errors in CPU/Memory: ops-eqiad.
Mon, Feb 4, 3:10 PM · Operations, ops-eqiad, DC-Ops, cloud-services-team (Kanban)
MoritzMuehlenhoff reopened T204567: ms-be2030 spontaneous reboot as "Open".

I'm reopening the task, the server went down again today:

Mon, Feb 4, 2:22 PM · ops-codfw, Operations
MoritzMuehlenhoff added a comment to T214501: Clean up home dirs for user mkroetzsch.

The user has now been removed.

Mon, Feb 4, 11:15 AM · Analytics
MoritzMuehlenhoff closed T214498: remove shell access for mkroetzsch on 2019-01-26 as Resolved.
Mon, Feb 4, 11:15 AM · Patch-For-Review, SRE-Access-Requests, Operations

Wed, Jan 30

MoritzMuehlenhoff added a comment to T214840: db2085/db1106 don't boot with 4.9.0-8-amd64.

We could narrow this down further by enabling debug flags for the initrd. I don't remember the specific options off the top of my head, but we can look into this next week. As Manuel mentioned, my hunch is that this is a hw issue which manifests during the reboots, but which is not caused by the kernel change between -7 and -8 itself.

Wed, Jan 30, 2:43 PM · ops-codfw, Patch-For-Review, Operations, DBA

Fri, Jan 25

MoritzMuehlenhoff added a comment to T177196: Port non-deprecated Diamond collectors to Prometheus.

I think this task is mostly superseded by https://phabricator.wikimedia.org/T212231, https://phabricator.wikimedia.org/T210993 and https://phabricator.wikimedia.org/T210991, shall we close this one?

Fri, Jan 25, 3:32 PM · monitoring, cloud-services-team (Kanban), User-fgiunchedi, Goal, Operations
MoritzMuehlenhoff reassigned T116011: ferm: Log dropped packets from MoritzMuehlenhoff to jbond.
Fri, Jan 25, 2:00 PM · Patch-For-Review, Operations
MoritzMuehlenhoff added a comment to T198939: Decommission servermon.

Is anyone still using Servermon at this point?

Fri, Jan 25, 1:29 PM · Patch-For-Review, Operations

Thu, Jan 24

MoritzMuehlenhoff added a comment to T213366: [2 hrs] Decide on handling system updates for Proton.

AIUI with the current setup Puppet will upgrade Chromium whenever Debian updates their package, so the service could break without any action from our side.

Thu, Jan 24, 7:03 PM · Reading-Infrastructure-Team-Backlog (Kanban), Security-Team, Operations, Proton
MoritzMuehlenhoff edited P7680 Masterwork From Distant Lands.
Thu, Jan 24, 12:29 PM

Wed, Jan 23

MoritzMuehlenhoff created T214501: Clean up home dirs for user mkroetzsch.
Wed, Jan 23, 6:08 PM · Analytics
MoritzMuehlenhoff added a comment to T203069: Release and deploy wikidiff2 v1.8.0 with changed signature.

@MoritzMuehlenhoff We're just working on a bugfix in the 1.8.0 version. So please wait with the deployment for now. I'll ping you on this ticket again when we released the fixed version.

Wed, Jan 23, 4:27 PM · WMDE-QWERTY-Sprint-2019-01-23, WMDE-QWERTY-Sprint-2019-01-10, WMDE-QWERTY-Sprint-2018-08-29, wikidiff2, WMDE-QWERTY-Team, MediaWiki-History-and-Diffs, TCB-Team

Tue, Jan 22

MoritzMuehlenhoff closed T214368: Rebuild installer images for CVE-2019-3462 as Resolved.

We've looked into this; our netboot images don't need an update: In the initrd anna is used instead of apt and it's not affected by CVE-2019-3462.

Tue, Jan 22, 4:29 PM · Operations
MoritzMuehlenhoff created T214369: Update OpenStack images for jessie/stretch for CVE-2019-3462.
Tue, Jan 22, 12:20 PM · cloud-services-team (Kanban), Operations, Cloud-Services
MoritzMuehlenhoff updated subscribers of T214368: Rebuild installer images for CVE-2019-3462.
Tue, Jan 22, 12:19 PM · Operations
MoritzMuehlenhoff created T214368: Rebuild installer images for CVE-2019-3462.
Tue, Jan 22, 12:19 PM · Operations

Mon, Jan 21

MoritzMuehlenhoff added a comment to T214314: wmf-auto-reimage-host: icinga downtime error.

The FQDN to which that server is being renamed doesn't exist here yet, so it should simply be skipped when setting downtime?
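
The idea of skipping hosts whose FQDN does not exist yet can be illustrated as follows. This is a minimal Python sketch, not the actual wmf-auto-reimage-host code; the function names resolvable and hosts_to_downtime are hypothetical, and "exists" is approximated here by a successful DNS lookup:

```python
import socket


def resolvable(fqdn, resolver=socket.getaddrinfo):
    """Return True if fqdn resolves; the resolver is injectable for testing."""
    try:
        resolver(fqdn, None)
        return True
    except socket.gaierror:
        return False


def hosts_to_downtime(fqdns, resolver=socket.getaddrinfo):
    """Split hosts into (to_downtime, skipped) based on name resolution.

    Hypothetical helper: hosts that do not resolve are skipped instead of
    aborting the whole downtime step.
    """
    to_downtime, skipped = [], []
    for fqdn in fqdns:
        if resolvable(fqdn, resolver):
            to_downtime.append(fqdn)
        else:
            skipped.append(fqdn)
    return to_downtime, skipped
```

A caller would set downtime only for the first list and log the second one, so that a rename to a not-yet-existing name produces a warning rather than an error.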

Mon, Jan 21, 4:42 PM · Operations

Sun, Jan 20

Eevans awarded T212418: Memory error on restbase1016 a Cookie token.
Sun, Jan 20, 7:06 PM · Patch-For-Review, Core Platform Team Backlog (Watching / External), Services (watching), RESTBase-Cassandra, RESTBase, Operations

Jan 19 2019

Legoktm awarded T213527: Prepare our base system layer for Debian buster a Party Time token.
Jan 19 2019, 4:35 AM · Patch-For-Review, Operations

Jan 18 2019

MoritzMuehlenhoff added a comment to T214153: Fix node vs nodejs dependency issue.

True that, also note that in the nodejs 10 packages (from component/node10), the nodejs-legacy package is gone. Debian dropped it, we could patch it back in, but it probably makes sense to fix this mid-term.

Jan 18 2019, 10:35 AM · Reading-Infrastructure-Team-Backlog, Operations, Maps
MoritzMuehlenhoff added a comment to T214153: Fix node vs nodejs dependency issue.

if you install the nodejs-legacy package, it will provide a symlink from node to nodejs.
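
What such a compatibility symlink amounts to can be shown in a scratch directory. This is a minimal Python sketch under the assumption that the link sits next to the nodejs binary; the exact paths shipped by the package are not taken from the task, and provide_node_alias is a hypothetical name:

```python
import os
import tempfile


def provide_node_alias(bin_dir):
    """Create bin_dir/node as a symlink to nodejs, unless it already exists."""
    link = os.path.join(bin_dir, "node")
    if not os.path.lexists(link):
        # Relative target, so the link resolves to bin_dir/nodejs.
        os.symlink("nodejs", link)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        # Empty stand-in file for the real nodejs binary.
        open(os.path.join(scratch, "nodejs"), "w").close()
        link = provide_node_alias(scratch)
        print(link, "->", os.readlink(link))
```

Scripts that call node keep working through the alias, which is why dropping the package surfaces the node vs nodejs naming issue.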

Jan 18 2019, 10:18 AM · Reading-Infrastructure-Team-Backlog, Operations, Maps

Jan 17 2019

MoritzMuehlenhoff created T214024: Two test hosts for SREs.
Jan 17 2019, 11:44 AM · Operations, hardware-requests
MoritzMuehlenhoff updated the task description for T212231: Remove Diamond from production.
Jan 17 2019, 10:14 AM · Patch-For-Review, monitoring, Operations
MoritzMuehlenhoff updated the task description for T213859: eqiad: rack a3 pdu swap / failure / replacement.
Jan 17 2019, 9:29 AM · Patch-For-Review, ops-eqiad, Operations

Jan 16 2019

MoritzMuehlenhoff updated the task description for T213703: Offboard Balazs.
Jan 16 2019, 8:44 AM · Operations
MoritzMuehlenhoff added a comment to T213703: Offboard Balazs.

I've removed Balazs from pwstore.

Jan 16 2019, 8:44 AM · Operations
MoritzMuehlenhoff updated the task description for T213859: eqiad: rack a3 pdu swap / failure / replacement.
Jan 16 2019, 8:21 AM · Patch-For-Review, ops-eqiad, Operations
MoritzMuehlenhoff updated the task description for T203861: decom radium.
Jan 16 2019, 8:20 AM · Patch-For-Review, ops-eqiad, decommission, Operations
MoritzMuehlenhoff updated the task description for T213859: eqiad: rack a3 pdu swap / failure / replacement.
Jan 16 2019, 8:07 AM · Patch-For-Review, ops-eqiad, Operations

Jan 15 2019

MoritzMuehlenhoff updated the task description for T213703: Offboard Balazs.
Jan 15 2019, 2:50 PM · Operations
MoritzMuehlenhoff updated the task description for T213703: Offboard Balazs.
Jan 15 2019, 1:48 PM · Operations
MoritzMuehlenhoff reassigned T207845: debdeploy: show help message if invoked with no arguments from MoritzMuehlenhoff to jbond.
Jan 15 2019, 12:40 PM · Patch-For-Review, Operations, Operations-Software-Development
MoritzMuehlenhoff added a comment to T203194: cp1075-90 - bnxt_en transmit hangs.

The reports in that thread are for RHEL 7, which uses 3.10 as the base layer kernel (but with backports for all kinds of drivers), so it's hard to tell how that maps to our 4.9 kernel. One thing we could try is to test the 4.19.12-1~bpo9+1 kernel from stretch-backports. If it still fails in that version, we can easily report it to the upstream maintainers given that 4.19 is the latest LTS branch. Or we point Dell to the thread and ask them to swap the NICs for a known working 10G card.

Jan 15 2019, 8:01 AM · Patch-For-Review, Operations, Traffic

Jan 14 2019

MoritzMuehlenhoff assigned T213703: Offboard Balazs to jbond.
Jan 14 2019, 2:12 PM · Operations
MoritzMuehlenhoff created T213703: Offboard Balazs.
Jan 14 2019, 2:12 PM · Operations

Jan 11 2019

MoritzMuehlenhoff created T213546: Prepare puppet for Debian buster.
Jan 11 2019, 3:07 PM · Patch-For-Review, Packaging, Puppet, Operations
MoritzMuehlenhoff closed T213079: Onboarding John Bond as Resolved.
Jan 11 2019, 1:26 PM · Patch-For-Review, Operations
MoritzMuehlenhoff updated the task description for T213079: Onboarding John Bond.
Jan 11 2019, 12:30 PM · Patch-For-Review, Operations
MoritzMuehlenhoff updated the task description for T213079: Onboarding John Bond.
Jan 11 2019, 11:18 AM · Patch-For-Review, Operations
MoritzMuehlenhoff added a comment to T203194: cp1075-90 - bnxt_en transmit hangs.

The 4.9.144-1 kernel is fully production-ready, the point releases for Debian are used to rebase the Stretch kernel to the latest set of 4.9.x bug fixes (although depending on the final date for Stretch 9.7 there might be one further update still).

Jan 11 2019, 10:35 AM · Patch-For-Review, Operations, Traffic
MoritzMuehlenhoff created T213527: Prepare our base system layer for Debian buster.
Jan 11 2019, 9:26 AM · Patch-For-Review, Operations