
Marostegui (Manuel Aróstegui)
User

User Details

User Since
Sep 1 2016, 6:48 AM (180 w, 3 d)
Availability
Available
IRC Nick
marostegui
LDAP User
Marostegui
MediaWiki User
MArostegui (WMF) [ Global Accounts ]

TZ: UTC +1/+2

Recent Activity

Today

Marostegui added a comment to T245361: prometheus1003/prometheus1004 /srv/prometheus/ops disk space warning.

For the record:

root@prometheus1003:~# pvs
  PV         VG     Fmt  Attr PSize PFree
  /dev/sda3  vg-ssd lvm2 a--  1.42t 514.16g
  /dev/sdb1  vg-hdd lvm2 a--  3.64t   2.88t
Sun, Feb 16, 11:29 AM · observability, Operations
Marostegui created T245361: prometheus1003/prometheus1004 /srv/prometheus/ops disk space warning.
Sun, Feb 16, 10:46 AM · observability, Operations
Marostegui triaged T245358: Compress table watchlist_expiry as Medium priority.
Sun, Feb 16, 10:16 AM · DBA
Marostegui created T245358: Compress table watchlist_expiry.
Sun, Feb 16, 10:16 AM · DBA

Fri, Feb 14

Marostegui triaged T243512: Clean up wikiadmin2 user from core hosts as Medium priority.
Fri, Feb 14, 2:40 PM · DBA
Marostegui added a comment to T242702: Test MariaDB 10.4 in production.

Next week I am going to start combining main traffic + API traffic, to capture some live API queries (even though they've been replayed already off-band with no problem)

Fri, Feb 14, 9:04 AM · DBA
Marostegui closed T239900: Sync understanding of MediaWiki rdbms 'weight' behaviour with DBAs as Resolved.

Ideally I would love to see old and read-only hosts (es1, es2) not being treated as a replication topology by dbctl but rather as independent hosts (as they sort of are), so we can operate on the master without having to do a "master switchover" in dbctl (@CDanis I don't know how feasible it is to have that; it is certainly not urgent. Do you want me to create a task to track that?).
Also, part of the discussion has covered the fact that we can, indeed, set weight on the master if needed.
As for the pooled replicas with zero weight, from the comments here it looks like this is allowed and we have some reason to keep that behaviour.
Anything left on this task?
Thanks everyone!

Fri, Feb 14, 6:32 AM · Core Platform Team Workboards (Clinic Duty Team), DBA, Wikimedia-Rdbms
Marostegui created T245239: dbctl: treat read only ES hosts as standalone hosts.
Fri, Feb 14, 6:31 AM · conftool
Marostegui created T245238: Remove references to m4-master.
Fri, Feb 14, 6:12 AM · Operations, Analytics
Marostegui updated the task description for T239453: Remove partitions from revision table.
Fri, Feb 14, 5:37 AM · DBA

Thu, Feb 13

Marostegui added a comment to T243808: gerrit1002 running out of space.

Per the duplicate task I merged here, filed by @MoritzMuehlenhoff:

root@gerrit1002:~# df -hT /
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/vda1      ext4   63G   60G     0 100% /
Thu, Feb 13, 9:53 AM · Operations, Gerrit
Marostegui merged task T245127: Full root partition/disk on gerrit1002 into T243808: gerrit1002 running out of space.
Thu, Feb 13, 9:52 AM · Operations
Marostegui merged T245127: Full root partition/disk on gerrit1002 into T243808: gerrit1002 running out of space.
Thu, Feb 13, 9:52 AM · Operations, Gerrit
Marostegui moved T240094: Create required table for new Watchlist Expiry feature from Blocked external/Not db team to Done on the DBA board.
Thu, Feb 13, 9:28 AM · MW-1.35-notes (1.35.0-wmf.19; 2020-02-11), DBA, Community-Tech (Kanban-Q3-2019-20), Core Platform Team Workboards (Clinic Duty Team), Expiring-Watchlist-Items
Marostegui moved T202367: Productionize dbproxy101[2-7].eqiad.wmnet and dbproxy200[1-4] from Next to In progress on the DBA board.
Thu, Feb 13, 9:27 AM · DBA
Marostegui added a comment to T242702: Test MariaDB 10.4 in production.

I have given this host more weight; it is now serving with weight 100, which is 6% of enwiki main traffic.

Thu, Feb 13, 9:03 AM · DBA
Marostegui added a comment to T244884: Implement logic to be able to perform full and incremental backups of ES hosts.

We can also take into consideration reading the binlogs from a given file/position using the coordinates we store on the logical dump, which might be faster.

Thu, Feb 13, 8:48 AM · Patch-For-Review, Goal, Operations, DBA
Marostegui added a project to T226704: Setup es4 and es5 replica sets for new read-write external store service: Goal.
Thu, Feb 13, 8:40 AM · Goal, Epic, DBA
Marostegui updated the task description for T241359: (Needed by 31st January) eqiad: rack/setup/install es102[0-5].eqiad.wmnet.
Thu, Feb 13, 7:56 AM · Patch-For-Review, DBA, ops-eqiad, Operations
Marostegui updated the task description for T243052: Productionize es1020-es1025, es2020-es2025.
Thu, Feb 13, 7:27 AM · DBA
Marostegui moved T240772: Prepare and check storage layer for ngwikimedia from Blocked external/Not db team to Done on the DBA board.

This is ready for WMCS to create the views.
I have created ngwikimedia_p and granted labsdbuser grants for it.
cloud-services-team please go ahead and create the views on labsdb1009, labsdb1010, labsdb1011 and labsdb1012

Thu, Feb 13, 7:22 AM · Data-Services, cloud-services-team (Kanban), DBA
Marostegui renamed T245107: Possibly replace db1087 (s8) with db1127 (x1) due to disk space constrains from Possibly replace db1087 (s8) with db1127 (x1) to Possibly replace db1087 (s8) with db1127 (x1) due to disk space constrains.
Thu, Feb 13, 7:08 AM · DBA
Marostegui triaged T245107: Possibly replace db1087 (s8) with db1127 (x1) due to disk space constrains as Medium priority.
Thu, Feb 13, 7:08 AM · DBA
Marostegui created T245107: Possibly replace db1087 (s8) with db1127 (x1) due to disk space constrains.
Thu, Feb 13, 7:08 AM · DBA
Marostegui updated the task description for T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes.
Thu, Feb 13, 6:43 AM · DBA, User-jbond, Puppet, Operations
Aklapper awarded T244566: Upgrade and restart m3 (phabricator) master (db1128) a Like token.
Thu, Feb 13, 6:27 AM · Operations, DBA, Phabricator, Release-Engineering-Team (Development services)
Marostegui updated the task description for T239453: Remove partitions from revision table.
Thu, Feb 13, 6:11 AM · DBA
Marostegui updated the task description for T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes.
Thu, Feb 13, 6:10 AM · DBA, User-jbond, Puppet, Operations
Marostegui closed T244566: Upgrade and restart m3 (phabricator) master (db1128), a subtask of T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes, as Resolved.
Thu, Feb 13, 6:10 AM · DBA, User-jbond, Puppet, Operations
Marostegui closed T244566: Upgrade and restart m3 (phabricator) master (db1128) as Resolved.

This was done successfully.
Downtime was 56 seconds:
06:01:19 - 06:02:15

Thu, Feb 13, 6:10 AM · Operations, DBA, Phabricator, Release-Engineering-Team (Development services)
Marostegui closed T245102: Test ticket after maintenance as Resolved.
Thu, Feb 13, 6:03 AM
Marostegui created T245102: Test ticket after maintenance.
Thu, Feb 13, 6:03 AM
Marostegui moved T244566: Upgrade and restart m3 (phabricator) master (db1128) from Next to In progress on the DBA board.
Thu, Feb 13, 5:46 AM · Operations, DBA, Phabricator, Release-Engineering-Team (Development services)

Wed, Feb 12

Marostegui added a comment to T107610: Setup separate logical External Store for Flow in production.

@jcrespo do you still want us to do the compression in T106386? Are the storage constraints still relevant?

Wed, Feb 12, 6:47 PM · Growth-Team, DBA, Operations, WorkType-Maintenance, StructuredDiscussions
Marostegui added a comment to T244566: Upgrade and restart m3 (phabricator) master (db1128).

@Marostegui Thursday 6:00 AM works for me.

Wed, Feb 12, 6:32 PM · Operations, DBA, Phabricator, Release-Engineering-Team (Development services)
Dzahn awarded T241109: wikibugs needs restart almost everyday a Barnstar token.
Wed, Feb 12, 6:10 PM · Operations, Wikibugs
Marostegui added a comment to T241109: wikibugs needs restart almost everyday.

Thank you!

Wed, Feb 12, 5:39 PM · Operations, Wikibugs
Marostegui added a comment to T244566: Upgrade and restart m3 (phabricator) master (db1128).

Hey @Marostegui, How about tomorrow? I can be around tomorrow, if you'd like. If you'd like to do it at your leisure, and for future reference, the command for read-only is as follows:
/srv/phab/phabricator/bin/config set cluster.read-only true
The command needs to run as root on phab1001.eqiad.wmnet
To disable read-only, obviously, just set the config to false:
/srv/phab/phabricator/bin/config set cluster.read-only false
I've also documented the above commands on wikitech, for posterity, at https://wikitech.wikimedia.org/wiki/Phabricator#read-only_mode_/_restarting_mariadb

Wed, Feb 12, 4:45 PM · Operations, DBA, Phabricator, Release-Engineering-Team (Development services)
Marostegui moved T244238: Upgrade and restart m1 master (db1135) from Next to In progress on the DBA board.
Wed, Feb 12, 4:40 PM · Wikimedia-Etherpad, DBA, Operations
Marostegui added a comment to T243963: es1019: reseat IPMI.

@Cmjohnson I can have the host depooled and off tomorrow in the UTC morning so you can do it whenever you can tomorrow, and once done, just power it back on. Would that work?

Wed, Feb 12, 4:08 PM · DC-Ops, Operations, ops-eqiad, DBA
Marostegui added a comment to T243963: es1019: reseat IPMI.

@Marostegui We can upgrade the f/w. That can be anytime, please pick a convenient date for you.

Wed, Feb 12, 4:06 PM · DC-Ops, Operations, ops-eqiad, DBA
Marostegui updated subscribers of T245036: Enforce in dbctl that core sections and es clusters always have at least two replicas.
Wed, Feb 12, 4:03 PM · Operations, conftool
Marostegui edited projects for T245036: Enforce in dbctl that core sections and es clusters always have at least two replicas, added: Operations; removed DBA.
Wed, Feb 12, 4:03 PM · Operations, conftool
Marostegui added a comment to T245036: Enforce in dbctl that core sections and es clusters always have at least two replicas.

right now this is what we have on es config:

root@cumin1001:/home/marostegui# dbctl -s eqiad section es3 get
{
    "es3": {
        "flavor": "external",
        "master": "es1017",
        "min_replicas": 1,
        "readonly": false,
        "ro_reason": "PLACEHOLDER"
    },
    "tags": "datacenter=eqiad"
}
Wed, Feb 12, 4:02 PM · Operations, conftool
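
The es3 output above reports "min_replicas": 1, while T245036 asks for at least two replicas. As a minimal sketch only (not dbctl's actual implementation), a check along these lines could flag such sections. The function name sections_below_minimum and the threshold of 2 are assumptions made for illustration; the JSON is copied from the comment above.

```python
import json

# JSON as printed by `dbctl -s eqiad section es3 get` in the comment above.
SECTION_JSON = """
{
    "es3": {
        "flavor": "external",
        "master": "es1017",
        "min_replicas": 1,
        "readonly": false,
        "ro_reason": "PLACEHOLDER"
    },
    "tags": "datacenter=eqiad"
}
"""


def sections_below_minimum(config: dict, required: int = 2) -> list:
    """Return names of sections whose min_replicas is below `required`.

    Hypothetical helper: non-section keys such as "tags" hold plain
    strings, so anything that is not a dict is skipped.
    """
    return [
        name
        for name, body in config.items()
        if isinstance(body, dict) and body.get("min_replicas", 0) < required
    ]


if __name__ == "__main__":
    print(sections_below_minimum(json.loads(SECTION_JSON)))
```

Run against the output above, the sketch reports es3, since its min_replicas value is 1.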
Marostegui merged task T245031: Degraded RAID on db1095 into T244958: db1095 backup source crashed: broken BBU.
Wed, Feb 12, 3:32 PM · ops-eqiad, Operations
Marostegui merged T245031: Degraded RAID on db1095 into T244958: db1095 backup source crashed: broken BBU.
Wed, Feb 12, 3:31 PM · ops-eqiad, Operations, DBA
Marostegui added a comment to T244238: Upgrade and restart m1 master (db1135).

Email: https://lists.wikimedia.org/pipermail/wikitech-l/2020-February/093063.html

Wed, Feb 12, 1:54 PM · Wikimedia-Etherpad, DBA, Operations
Marostegui added a comment to T244238: Upgrade and restart m1 master (db1135).

Let's aim for Thursday 20th at 09:00AM UTC?

Cool to me, send some invites this way! :-D

Wed, Feb 12, 1:47 PM · Wikimedia-Etherpad, DBA, Operations
Marostegui added a comment to T244238: Upgrade and restart m1 master (db1135).

Let's aim for Thursday 20th at 09:00AM UTC?

Wed, Feb 12, 1:44 PM · Wikimedia-Etherpad, DBA, Operations
Marostegui added a comment to T244566: Upgrade and restart m3 (phabricator) master (db1128).

The 12th (the original date I suggested) has passed. @mmodell, is there any tentative date you'd like to consider? There is no rush really, I just want to organize myself around this :-)
Thank you!

Wed, Feb 12, 1:41 PM · Operations, DBA, Phabricator, Release-Engineering-Team (Development services)
Marostegui added a comment to T244238: Upgrade and restart m1 master (db1135).

@jcrespo @akosiaris any tentative date?

Wed, Feb 12, 1:40 PM · Wikimedia-Etherpad, DBA, Operations
Marostegui added a comment to T241109: wikibugs needs restart almost everyday.

I have had to restart it again just now :(

Wed, Feb 12, 1:21 PM · Operations, Wikibugs
Marostegui updated the task description for T239453: Remove partitions from revision table.
Wed, Feb 12, 1:20 PM · DBA
Marostegui moved T244884: Implement logic to be able to perform full and incremental backups of ES hosts from Triage to In progress on the DBA board.
Wed, Feb 12, 10:50 AM · Patch-For-Review, Goal, Operations, DBA
Marostegui moved T244958: db1095 backup source crashed: broken BBU from Triage to In progress on the DBA board.
Wed, Feb 12, 9:09 AM · ops-eqiad, Operations, DBA
Marostegui added a comment to T243800: gerritro user getting access denied from gerrit1002.

@Marostegui I made some changes to make the db_user and db_pass configurable for gerrit. The thing is, I just don't know the clear-text version of the hashed password for 'gerritro'. I took a look at the relevant m2-master behind dbproxies, db1132, and I see

Wed, Feb 12, 8:44 AM · Patch-For-Review, Operations, Gerrit
Marostegui added a comment to T241109: wikibugs needs restart almost everyday.

I have given this another restart and it has now started showing comments again.

Wed, Feb 12, 8:38 AM · Operations, Wikibugs
Marostegui merged task T244960: Degraded RAID on db1095 into T244958: db1095 backup source crashed: broken BBU.
Wed, Feb 12, 8:34 AM · ops-eqiad, Operations
Marostegui merged T244960: Degraded RAID on db1095 into T244958: db1095 backup source crashed: broken BBU.
Wed, Feb 12, 8:34 AM · ops-eqiad, Operations, DBA
Marostegui added a comment to T237466: Remove unused custom fields from Netbox.

Ignore that last action, wrong tab :)

Wed, Feb 12, 8:34 AM · SRE-tools, DC-Ops, netbox
Marostegui reopened T244960: Degraded RAID on db1095 as "Open".
Wed, Feb 12, 8:34 AM · ops-eqiad, Operations
Marostegui merged task T244960: Degraded RAID on db1095 into T237466: Remove unused custom fields from Netbox.
Wed, Feb 12, 8:33 AM · ops-eqiad, Operations
Marostegui merged T244960: Degraded RAID on db1095 into T237466: Remove unused custom fields from Netbox.
Wed, Feb 12, 8:33 AM · SRE-tools, DC-Ops, netbox
Marostegui renamed T244958: db1095 backup source crashed: broken BBU from db1095 backup source crashed to db1095 backup source crashed: broken BBU.
Wed, Feb 12, 8:32 AM · ops-eqiad, Operations, DBA
Marostegui assigned T244958: db1095 backup source crashed: broken BBU to jcrespo.
time=07:55
description=POST Error: 313-HPE Smart Storage Battery 1 Failure - Battery Shutdown Event Code: 0x0400. Action: Restart system. Contact HPE support if condition persists.
Wed, Feb 12, 8:31 AM · ops-eqiad, Operations, DBA
Marostegui added a comment to T244958: db1095 backup source crashed: broken BBU.

And the BBU is gone:

root@db1095:~# hpssacli  controller all show detail | grep -i battery
   No-Battery Write Cache: Disabled
   Battery/Capacitor Count: 0
Wed, Feb 12, 7:58 AM · ops-eqiad, Operations, DBA
Marostegui added a comment to T244958: db1095 backup source crashed: broken BBU.

It rebooted itself:

[07:57:01]  <+icinga-wm>	RECOVERY - Host db1095 is UP: PING OK - Packet loss = 0%, RTA = 0.25 ms
Wed, Feb 12, 7:57 AM · ops-eqiad, Operations, DBA
Marostegui added a comment to T244958: db1095 backup source crashed: broken BBU.

Looks storage related:

/system1/log1/record9
  Targets
  Properties
    number=9
    severity=Caution
    date=02/12/2020
    time=07:43
    description=Smart Storage Battery failure (Battery 1, service information: 0x0A). Action: Gather AHS log and contact Support
  Verbs
    cd version exit show
Wed, Feb 12, 7:54 AM · ops-eqiad, Operations, DBA
Marostegui created T244958: db1095 backup source crashed: broken BBU.
Wed, Feb 12, 7:52 AM · ops-eqiad, Operations, DBA
Marostegui added a comment to T242702: Test MariaDB 10.4 in production.

I have started today with weight 20 instead of weight 11 as it had yesterday.

Wed, Feb 12, 7:03 AM · DBA
Marostegui added a comment to T107610: Setup separate logical External Store for Flow in production.

It is also not clear to me who needs it, I thought it was you who was sort of leading this :-)

Implementing it, yes. But if no one actually needs it, I'd rather find something more important to do.

Wed, Feb 12, 6:55 AM · Growth-Team, DBA, Operations, WorkType-Maintenance, StructuredDiscussions
Marostegui added a comment to T240772: Prepare and check storage layer for ngwikimedia.

I have sanitized both databases, and it looks fine now.
Is it possible to create another user to make sure the triggers are applied correctly?

Wed, Feb 12, 6:47 AM · Data-Services, cloud-services-team (Kanban), DBA
Marostegui claimed T240772: Prepare and check storage layer for ngwikimedia.
Wed, Feb 12, 6:28 AM · Data-Services, cloud-services-team (Kanban), DBA
Marostegui moved T244463: Decommission dbproxy1001.eqiad.wmnet from Backlog to pending onsite steps (eqiad) on the decommission board.

Ready for DC-OPs to finish its decommissioning.

Wed, Feb 12, 6:26 AM · Operations, decommission, ops-eqiad
Marostegui reassigned T244463: Decommission dbproxy1001.eqiad.wmnet from Marostegui to Jclark-ctr.
Wed, Feb 12, 6:25 AM · Operations, decommission, ops-eqiad
Marostegui added a comment to T107610: Setup separate logical External Store for Flow in production.

@Marostegui does this task block setting up the new servers? Or does it become easier if done during that setup?
It's still not clear to me who needs this and by when (cf T106363#5495185).
(Sorry for missing your ping in December.)

Wed, Feb 12, 6:10 AM · Growth-Team, DBA, Operations, WorkType-Maintenance, StructuredDiscussions

Tue, Feb 11

Marostegui updated the task description for T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes.
Tue, Feb 11, 8:35 AM · DBA, User-jbond, Puppet, Operations
Marostegui added a comment to T242702: Test MariaDB 10.4 in production.

Given that yesterday the host responded well with weight 1 (0.06% of traffic) and 5 (0.3%), I have pooled it with weight 10 for today (0.6%).

Tue, Feb 11, 8:18 AM · DBA
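
The figures in the T242702 comments (weight 1 ≈ 0.06%, 5 ≈ 0.3%, 10 ≈ 0.6%, and later weight 100 ≈ 6%) are consistent with traffic being split in proportion to weight. A rough sketch of that arithmetic follows. The total section weight is not stated anywhere in the feed: ASSUMED_TOTAL_WEIGHT is inferred from "weight 1 = 0.06%", and traffic_share is a hypothetical helper. The sketch also ignores that changing one host's weight changes the total slightly.

```python
# Assumption: total weight of the section, inferred from the quoted
# "weight 1 = 0.06% of traffic". It is not a value reported in the source.
ASSUMED_TOTAL_WEIGHT = 1 / 0.0006  # roughly 1667


def traffic_share(weight: float, total_weight: float) -> float:
    """Fraction of traffic for a host, assuming a proportional split by weight."""
    if total_weight <= 0:
        raise ValueError("total_weight must be positive")
    return weight / total_weight


if __name__ == "__main__":
    for w in (1, 5, 10, 100):
        share = traffic_share(w, ASSUMED_TOTAL_WEIGHT)
        print(f"weight {w}: {share:.2%}")
```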
Marostegui updated the task description for T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes.
Tue, Feb 11, 7:51 AM · DBA, User-jbond, Puppet, Operations
Marostegui updated the task description for T243052: Productionize es1020-es1025, es2020-es2025.
Tue, Feb 11, 7:39 AM · DBA
Marostegui updated the task description for T244463: Decommission dbproxy1001.eqiad.wmnet.
Tue, Feb 11, 7:18 AM · Operations, decommission, ops-eqiad
Marostegui updated the task description for T244463: Decommission dbproxy1001.eqiad.wmnet.
Tue, Feb 11, 6:56 AM · Operations, decommission, ops-eqiad
Marostegui updated the task description for T231280: Remove grants for the old dbproxy hosts from the misc databases.
Tue, Feb 11, 6:53 AM · DBA
Marostegui added a comment to T231280: Remove grants for the old dbproxy hosts from the misc databases.

Grants removed in db1135 with replication

root@db1135.eqiad.wmnet[(none)]> select user from mysql.user where host like '10.64.0.165';
+--------------+
| user         |
+--------------+
| bacula       |
| bacula9      |
| bloguser     |
| bugs         |
| contacts     |
| designate    |
| etherpadlite |
| haproxy      |
| librenms     |
| pdns         |
| pdns_admin   |
| rddmarc      |
| rt           |
+--------------+
Tue, Feb 11, 6:52 AM · DBA
Marostegui updated the task description for T244463: Decommission dbproxy1001.eqiad.wmnet.
Tue, Feb 11, 6:45 AM · Operations, decommission, ops-eqiad
Marostegui updated the task description for T232446: Compress new Wikibase tables.
Tue, Feb 11, 6:09 AM · DBA

Mon, Feb 10

Marostegui added a comment to T244238: Upgrade and restart m1 master (db1135).

For reference, the similar maintenance performed at T244209 resulted in 74 seconds of downtime.

Mon, Feb 10, 3:41 PM · Wikimedia-Etherpad, DBA, Operations
Marostegui added a comment to T244566: Upgrade and restart m3 (phabricator) master (db1128).

For reference, the similar maintenance performed at T244209 resulted in 74 seconds of downtime.

Mon, Feb 10, 3:40 PM · Operations, DBA, Phabricator, Release-Engineering-Team (Development services)
Marostegui updated the task description for T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes.
Mon, Feb 10, 3:07 PM · DBA, User-jbond, Puppet, Operations
Marostegui closed T244209: Upgrade and restart m5 master (db1133), a subtask of T239791: DB: perform rolling restart of mariadb daemons to pick up CA changes, as Resolved.
Mon, Feb 10, 3:07 PM · DBA, User-jbond, Puppet, Operations
Marostegui closed T244209: Upgrade and restart m5 master (db1133) as Resolved.

This has been done.
Downtime was:
15:01:13 - 15:02:27

Mon, Feb 10, 3:07 PM · cloud-services-team, wikitech.wikimedia.org, DBA, Operations
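
The downtime windows quoted in this feed can be checked with a few lines of arithmetic. This is a small sketch: downtime_seconds is a hypothetical helper, and it assumes both timestamps fall on the same day.

```python
from datetime import datetime


def downtime_seconds(start: str, end: str) -> int:
    """Seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


if __name__ == "__main__":
    # m5 master (db1133), T244209
    print(downtime_seconds("15:01:13", "15:02:27"))
    # m3 master (db1128), T244566
    print(downtime_seconds("06:01:19", "06:02:15"))
```

The two windows come out at 74 and 56 seconds respectively, matching the figures given in the comments.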
Marostegui added a comment to T244677: Weekly phabricator-reports mail cronjob broken since January 2020.

Nor the monthly one. There's no January stats email; the last one is from December.

Mon, Feb 10, 10:40 AM · Operations, Mail, Phabricator, Regression
Marostegui triaged T244696: Remove deprecated status options from grafana in mariadb 10.4 as Medium priority.
Mon, Feb 10, 9:16 AM · DBA
Marostegui moved T244696: Remove deprecated status options from grafana in mariadb 10.4 from Triage to Backlog on the DBA board.
Mon, Feb 10, 9:16 AM · DBA
Marostegui created T244696: Remove deprecated status options from grafana in mariadb 10.4.
Mon, Feb 10, 9:09 AM · DBA
Marostegui added a comment to T237026: Page creation log cannot be viewed from oldest records, Fatal: "execution time limit of 60 seconds was exceeded".

db1107 (10.4), cloned from db1089 (which uses the bad index, times), keeps choosing times, hence the bad plan.

Mon, Feb 10, 8:20 AM · MW-1.35-notes (1.35.0-wmf.8; 2019-11-26), mariadb-optimizer-bug, DBA, Core Platform Team Workboards (Clinic Duty Team), Wikimedia-production-error, Performance Issue, MediaWiki-Logging
Marostegui added a comment to T236376: SELECT /* Title::getFirstRevision */ sometimes using page_user_timestamp index instead of page_timestamp.

For what it's worth, db1107 (10.4) was recloned from db1089... and db1107 shows the correct plan (uses page_timestamp) whereas db1089 shows the wrong plan (page_user_timestamp).
So maybe the optimizer is smarter on 10.4 for this particular query (or could be just coincidence) on this particular host. We'll see once we've migrated entirely.

Mon, Feb 10, 7:21 AM · Core Platform Team Workboards (Clinic Duty Team), mariadb-optimizer-bug, DBA
Marostegui updated the task description for T239453: Remove partitions from revision table.
Mon, Feb 10, 6:08 AM · DBA

Fri, Feb 7

Marostegui added a comment to T243963: es1019: reseat IPMI.

John found this: https://www.dell.com/support/article/es/es/esbsdt1/sln316859/idrac7-idrac8-idrac-unresponsive-or-sluggish-performance?lang=en which is an update from May 2019, so maybe we should try it.
Should we schedule a maintenance window to get it updated?

Fri, Feb 7, 5:36 PM · DC-Ops, Operations, ops-eqiad, DBA
Marostegui added a comment to T243963: es1019: reseat IPMI.

Thanks - IPMI is back. I will take it from here.
Thank you!

Fri, Feb 7, 5:13 PM · DC-Ops, Operations, ops-eqiad, DBA