
drmrs: upgrade routers & switches (2026)
Closed, Resolved · Public

Description

To solve both {T414788} and T413181 (asw1-b12-drmrs stopped reporting metrics), upgrade the following devices:

  • mr1-drmrs (to 23.4R2-S4 or more recent)
  • cr1-drmrs (to 23.4R2-S4 or more recent)
  • cr2-drmrs (to 23.4R2-S4 or more recent)
  • asw1-b12-drmrs (to 23.4R2-S7 or more recent)
  • asw1-b13-drmrs (to 23.4R2-S7 or more recent)
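The "or more recent" targets above can be checked mechanically. The following is a minimal sketch, assuming Junos release strings follow the `<major>.<minor>R<release>[-S<service>]` shape seen in this task; the helper names `parse_junos` and `meets_minimum` are hypothetical and not part of any existing tooling, and other release types are not handled.

```python
import re

# Assumed shape of a Junos release string, e.g. "23.4R2-S7" or "23.4R2".
# Anything after the service-release number (such as a build suffix) is ignored.
_JUNOS_RE = re.compile(r"^(\d+)\.(\d+)R(\d+)(?:-S(\d+))?")


def parse_junos(version: str) -> tuple[int, int, int, int]:
    """Return (major, minor, release, service) for a Junos version string."""
    m = _JUNOS_RE.match(version.strip())
    if m is None:
        raise ValueError(f"unrecognized Junos version: {version!r}")
    major, minor, release, service = m.groups()
    # A missing "-S<n>" part is treated as service release 0.
    return int(major), int(minor), int(release), int(service or 0)


def meets_minimum(running: str, minimum: str) -> bool:
    """True if `running` is the same as or newer than `minimum`."""
    return parse_junos(running) >= parse_junos(minimum)


# Minimum targets copied from the task description.
MINIMUMS = {
    "mr1-drmrs": "23.4R2-S4",
    "cr1-drmrs": "23.4R2-S4",
    "cr2-drmrs": "23.4R2-S4",
    "asw1-b12-drmrs": "23.4R2-S7",
    "asw1-b13-drmrs": "23.4R2-S7",
}
```

For example, `meets_minimum("23.4R2-S7", MINIMUMS["cr1-drmrs"])` is true, while `meets_minimum("23.4R2-S4", MINIMUMS["asw1-b12-drmrs"])` is false. How the running version is collected from each device is left out on purpose.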

Event Timeline

Restricted Application added a subscriber: Aklapper.
ayounsi triaged this task as Medium priority. Mon, Feb 9, 3:37 PM

Icinga downtime and Alertmanager silence (ID=45e4c583-8cc2-4f44-b2d8-b0953da180ed) set by ayounsi@cumin1003 for 2:00:00 on 3 host(s) and their services with reason: router upgrade

cr1-drmrs,cr1-drmrs IPv6,cr1-drmrs.mgmt

Mentioned in SAL (#wikimedia-operations) [2026-02-10T14:24:31Z] <XioNoX> cr1-drmrs> request vmhost reboot - T416441

Icinga downtime and Alertmanager silence (ID=4116c622-b06d-4dc3-83e0-708605d2590f) set by ayounsi@cumin1003 for 2:00:00 on 20 host(s) and their services with reason: Switch upgrade

bast6003.wikimedia.org,cp[6001,6003,6005,6007,6009,6011,6013,6015].drmrs.wmnet,dns6001.wikimedia.org,doh6001.wikimedia.org,durum6001.drmrs.wmnet,ganeti[6001,6003].drmrs.wmnet,hcaptcha-proxy6001.wikimedia.org,install6003.wikimedia.org,lvs[6001,6003].drmrs.wmnet,ncredir6001.drmrs.wmnet,tcp-proxy6001.drmrs.wmnet
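The host list above uses a folded bracket notation. As a rough sketch for cross-checking the "20 host(s)" count, the expression can be expanded as follows. This assumes only comma-separated items inside a single bracket group per name, as seen in this task; ranges such as `6001-6003` are not handled, and `expand_hosts` is a hypothetical helper, not the tool that generated the message.

```python
import re


def expand_hosts(expr: str) -> list[str]:
    """Expand a folded host expression like 'cp[6001,6003].drmrs.wmnet,dns6001.wikimedia.org'."""
    hosts = []
    # Split on commas that are outside of square brackets.
    for part in re.split(r",(?![^\[]*\])", expr):
        m = re.fullmatch(r"([^\[\]]*)\[([^\]]+)\](.*)", part)
        if m is None:
            # Plain hostname, no bracket group to expand.
            hosts.append(part)
            continue
        prefix, items, suffix = m.groups()
        for item in items.split(","):
            hosts.append(f"{prefix}{item}{suffix}")
    return hosts


print(expand_hosts("lvs[6001,6003].drmrs.wmnet,dns6001.wikimedia.org"))
# ['lvs6001.drmrs.wmnet', 'lvs6003.drmrs.wmnet', 'dns6001.wikimedia.org']
```

Applied to the full expression above, `len(expand_hosts(...))` gives 20, matching the downtime message.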

Mentioned in SAL (#wikimedia-operations) [2026-02-10T14:41:38Z] <XioNoX> asw1-b12-drmrs> request system reboot - T416441

Icinga downtime and Alertmanager silence (ID=c138fd0b-7ec5-4888-ba82-07c202fe09f5) set by ayounsi@cumin1003 for 1:00:00 on 3 host(s) and their services with reason: router upgrade

cr2-drmrs,cr2-drmrs IPv6,cr2-drmrs.mgmt

Mentioned in SAL (#wikimedia-operations) [2026-02-10T15:02:47Z] <XioNoX> cr2-drmrs> request vmhost reboot - T416441

Icinga downtime and Alertmanager silence (ID=accf5635-89d8-4d71-8aaa-9929ce216d2c) set by ayounsi@cumin1003 for 1:00:00 on 19 host(s) and their services with reason: Switch upgrade

cp[6002,6004,6006,6008,6010,6012,6014,6016].drmrs.wmnet,dns6002.wikimedia.org,doh6002.wikimedia.org,durum6002.drmrs.wmnet,ganeti[6002,6004].drmrs.wmnet,hcaptcha-proxy6002.wikimedia.org,lvs6002.drmrs.wmnet,ncredir6002.drmrs.wmnet,netflow6001.drmrs.wmnet,prometheus6002.drmrs.wmnet,tcp-proxy6002.drmrs.wmnet

Icinga downtime and Alertmanager silence (ID=b46dbf23-ece7-47a6-a4a1-d021c4bef5c7) set by ayounsi@cumin1003 for 1:00:00 on 3 host(s) and their services with reason: router upgrade

asw1-b13-drmrs,asw1-b13-drmrs IPv6,asw1-b13-drmrs.mgmt

Icinga downtime and Alertmanager silence (ID=19b65563-0cd8-4d77-a6d7-66b2c8af497a) set by ayounsi@cumin1003 for 1:00:00 on 3 host(s) and their services with reason: router upgrade

mr1-drmrs,mr1-drmrs IPv6,mr1-drmrs.oob

Mentioned in SAL (#wikimedia-operations) [2026-02-10T15:41:36Z] <XioNoX> mr1-drmrs> request system reboot - T416441

ayounsi claimed this task.

All done.

Mentioned in SAL (#wikimedia-operations) [2026-02-10T16:09:00Z] <sukhe@cumin1003> START - Cookbook sre.dns.admin DNS admin: pool site drmrs [reason: work done, T416441]

Mentioned in SAL (#wikimedia-operations) [2026-02-10T16:09:21Z] <sukhe@cumin1003> END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool site drmrs [reason: work done, T416441]