
magru: upgrade routers & switches (2026)
Closed, Resolved · Public

Description

To solve both T414788 and, hopefully, T412143 (~5k logs/sec from netdev), upgrade:

  • mr1-magru (to 23.4R2-S4 or more recent)
  • cr1-magru (to 23.4R2-S4 or more recent)
  • cr2-magru (to 23.4R2-S4 or more recent)
  • asw1-b3-magru (to 23.4R2-S7 or more recent)
  • asw1-b4-magru (to 23.4R2-S7 or more recent)

Related Objects

Status    Subtype  Assigned  Task
Open               None
Resolved           ayounsi

Event Timeline

Restricted Application added a subscriber: Aklapper.
ayounsi triaged this task as Medium priority. Feb 9 2026, 3:39 PM

Scheduled for Tuesday, February 17th 2026, 13:00-16:00 UTC.

Mentioned in SAL (#wikimedia-operations) [2026-02-17T13:00:34Z] <ayounsi@cumin1003> START - Cookbook sre.dns.admin DNS admin: depool site magru [reason: no reason specified, T416442]

Mentioned in SAL (#wikimedia-operations) [2026-02-17T13:00:46Z] <ayounsi@cumin1003> END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool site magru [reason: no reason specified, T416442]

Icinga downtime and Alertmanager silence (ID=a08d08c2-5950-4fed-a93a-c3c2670c6e3e) set by ayounsi@cumin1003 for 2:00:00 on 2 host(s) and their services with reason: router upgrade

cr1-magru,cr1-magru IPv6

Icinga downtime and Alertmanager silence (ID=7f778211-928f-4b3a-b850-c48eecb6889b) set by ayounsi@cumin1003 for 2:00:00 on 2 host(s) and their services with reason: router upgrade

cr2-magru,cr2-magru IPv6

Icinga downtime and Alertmanager silence (ID=99c5c739-235a-4c04-8c28-d8b0ef056232) set by ayounsi@cumin1003 for 2:00:00 on 40 host(s) and their services with reason: Switches upgrade

bast7002.wikimedia.org,cp[7001-7016].magru.wmnet,dns[7001-7002].wikimedia.org,doh[7003-7004].wikimedia.org,durum[7003-7004].magru.wmnet,ganeti[7001-7004].magru.wmnet,hcaptcha-proxy[7001-7002].wikimedia.org,install7002.wikimedia.org,lvs[7001-7003].magru.wmnet,ncredir[7003-7004].magru.wmnet,netflow7002.magru.wmnet,prometheus7002.magru.wmnet,tcp-proxy[7001-7002].magru.wmnet,testvm7001.magru.wmnet
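
As a sanity check on the "40 host(s)" figure in the downtime entry above, the bracketed ranges can be expanded and counted. This is a minimal hand-rolled sketch, not the tooling that produced the list; the `expand` and `expand_all` helpers are hypothetical names, and the code assumes each expression holds at most one `[lo-hi]` range with no commas inside the brackets.

```python
import re

# Host expression copied from the downtime entry above.
HOSTS = (
    "bast7002.wikimedia.org,cp[7001-7016].magru.wmnet,"
    "dns[7001-7002].wikimedia.org,doh[7003-7004].wikimedia.org,"
    "durum[7003-7004].magru.wmnet,ganeti[7001-7004].magru.wmnet,"
    "hcaptcha-proxy[7001-7002].wikimedia.org,install7002.wikimedia.org,"
    "lvs[7001-7003].magru.wmnet,ncredir[7003-7004].magru.wmnet,"
    "netflow7002.magru.wmnet,prometheus7002.magru.wmnet,"
    "tcp-proxy[7001-7002].magru.wmnet,testvm7001.magru.wmnet"
)


def expand(expr: str) -> list[str]:
    """Expand one 'name[lo-hi].domain' expression into hostnames.

    Assumption: at most one bracketed numeric range per expression.
    """
    match = re.fullmatch(r"(.*)\[(\d+)-(\d+)\](.*)", expr)
    if match is None:
        return [expr]  # plain hostname, nothing to expand
    prefix, lo, hi, suffix = match.groups()
    width = len(lo)  # keep any zero padding used in the range
    return [f"{prefix}{n:0{width}d}{suffix}" for n in range(int(lo), int(hi) + 1)]


def expand_all(hosts: str) -> list[str]:
    """Expand a comma-separated list of host expressions."""
    return [name for expr in hosts.split(",") for name in expand(expr)]


if __name__ == "__main__":
    names = expand_all(HOSTS)
    print(len(names), "hosts")
```

The count matches the 40 hosts reported in the silence.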

Mentioned in SAL (#wikimedia-operations) [2026-02-17T13:23:19Z] <XioNoX> cr1-magru> request vmhost reboot - T416442

Mentioned in SAL (#wikimedia-operations) [2026-02-17T13:47:58Z] <XioNoX> cr2-magru> request vmhost reboot - T416442

Mentioned in SAL (#wikimedia-operations) [2026-02-17T14:03:56Z] <XioNoX> asw1-b3-magru> request system reboot - T416442

Icinga downtime and Alertmanager silence (ID=349bbdfb-74d0-421c-9a4c-07e586c71db9) set by ayounsi@cumin1003 for 2:00:00 on 3 host(s) and their services with reason: router upgrade

asw1-b3-magru,asw1-b3-magru IPv6,asw1-b3-magru.mgmt

Icinga downtime and Alertmanager silence (ID=709d9f76-c776-4510-a7e0-1c4545cf4710) set by ayounsi@cumin1003 for 2:00:00 on 3 host(s) and their services with reason: router upgrade

asw1-b4-magru,asw1-b4-magru IPv6,asw1-b4-magru.mgmt

Mentioned in SAL (#wikimedia-operations) [2026-02-17T14:23:28Z] <XioNoX> asw1-b4-magru> request system reboot - T416442

Mentioned in SAL (#wikimedia-operations) [2026-02-17T14:44:50Z] <XioNoX> mr1-magru> request system reboot - T416442

Mentioned in SAL (#wikimedia-operations) [2026-02-17T15:06:40Z] <sukhe@cumin1003> START - Cookbook sre.dns.admin DNS admin: pool site magru [reason: Xionix maint work done, T416442]

Mentioned in SAL (#wikimedia-operations) [2026-02-17T15:06:57Z] <sukhe@cumin1003> END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: pool site magru [reason: Xionix maint work done, T416442]

Mentioned in SAL (#wikimedia-operations) [2026-02-17T15:07:03Z] <sukhe@cumin1003> START - Cookbook sre.dns.admin DNS admin: pool site magru [reason: XioNoX: maint work done, T416442]

Mentioned in SAL (#wikimedia-operations) [2026-02-17T15:07:06Z] <sukhe@cumin1003> END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool site magru [reason: XioNoX: maint work done, T416442]
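
For reference, the length of the depool window and the spacing between reboots can be derived from the SAL timestamps above. This is a rough sketch for reading the timeline, with the timestamps copied by hand from the entries; the `parse` helper and variable names are illustrative only.

```python
from datetime import datetime


def parse(ts: str) -> datetime:
    """Parse a SAL-style UTC timestamp such as 2026-02-17T13:00:34Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


# Depool cookbook START and final (successful) pool cookbook END.
DEPOOL_START = parse("2026-02-17T13:00:34Z")
POOL_END = parse("2026-02-17T15:07:06Z")

# Reboot entries, in the order they were logged.
REBOOTS = [
    ("cr1-magru", parse("2026-02-17T13:23:19Z")),
    ("cr2-magru", parse("2026-02-17T13:47:58Z")),
    ("asw1-b3-magru", parse("2026-02-17T14:03:56Z")),
    ("asw1-b4-magru", parse("2026-02-17T14:23:28Z")),
    ("mr1-magru", parse("2026-02-17T14:44:50Z")),
]

# Total time the site was depooled.
window = POOL_END - DEPOOL_START

# Time elapsed between consecutive reboot commands.
gaps = [later[1] - earlier[1] for earlier, later in zip(REBOOTS, REBOOTS[1:])]

if __name__ == "__main__":
    print("depooled for", window)
    for (name, _), gap in zip(REBOOTS[1:], gaps):
        print(f"{name}: {gap} after previous reboot")
```

The site was depooled for roughly two hours and six minutes, within the scheduled 13:00-16:00 UTC window.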

ayounsi claimed this task.
ayounsi updated the task description.