Amsterdam maintenance (June 2020)
Closed, Resolved · Public

Description

We paused (potentially) disruptive work in esams/knams, but enough things have piled up that we should depool esams for ~2h and do all the work that can be done remotely.

So what needs to happen:

  • Reboot cr2-esams for T253970 and T245520
  • Upgrade cr3-esams for T243080
  • Upgrade cr3-knams for T243080 and T244497

Optionally upgrade asw2-esams for T252631, but this is very impactful (20 min of hard downtime for the site) and only fixes a cosmetic issue, so I suggest we keep it for another time.

Event Timeline

ayounsi created this task.
Restricted Application added a subscriber: Aklapper.

Scheduling it for this Wednesday, June 3rd, at 6am UTC: a 2h window for about 1h of work.

RobH mentioned this in Unknown Object (Task). Jun 2 2020, 6:57 PM

Change 601951 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Depool esams for network maintenance

https://gerrit.wikimedia.org/r/601951

Mentioned in SAL (#wikimedia-operations) [2020-06-03T05:38:11Z] <XioNoX> deactivate graceful-switchover on cr3-esams - T254021

Change 601951 merged by Ayounsi:
[operations/dns@master] Depool esams for network maintenance

https://gerrit.wikimedia.org/r/601951

Mentioned in SAL (#wikimedia-operations) [2020-06-03T05:51:13Z] <XioNoX> deactivate transit BGP to cr3-knams - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T06:08:18Z] <XioNoX> re-activate transit BGP to cr3-knams - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T07:00:19Z] <XioNoX> re0.cr2-esams> request system reboot both-routing-engines - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T07:09:06Z] <XioNoX> re-activate peering/transit BGP on cr2-esams - T254021

Change 602007 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Revert "Depool esams for network maintenance"

https://gerrit.wikimedia.org/r/602007

Change 602007 merged by Ayounsi:
[operations/dns@master] Revert "Depool esams for network maintenance"

https://gerrit.wikimedia.org/r/602007

Everything went smoothly, with no user impact.

T253970 and T244497 are still not solved.
T245520 is solved.

I also took the opportunity, with esams depooled and running >17.3, to apply T247073.
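
For reference, the SAL timestamps in the timeline give a rough idea of how long each step took. The snippet below is only a minimal sketch: the timestamps are copied from the SAL entries on this task, while the short action labels and the helper names (`parse_sal_timestamp`, `elapsed_between`) are made up for illustration and are not part of any existing tooling.

```python
from datetime import datetime, timezone

# (timestamp, label) pairs. Timestamps come from the SAL entries on this task;
# the labels are paraphrased summaries, not the verbatim log lines.
SAL_ENTRIES = [
    ("2020-06-03T05:38:11Z", "cr3-esams: graceful-switchover deactivated"),
    ("2020-06-03T05:51:13Z", "cr3-knams: transit BGP deactivated"),
    ("2020-06-03T06:08:18Z", "cr3-knams: transit BGP re-activated"),
    ("2020-06-03T07:00:19Z", "cr2-esams: reboot of both routing engines"),
    ("2020-06-03T07:09:06Z", "cr2-esams: peering/transit BGP re-activated"),
]


def parse_sal_timestamp(raw):
    """Parse a SAL-style UTC timestamp such as 2020-06-03T05:38:11Z."""
    return datetime.strptime(raw, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def elapsed_between(entries):
    """Return the time elapsed between each pair of consecutive entries."""
    times = [parse_sal_timestamp(ts) for ts, _ in entries]
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = elapsed_between(SAL_ENTRIES)
total = parse_sal_timestamp(SAL_ENTRIES[-1][0]) - parse_sal_timestamp(SAL_ENTRIES[0][0])

# Print each entry with the time elapsed since the previous one.
for (ts, label), gap in zip(SAL_ENTRIES[1:], gaps):
    print(f"{ts}  +{gap}  {label}")
print(f"first to last logged action: {total}")
```

By these timestamps, transit BGP on cr3-knams was deactivated for about 17 minutes, and roughly 1h31 passed between the first and last logged action. These figures only reflect when entries were logged, not exact device downtime.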