Amsterdam maintenance (June 2020)
Closed, Resolved (Public)

Description

We paused (potentially) disruptive work in esams/knams, but enough things have piled up that we should depool esams for ~2h and do all the work that can be done remotely.

So what needs to happen:

  • Reboot cr2-esams for T253970 and T245520
  • Upgrade cr3-esams for T243080
  • Upgrade cr3-knams for T243080 and T244497

Optionally upgrade asw2-esams for T252631, but this is very impactful (20 min of hard downtime for the site) and only fixes a cosmetic issue, so I suggest we keep it for another time.

Event Timeline

ayounsi triaged this task as Medium priority. May 29 2020, 4:14 PM
ayounsi created this task.
Restricted Application added a project: Operations. May 29 2020, 4:14 PM
Restricted Application added a subscriber: Aklapper.
CDanis added a subscriber: CDanis. May 29 2020, 4:15 PM
BBlack added a subscriber: BBlack. May 29 2020, 4:20 PM

Scheduling it for this Wednesday, June 3rd, at 6am UTC: a 2h window for about 1h of work.

RobH mentioned this in Unknown Object (Task). Jun 2 2020, 6:57 PM

Change 601951 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Depool esams for network maintenance

https://gerrit.wikimedia.org/r/601951

Mentioned in SAL (#wikimedia-operations) [2020-06-03T05:38:11Z] <XioNoX> deactivate graceful-switchover on cr3-esams - T254021

Change 601951 merged by Ayounsi:
[operations/dns@master] Depool esams for network maintenance

https://gerrit.wikimedia.org/r/601951

Mentioned in SAL (#wikimedia-operations) [2020-06-03T05:48:55Z] <XioNoX> depool esams - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T05:51:13Z] <XioNoX> deactivate transit BGP to cr3-knams - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T05:58:09Z] <XioNoX> reboot cr3-knams - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T06:08:18Z] <XioNoX> re-activate transit BGP to cr3-knams - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T07:00:19Z] <XioNoX> re0.cr2-esams> request system reboot both-routing-engines - T254021

Mentioned in SAL (#wikimedia-operations) [2020-06-03T07:09:06Z] <XioNoX> re-activate peering/transit BGP on cr2-esams - T254021

Change 602007 had a related patch set uploaded (by Ayounsi; owner: Ayounsi):
[operations/dns@master] Revert "Depool esams for network maintenance"

https://gerrit.wikimedia.org/r/602007

Change 602007 merged by Ayounsi:
[operations/dns@master] Revert "Depool esams for network maintenance"

https://gerrit.wikimedia.org/r/602007

Mentioned in SAL (#wikimedia-operations) [2020-06-03T07:15:57Z] <XioNoX> repool esams - T254021

ayounsi updated the task description. Jun 3 2020, 7:19 AM
ayounsi closed this task as Resolved. Jun 3 2020, 7:24 AM

Everything got done smoothly, no user impact.

T253970 and T244497 are still not solved.
T245520 is solved.

I also used the opportunity of having esams depooled and running >17.3 to apply T247073.
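
For reference, the SAL timestamps in the timeline above can be turned into rough durations for the depool and for the cr3-knams transit BGP deactivation. The sketch below is only an illustration: the dictionary keys and the `elapsed` helper are hypothetical names, and the SAL entries record when each action was logged, which is assumed here to approximate when it took effect.

```python
from datetime import datetime, timedelta

# SAL timestamps copied from the timeline above (all UTC).
# The dictionary keys are illustrative labels, not official event names.
SAL = {
    "depool esams": "2020-06-03T05:48:55Z",
    "deactivate transit BGP cr3-knams": "2020-06-03T05:51:13Z",
    "re-activate transit BGP cr3-knams": "2020-06-03T06:08:18Z",
    "repool esams": "2020-06-03T07:15:57Z",
}


def parse(ts: str) -> datetime:
    """Parse a SAL-style UTC timestamp such as 2020-06-03T05:48:55Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def elapsed(start: str, end: str) -> timedelta:
    """Time between two logged events (hypothetical helper)."""
    return parse(SAL[end]) - parse(SAL[start])


depooled = elapsed("depool esams", "repool esams")
bgp_down = elapsed(
    "deactivate transit BGP cr3-knams",
    "re-activate transit BGP cr3-knams",
)

print(depooled)  # 1:27:02
print(bgp_down)  # 0:17:05
```

By this estimate esams was depooled for about 1h27, within the planned 2h window.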