Creating this meta-task so that I don't forget anything I want to cover. It also gives us a task to associate actions like the rolling restart with, since they aren't tied to a specific ticket.
- (Fri Dec 17) get shell access && puppet-merge
- (Mon Dec 20) configure pwstore -> ban elastic1043 from the cluster -> ssh into the mgmt console of elastic1043 and perform a power cycle (it won't do anything because the node is borked); wdqs & wcqs deploys
- (Tue Dec 21) elasticsearch rolling restart (if we do eqiad or codfw it might break a dump, which isn't a huge deal, but we may want to just do cloudelastic for that reason)
- (Wed Dec 22) briefly go over the incident documentation process: https://wikitech.wikimedia.org/wiki/Incident_status
No pairing Thu Dec 23 (Ryan OOO)
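
For reference during the Dec 20 session, here is a rough sketch of what "banning" a node amounts to at the Elasticsearch API level: a transient cluster setting that excludes the node from shard allocation. This is only an illustration and not necessarily how we will do it in practice (we may well use existing tooling instead). The base URL `http://localhost:9200` and the name pattern `elastic1043*` are assumptions, and the node's name inside the cluster may not match its hostname exactly. The helper `build_ban_request` is hypothetical and only builds the request; it does not send anything.

```python
import json
import urllib.request


def build_ban_request(base_url: str, node_pattern: str) -> urllib.request.Request:
    """Build (but do not send) a PUT to _cluster/settings that excludes
    nodes whose name matches node_pattern from shard allocation."""
    body = {
        "transient": {
            # Standard Elasticsearch allocation-filtering setting.
            "cluster.routing.allocation.exclude._name": node_pattern,
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Assumed endpoint and node name pattern; adjust for the real cluster.
    req = build_ban_request("http://localhost:9200", "elastic1043*")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually apply it: urllib.request.urlopen(req)
```

Sending the same request with the setting's value set to `null` in the JSON body would lift the ban again.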