
Add a European mid-day SWAT window
Closed, Resolved · Public

Authored By
greg, Jun 16 2016

Description

Time slots

(remember, they're 'pinned' to SF/Pacific timezone so they change due to DST)

  • add a European window at 6am Pacific (13:00 UTC)
  • move the 8am (15:00 UTC) window to 11am (18:00 UTC)
  • keep the 4pm (23:00 UTC) window as is

(a SWAT every 5 hours from 13:00 to 23:00 UTC)
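Because the windows are pinned to Pacific wall-clock time, the UTC times above hold only while Pacific Daylight Time is in effect. A minimal sketch of that conversion, assuming the three proposed local times from the list and the IANA zone name America/Los_Angeles for "SF/Pacific" (the function and constant names are illustrative, not part of any deployment tooling):

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo

# Assumption: "SF/Pacific" corresponds to the IANA zone America/Los_Angeles.
PACIFIC = ZoneInfo("America/Los_Angeles")
UTC = ZoneInfo("UTC")

# Proposed SWAT windows as Pacific wall-clock times (6am, 11am, 4pm).
SWAT_WINDOWS_PACIFIC = [time(6, 0), time(11, 0), time(16, 0)]


def swat_windows_utc(day: date) -> list[str]:
    """Return the UTC start times (HH:MM) of the SWAT windows on a given day."""
    result = []
    for local_time in SWAT_WINDOWS_PACIFIC:
        # Attach the Pacific zone so the DST rules for that date apply.
        local = datetime.combine(day, local_time, tzinfo=PACIFIC)
        result.append(local.astimezone(UTC).strftime("%H:%M"))
    return result


# During daylight saving time (PDT, UTC-7):
print(swat_windows_utc(date(2016, 8, 22)))  # ['13:00', '18:00', '23:00']
# During standard time (PST, UTC-8) every window moves one hour later in UTC:
print(swat_windows_utc(date(2016, 12, 5)))  # ['14:00', '19:00', '00:00']
```

In standard time the 4pm window starts at 00:00 UTC on the following day, which is worth keeping in mind when reading a UTC-based calendar.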

Instigation of this task

As a follow-up, we quickly talked about opening a SWAT deployment window during European morning. I have poked the Releng private list about it; we will see about organizing / opening such a slot!

I'm currently working only European morning hours (and attending any meetings in SF morning hours), but I still have the capacity (and the privs!) to deploy, so a European morning SWAT deployment window would be great.

Event Timeline

greg created this task. Jun 16 2016, 3:25 PM
Restricted Application added subscribers: Zppix, Aklapper. Jun 16 2016, 3:25 PM

From my private email:

Yesterday during wmf.6 deploy, the Mobile Special:Nearby got broken. I was too tired to think/call a rollback.

Eventually US managed to get a fix and during European morning, the developer had a tested patch for wmf.6. It has been scheduled for the morning SWAT deploy window which is 6pm in Europe.

With the Mobile developers around, I took it upon myself to just push the patch to production (all went fine so far).

Talking to the mobile developers, they would be more than happy to be able to swat deploy during Europe morning. Either to finish US work or for their own changes.

So question for us: should we open SWAT slots for Europe mornings and potentially Europe early afternoon? We have ample ops coverage, and I don't mind training folks to do the deploy and having them enrolled in the European SWAT teams :D

I have posted to the private ops list about it pointing to this task and proposing 2pm CET / noon UTC.

RobH awarded a token. Jun 16 2016, 4:29 PM
RobH added a subscriber: RobH.

We have plenty of developers in Europe; I think this is a great idea.

Dereckson added a subscriber: Dereckson. Edited Jun 16 2016, 5:14 PM

If such a window is acceptable from an ops point of view, I can be available during this time slot.

I support this proposal.

Release-Engineering-Team is working on a proposal that slightly adjusts how we do SWAT and would include a European SWAT window. Most probably in the morning so it also covers India afternoon. Stay tuned :-)

greg triaged this task as Medium priority. Jun 30 2016, 6:19 PM
greg moved this task from INBOX to In-progress on the Release-Engineering-Team board.
greg added a comment. Edited Jul 6 2016, 11:08 PM

Here's my proposal for the new SWAT timeslots (remember, they're 'pinned' to SF/Pacific timezone):

  • add a European window at 6am Pacific (13:00 UTC)
  • move the 8am (15:00 UTC) window to 11am (18:00 UTC)
  • keep the 4pm (23:00 UTC) window as is

(a SWAT every 5 hours from 13:00 to 23:00 UTC)

We'll need to figure out who will be the inaugural European SWAT members. We might need to train some people for that.

greg renamed this task from Proposal: Add a European mid-day SWAT window to Add a European mid-day SWAT window. Jul 6 2016, 11:09 PM
Krinkle added a subscriber: Krinkle. Edited Jul 6 2016, 11:36 PM

I'd like to propose to change the SWAT process to require syncing to a canary server first.

We already require that a peer verify the fix after the fact. Instead of having this person verify the fix in production, have the peer verify it on the canary server using the WikimediaDebug extension in Chrome or Firefox.

The deployer simply stages it on tin and, instead of syncing directly, runs scap pull on mw1017 first. They then wait for verification before syncing it to the cluster.

We can consider removing the canary step after T121597 is resolved. Note that T121597 will also catch when syncing files in the wrong order (which manual canary testing typically doesn't catch).

greg added a comment. Jul 7 2016, 1:13 AM

@Krinkle sure, let's look into that, but it's tangential to this task.

greg added a comment. Jul 7 2016, 5:23 AM

Tangent:

I'd like to propose to change the SWAT process to require syncing to a canary server first.

Is https://wikitech.wikimedia.org/wiki/SWAT_deploys#Doing_the_deploy now reflecting what you had in mind? Feel free to fix things or file a separate task.

This comment was removed by JanZerebecki.
greg updated the task description. Jul 8 2016, 7:14 PM
greg added a comment. Jul 15 2016, 6:37 PM

For those not subscribed to the sub-task:

My instinct is not to start this next week for these reasons:

  • it's Friday and I still haven't announced it (though I have poked the ops@list once this week and once the week before, to warn them with no responses)
  • Antoine and Zeljko (the two members of RelEng who can be around at that time) will only be around for 1 week before they both go on "European-length summer vacations" :)
  • With the above: I don't have anyone other than @Dereckson lined up who affirmatively can do the window and has not expressed a desire for training. I'm worried about overloading Dereckson and/or not being able to cover a SWAT when there are requested patches. It's not wise to start a new service with a SPOF :)

With the above, and after briefly chatting with @hashar and @zeljkofilipin to sanity check, I don't plan to institute this until after they both return from vacation (the week of August 22nd).
Of course, I'm open to people commenting here who A) don't need extra training (iow: you've done a SWAT before) and B) are willing to join the SWAT team at least until Aug 22nd. If 2 people come forward who meet those criteria we can start it early.

Krinkle removed a subscriber: Krinkle. Jul 16 2016, 1:30 AM

After discussions with @greg-g and others, let's start it on the week of August 22nd. I will be back from vacation and all fresh to babysit / train up people. Ideally we would have some patches to actually deploy.

Seems the slot is going to be 13:00 UTC / 15:00 CEST.

@greg-g can you possibly update whatever template is used to create https://wikitech.wikimedia.org/wiki/Deployments and add the European SWAT slot?

Then I guess we can announce it the week prior, mail all folks, and have this task T137970 and sub-task T139544 marked as solved.

I am excited.

greg added subscribers: Jhernandez, Addshore.

From the subtask of who will do it:

Experienced deployers (can help train)
These are the names that will be initially in the list of SWATers

Need/want (more) training
These individuals can add themselves to the list of SWATers as they gain experience pairing with those above

greg added a comment. Jul 28 2016, 6:12 PM

Added week of August 22nd to the Deploy calendar:
https://wikitech.wikimedia.org/wiki/Deployments#Week_of_August_22nd

Added corresponding events/changed the morning SWAT in the Google calendar ("WMF Deployments"). Not sure how many people look there, but you can add it to your personal Google account by adding <wikimedia.org_rudis09ii2mm5fk4hgdjeh1u64@group.calendar.google.com>.

Next step is to announce it (will do next week, I think).

greg claimed this task. Jul 28 2016, 6:38 PM
Restricted Application added a project: User-greg. Jul 28 2016, 6:38 PM
greg moved this task from Backlog to In Progress on the User-greg board. Jul 28 2016, 6:38 PM
greg added a comment. Aug 3 2016, 10:42 PM

@Addshore I hear your morning SWAT activities are going well :) Can I add you to the list of names for the inaugural mid-day European SWAT members list at https://wikitech.wikimedia.org/wiki/Deployments#Week_of_August_22nd ?

aude added a subscriber: aude. Aug 4 2016, 12:38 AM

I can't do swat everyday, but probably can help some of the time with this new timeslot

greg added a comment. Aug 4 2016, 5:19 PM

I can't do swat everyday, but probably can help some of the time with this new timeslot

so to be explicit, should I just add your name to the "to be pinged" list then (like I did for addshore)?

aude added a comment. Aug 4 2016, 5:24 PM

I can't do swat everyday, but probably can help some of the time with this new timeslot

so to be explicit, should I just add your name to the "to be pinged" list then (like I did for addshore)?

Sure, please add me.

greg added a comment. Aug 4 2016, 5:28 PM

Sure, please add me.

{{done}} thanks!

greg updated the task description. Aug 4 2016, 5:38 PM
greg closed this task as Resolved. Aug 9 2016, 6:02 PM

Alright, this is announced. Now all we have to do is wait until August 22nd. Resolving.

I have created a lame survey at T143894 to get some feedback following the first week of the European SWAT window.