Page MenuHomePhabricator

SWAT Project (Tag)
Closed, DeclinedPublic

Description

There is a proposal to start organizing SWAT deployments on maniphest.

Reason
Managing the SWAT windows on the [[wikitech:Deployments]] page sucks. It's hard to edit and it's yet another place to manage (instead of in Phab where the tasks (and links to gerrit commits) live). Let's see if we can put it all in Phab.

Needed phab infrastructure

The process:

  1. There's a bug that gets fixed in master
  2. Someone decides it's worthy of backporting and SWATing out
  3. They create a separate (clean) task and add that to the MediaWiki-backport-deployments project
    • In that task description, you need:
      • Who will test this?
      • IRC nick?
      • Gerrit commit urls of the backports
      • Why is this being SWAT'd and not waiting for the train?
  4. They (the person doing the testing) assign the task to themselves and move the task into the appropriate MediaWiki-backport-deployments workboard column (either 15:00 UTC or 23:00 UTC)
  5. During the SWAT window the SWATer goes through that column in the order of their choosing.
  6. Normal SWAT window etiquette (be there, respond to pings, test post-deploy)
  7. Someone moves that ticket into the Done column when done.

Event Timeline

mmodell raised the priority of this task from to Needs Triage.
mmodell updated the task description. (Show Details)

@Krenair: that was me testing out a sprint, this is proposal for a real permanent project.

It looks like we will start organizing SWAT deployments on maniphest

Says who and why? Where was this discussed with SWAT deployers / people who request SWAT deploys / people who somehow unwillingly end up doing said deploys ;-)?

Rationale on why this is moving to phab is a good idea would be helpful as a first step.

Don't worry :) I just started an email thread with the people listed on Deployments page as SWATers after a suggestion from Jon R. Nothing decided yet, we just started the conversation.

When I have a moment (team offsite) I'll create a task with the summary.

greg triaged this task as Medium priority.May 19 2015, 9:06 AM
greg updated the task description. (Show Details)
greg set Security to None.
greg added a subscriber: demon.

(updated the description with the proposal as I understand it from the thread, feel free to comment on that proposal with suggestions).

Two things I notice that didn't make it here from the email discussion, so I'll mention them again for more discussion:

  1. In the email it sounded like an additional component to step 3 was to assign the task to the person who will be testing it, so it shows up on the workboard.
  2. I'm concerned that relying on the gerrit bot to dump links into the comments section of the task will lead to SWATters having to dig through the comments to find the right patches. People might have extra gerrit-bot comments from the extension updates (obscuring the core submodule update patches), might screw up the "Bug" footer (I still see experienced devs get this wrong :/ ), might screw up the patch and decide to abandon and resubmit, etc. And human comments on the task might confuse things too.

Two things I notice that didn't make it here from the email discussion, so I'll mention them again for more discussion:

  1. In the email it sounded like an additional component to step 3 was to assign the task to the person who will be testing it, so it shows up on the workboard.

Forgot about that, thanks!

  1. I'm concerned that relying on the gerrit bot to dump links into the comments section of the task will lead to SWATters having to dig through the comments to find the right patches. People might have extra gerrit-bot comments from the extension updates (obscuring the core submodule update patches), might screw up the "Bug" footer (I still see experienced devs get this wrong :/ ), might screw up the patch and decide to abandon and resubmit, etc. And human comments on the task might confuse things too.

yeah, probably not a bad idea to have them edit the description with the urls to the gerrit patches as well

  1. I'm concerned that relying on the gerrit bot to dump links into the comments section of the task will lead to SWATters having to dig through the comments to find the right patches. People might have extra gerrit-bot comments from the extension updates (obscuring the core submodule update patches), might screw up the "Bug" footer (I still see experienced devs get this wrong :/ ), might screw up the patch and decide to abandon and resubmit, etc. And human comments on the task might confuse things too.

Phabricator has a mechanism to link commits to a task that doesn't require editing the task or digging through comments on the task to find the links:

  1. go to the commit in diffusion and "edit maniphest tasks"
  2. or better yet, mention the task in the commit message with "ref Tnnn" and the task gets updated automatically.

See T96942 for an example where I added the commits to the task - this doesn't normally happen because we use the "Bug: Tnnn" convention in our commit messages which is a gerrit/bugzilla convention - we have been rejecting phabricator's conventions to our own disadvantage.

See also, this upstream commit for an example of the ref Tnnn usage, notice instead of simply "mentioning" the task, phabricator interprets it as "epriestley added a task:" and the corresponding task page gets updated with a link to the commit at the top rather than just in the transaction log.

Gerritbot would be mostly unnecessary if we would just follow phab conventions. Though, if I follow that logic to it's conclusion, gerrit would be unnecessary.

See T96942 for an example where I added the commits to the task - this doesn't normally happen because we use the "Bug: Tnnn" convention in our commit messages which is a gerrit/bugzilla convention - we have been rejecting phabricator's conventions to our own disadvantage.

Or advantage depending on how you see it. Using "ref T###" would break stuff like https://gerrit.wikimedia.org/r/#/q/bug:+96942,n,z.

@Legoktm: how would it break that?

  1. You could still say "Bug: refs T96942" and I believe that would achieve both gerrit search indexing and phab task linking.
  2. If you link the tasks in phabricator, then you can go to the task page to see the relevant commits and you wouldn't need the gerrit query to find that same information.
  1. go to the commit in diffusion and "edit maniphest tasks"

Do commits show up in Diffusion before they get merged in Gerrit? I've yet to see one.

@Anomie: no they don't show up before being merged.

Which means the Diffusion linking isn't really useful for this task until we start using Diffusion instead of Gerrit for code review.

well, that brings me to my next point... why use gerrit for code review at this point? I see no blockers for adopting differential ;)

Wouldn't that cause CI issues?

IMO: A code review tool is not the appropriate venue for branch porting requests and deployments.

Lets solve this one better than just an awkward kludge of manual workflows - the tools should perfectly match the problem rather than making the problem fit into the limitations of our tooling.

It's much easier to adjust the tooling than to waste everyone's time manually fiddling in gerrit and maniphest.

@Krenair: I think there are several solutions to CI issues - one is to make CI talk to phabricator, the alternative is to keep gerrit up stream from phab and make the final commit get pushed back to gerrit for CI treatment.

@Anomie: In situations where the branch change is just (1 or more) cherry picks from master, I don't see why we couldn't just reference the original commit on master and use a swat tool (which I will create) that will take care of automating the actual cherry pick and merge (by using the gerrit api, for now)

well, that brings me to my next point... why use gerrit for code review at this point? I see no blockers for adopting differential ;)

See T18: Plan to migrate code review from Gerrit to Phabricator. Lots of blockers listed there.

IMO: A code review tool is not the appropriate venue for branch porting requests and deployments.

You're right. A task tracker like Maniphest is better for such requests, or maybe Releeph except I can't find any documentation that says what that actually does.

But the patches to be backported and deployed need to be code reviewed, unless we're going to do an end-run around our whole code review and CI infrastructure.

Sucks that we can't yet use the code review tool (Differential) that's built to integrate with our task tracking system (Maniphest), but it is what it is.

@Anomie: In situations where the branch change is just (1 or more) cherry picks from master, I don't see why we couldn't just reference the original commit on master and use a swat tool (which I will create) that will take care of automating the actual cherry pick and merge (by using the gerrit api, for now)

That could certainly work for the common simple case. But it doesn't remove the need for actual code review on stuff like configuration changes that aren't cherry picks. And if git cherry-pick doesn't just work, it would probably be better for the requestor to resolve it rather than the SWATter who's likely not so familiar with the code involved.

That could certainly work for the common simple case. But it doesn't remove the need for actual code review on stuff like configuration changes that aren't cherry picks. And if git cherry-pick doesn't just work, it would probably be better for the requestor to resolve it rather than the SWATter who's likely not so familiar with the code involved.

So who does the code review of such changes? Is it done before the swat? If so then couldn't the person requesting the swat go ahead and get their change merged, that way the commits can link up in maniphest?

So who does the code review of such changes? Is it done before the swat? If so then couldn't the person requesting the swat go ahead and get their change merged, that way the commits can link up in maniphest?

Then we have undeployed changes sitting on the deployment branches, which might get accidentally deployed if someone else needs to push something out, would make it impossible to skip someone's patch if they weren't present on IRC during the SWAT window to test it, and in case of multiple changes to the same file would make it difficult to even know whose patch broke something.

@Anomie: I'm not sure I understand the last thing about multiple changes - the changes would be in separate commits just like they are in other situations. Merging them wouldn't prevent bisecting them. But anyway I won't argue with the rest of your reasoning about undeployed patches, I think those issues could be overcome but they are certainly valid.

On the other hand, I do have another thing to bring up for discussion: https://secure.phabricator.com/E973

Phabricator's calendar app might be the appropriate venue for organizing swats? It's been getting a lot of love upstream recently, and seems like it might be a good fit. Just food for thought, I'd love to hear what others think of this and any other suggestions, if anyone has any.

@Anomie: I'm not sure I understand the last thing about multiple changes - the changes would be in separate commits just like they are in other situations. Merging them wouldn't prevent bisecting them.

Bisecting isn't the problem. SWAT works by applying one patch at a time with the people responsible testing each one so it can be reverted right away if it obviously breaks stuff. While that could be changed to "shove out a whole stack of patches at once" as you suggest, that would be a more fundamental change than seems to be intended by this task about moving tracking of the list of SWAT patches from wikitech to phab.

On the other hand, I do have another thing to bring up for discussion: https://secure.phabricator.com/E973

Can't see it, wants a login.

Phabricator's calendar app might be the appropriate venue for organizing swats? It's been getting a lot of love upstream recently, and seems like it might be a good fit.

Personally I'm a bit skeptical of yet-another calendar app just as I am about Phab's yet-another messaging app, mainly due to the "yet another" aspect. But I haven't seen it so I can't say anything more specific.

@Anomie: Regarding swat, I didn't intend to suggest fundamentally changing the way swat works, you wouldn't have to deploy the patches all at once, you could do them one at a time. I'm simply trying to make the tools automate as much of the process as possible so that there is little opportunity for mistakes, so things run smoothly. Nothing more. Even an automated process can have a "pause" in between patches.

Regarding calendar: Thanks to the latest upstream updates it is essentially just like a maniphest task with a date and time. I tried to open up the calendar app here but that change got undone. Here's an example event on labs: https://phab-01.wmflabs.org/E2

I've been playing with releeph a little locally, it's actually closer to being usable than what I had previously thought.

releeph.png (565×1 px, 28 KB)

(No one else can actually see that, application is restricted to administrators)

hashar subscribed.

Seems @mmodell is leading this effort.

@Krenair: releeph should should be accessible now.

Y1 seems hidden to logged out users? And I can't seem to link it automatically in comments?

@Krenair: yeah, the application is designed by Facebook and not fully integrated into phabricator, so there are likely to be some rough edges.

And also when I got the email about it yesterday, https://phabricator.wikimedia.org/RQ1 was linked - but RQ1 does not get linked automatically either, and that URL just redirects to Y1?

The phabricator calendar app is another tool that might be suitable for organizing swat deploys.

Which one do you prefer?

Or both together?

Do we want to wait on this? The ideal workflow would happen in Differential I presume?

eg: simply proposing a backport to a live wmf.XX branch, Release-Engineering-Team and/or #swatters (I just made that up, to represent the group of people who can/volunteer to do SWATs) would be reviewers, and they approve/accept then deploy during a SWAT, the backlog would be open Diffs to live wmf.XX branches.

Or am I crazy?

(<insert rant about wanting Release-Engineering-Team and #swatters (whatever the name) to be restricted membership but without renaming them to ugly acl*blah>)

mmodell changed the task status from Open to Stalled.Apr 11 2016, 3:54 PM

@Luke081515: Probably both together.
@greg: yes ideally differential would be involved.

There wasn't much interest in this experiment. Closing.