Create a continuous integration plan for Wikimedia Phabricator patches
Closed, Declined | Public | 21 Estimated Story Points

Description

This task was created with the Sprint app as the main use case, but the solution must be equally applicable to any local patches.

Sprint is an experimental-stage application with known issues. It is undergoing rapid development and attempting to accommodate a variety of uses. A regular testing, review and deployment plan should be implemented to facilitate the continuous integration of Sprint with Phabricator main.

There is an important bug fix that has been sitting in Gerrit since 11 December with no action. (https://gerrit.wikimedia.org/r/#/c/179155/). A hotfix like this could and should have been handled promptly. If it is a question of concern about "making a bad situation worse", this is absolutely not the case with T78208. This is a very simple problem with an easy solution. Any reluctance to merge this change is not warranted. However, thorough code review is always appropriate.

Furthermore, Sprint is now at 0.6.1.9, and the additional patches will absolutely improve the user experience and address some of the key problems noted in the first go. When or if these changes will be merged is clearly not at the discretion of the developer, but with the maintainers of the production instance. I encourage those responsible to take this task to task.

Event Timeline

Christopher raised the priority of this task from to Needs Triage.
Christopher updated the task description.
Christopher changed Security from none to None.
Christopher subscribed.

While I agree that we need to find a way to integrate Sprint releases sooner and faster, it is good to take into account that last week we had the RT migration, which was our top priority, eating all of @chasemp's time. It was also a good reason to avoid introducing any other risk factors. For what it's worth, https://gerrit.wikimedia.org/r/#/c/179405/ for T493 and T76008 hasn't progressed much in the past week, basically for the same reasons.

Qgil triaged this task as Medium priority. Dec 22 2014, 8:22 AM
Qgil moved this task from To Triage to Need discussion on the Phabricator board.
Qgil renamed this task from Create a continuous integration plan for the Sprint extension to Create a continuous integration plan for Wikimedia Phabricator patches. Dec 29 2014, 12:36 PM
Qgil updated the task description.

I have edited the task to make it generic to any local patches. The Sprint app is affected by our lack of process, but it is not the only local development in this situation. @chasemp, @mmodell and (luckily!) a growing number of contributors would also benefit from a single, documented process. Let's define the solution for all cases and let's document it at https://www.mediawiki.org/wiki/Phabricator/Code

We had a casual discussion in our last team meeting a week ago. Some ideas for the defaults that should be enforced, unless there is an Unbreak Now emergency:

  • Patches should be deployed first in phab-01 or another public instance for testing.
  • Patches should go through code review, without self-merges.
  • Deployment of new patches should happen during the maintenance windows.

While this might look and might indeed be slower in the short term, it will lead to faster deployments and a stronger team of Wikimedia Phabricator contributors. Currently @chasemp is the only gatekeeper and he has to review and take responsibility at deploy time. The only feedback possible is to postpone the deployment and ask for feedback on IRC or some semi-random task. Not efficient, not fair, and it doesn't scale.

I would like to deploy the latest version of Sprint 0.6.2.7 to production. I have been proactive in implementing fixes in order to keep Sprint in sync with the latest upstream master, but the upstream changes are not backwards compatible and old methods are not deprecated, just eliminated. Thus it is not possible to update Sprint to 0.6.2.7 without updating phabricator master and libphutil master (to at least 3 Jan) as well. The test instance http://phab08.wmflabs.org is now current with phabricator revisions: https://secure.phabricator.com/D11214 and libphutil: https://secure.phabricator.com/D11188 (Jan 5 2015).

I think that it is not advisable to pull down any upstream changes without concurrent updates to all dependencies (patches and extensions). An unscheduled upstream pull now would break things (unfortunately because of the Sprint extension...). See T85060 for details. The first step in developing this process is determining how close to upstream http://phabricator.wikimedia.org would like to be.

I just want to emphasize that the coordination of a planned update cycle (during the maintenance window) with patch and extension updates is essential.

@Christopher: I agree, we need to work out a way to coordinate our efforts so that upstream changes are minimally disruptive to each of our workflows.

In attempting to deploy the security extension for T518 I ran into the same issues with upstream changes that are not backwards compatible.

We need to get these upstream changes integrated and I don't know how to best coordinate versions of each extension.

Maybe it would look something like this:

Once or twice each month we pull all upstream changes into a new branch of phabricator in our own repo with an ubuntu-style version number like phab-wm-15.01, and we keep corresponding branches of each extension that should be compatible. Then, a week after pulling the upstream branch, we deploy everything. That should give us time to fix compatibility issues introduced by the upstream changes.

This doesn't really seem ideal to me so I'm open to suggestions.
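To make the naming idea concrete, here is a minimal sketch of how such a branch name and its deploy date could be derived from the pull date. The function names and the `soak_days` parameter are hypothetical; only the `phab-wm-15.01` pattern and the one-week delay come from the proposal above.

```python
from datetime import date, timedelta


def snapshot_branch(pull_day: date, prefix: str = "phab-wm") -> str:
    """Return an Ubuntu-style branch name (two-digit year, two-digit month)."""
    return f"{prefix}-{pull_day:%y.%m}"


def planned_deploy(pull_day: date, soak_days: int = 7) -> date:
    """Return the deploy date, leaving soak_days to fix compatibility issues."""
    return pull_day + timedelta(days=soak_days)


if __name__ == "__main__":
    pull = date(2015, 1, 5)
    print(snapshot_branch(pull), planned_deploy(pull))
```

Note that with two pulls in the same month both snapshots would get the same name under this scheme, so a real plan would need a suffix or a day component.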

@chasemp, @Christopher: do either of you have any further thoughts?

We discussed this today in the release engineering team meeting. Not much has been decided except that we should definitely build out some sort of process for this.

Some prerequisites for continuous integration and deployment:

  1. automated testing of the wikimedia extensions to phabricator - perhaps the best way to do this is using the phabricator unit test engine.
  2. versioning scheme and branching plan
  3. production deployment plan

Apparently we want to run a set of tests when a patch is proposed to the Sprint extension to ensure it is going to work with phabricator/phabricator and phabricator/libphutil as they are deployed on Wikimedia production.

We can craft a Jenkins job that would clone the three repositories, check out a tag / branch tracking what is deployed, and apply the Sprint patchset. Then run whatever magic command manages to set up the test env and run the tests.

If you have a step-by-step tutorial on how to run the tests, that would be a good basis to copy-paste into a Jenkins job. From there we can elaborate to use Zuul cloner, which has all the logic to check out the appropriate branch / proposed patchset.

Looking at Sprint, the composer.json lists a phabricator version on github, which would defeat the idea of using phabricator as deployed on Wikimedia. No clue how to handle that part (i.e. ignore the composer.json provided version).
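As a rough illustration of that job, the sketch below only builds the list of commands and does not run them. Everything specific in it is an assumption: the repository URLs, the `production` ref, the patchset ref and the use of `arc unit` as the test command are placeholders, not the actual job configuration.

```python
from typing import Dict, List


def build_job_steps(
    repos: Dict[str, str],
    deployed_ref: str,
    patch_repo: str,
    patch_ref: str,
    test_cmd: List[str],
) -> List[List[str]]:
    """Return the commands a CI job would run, in order, without executing them."""
    steps: List[List[str]] = []
    # Clone every repository at the branch or tag that tracks what is deployed.
    for name, url in repos.items():
        steps.append(["git", "clone", "--branch", deployed_ref, url, name])
    # Fetch the proposed patchset and switch that working tree to it.
    steps.append(["git", "-C", patch_repo, "fetch", repos[patch_repo], patch_ref])
    steps.append(["git", "-C", patch_repo, "checkout", "FETCH_HEAD"])
    # Run the test command last (a placeholder supplied by the caller).
    steps.append(list(test_cmd))
    return steps


if __name__ == "__main__":
    # Hypothetical URLs and refs, for illustration only.
    example_repos = {
        "libphutil": "https://example.org/libphutil.git",
        "phabricator": "https://example.org/phabricator.git",
        "sprint": "https://example.org/sprint.git",
    }
    for step in build_job_steps(
        example_repos, "production", "sprint",
        "refs/changes/00/000000/1", ["arc", "unit"],
    ):
        print(" ".join(step))
```

Keeping the steps as data makes it easy to paste them into a Jenkins shell builder first and replace the clone logic with Zuul cloner later.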

Change 183094 had a related patch set uploaded (by Hashar):
phabricator job to run arc lint on all repo

https://gerrit.wikimedia.org/r/183094

Patch-For-Review

Aklapper raised the priority of this task from Medium to High. Jan 7 2015, 8:38 PM

Thanks Hashar for creating that Jenkins job!

As described earlier, development processes differ: The Sprint extension was/is targeting upstream git master, the Security extension was/is targeting our snapshot.

And currently deployment is "We reserve a 30min window every Wednesday to potentially deploy a new version" but that might not be sufficient for planning.

Was naively wondering earlier: Something like "Pull revision XY from upstream on every n'th Monday of the month, test on a Labs instance whether it works well with our custom extensions, and if yes deploy that (two days old) revision XY on the n'th Wednesday of that month to production" or such.
But the "two days on a Labs instance" part might be moot with that Jenkins job. So whatever the Release engineering team's recommendations are.
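For what such a cadence would look like on a calendar, here is a small sketch. The function names are made up for illustration. One assumption differs slightly from the wording above: the deploy date is computed as the pull date plus two days rather than as "the n'th Wednesday", because in a month that starts on a Tuesday or Wednesday the n'th Wednesday falls before the n'th Monday.

```python
import calendar
from datetime import date, timedelta
from typing import Tuple


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n'th given weekday (Monday=0) of a month, n counted from 1."""
    first = date(year, month, 1)
    # Days from the 1st of the month to the first occurrence of the weekday.
    offset = (weekday - first.weekday()) % 7
    day = first + timedelta(days=offset + 7 * (n - 1))
    if day.month != month:
        raise ValueError("month has no such weekday occurrence")
    return day


def update_cycle(year: int, month: int, n: int) -> Tuple[date, date]:
    """Return (pull date, deploy date): n'th Monday and the Wednesday after it."""
    pull = nth_weekday(year, month, calendar.MONDAY, n)
    return pull, pull + timedelta(days=2)


if __name__ == "__main__":
    print(update_cycle(2015, 1, 2))
```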

Thanks Hashar and Andre. Just to clarify. I have been using Scrutinizer for a build which runs the tests and gives a coverage report. In the scrutinizer.yml, the details for the test config are indicated. The composer.json file points to whatever versions of phabricator and libphutil that I push into the github repos that are used for the build. This could be changed to point to the wikimedia fork of phabricator on github.

@Christopher: I didn't realize that only master gets replicated to github. I'll merge production into master so you can follow that if that helps. But I wonder if there is a way to get other branches pushed to github?

Actually, the production branch does get pushed to github. The real problem is that a composer.json file has to be in the tree for the build package to work.

Maybe you could add a composer.json to your branch? Something like the one here: https://github.com/christopher-johnson/phabricator/blob/master/composer.json

Actually, the production branch is on github: https://github.com/wikimedia/phabricator-phabricator/tree/production and that includes the latest upstream code as of a few hours ago. That is the version we will push into production as soon as there is an opportunity to do so. I'll need to coordinate on this with @chasemp.

@Christopher: Can you test your code against the tip of the production branch? Assuming everything is ok then push your own code into a branch named production and tag it with a release tag like release/yyyy-mm-dd/1
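A minimal sketch of that tag format, in case it helps with scripting releases. It assumes the trailing number is a per-day sequence counter, which is a guess from the `release/yyyy-mm-dd/1` pattern; the function names are hypothetical.

```python
from datetime import date
from typing import Iterable


def release_tag(day: date, seq: int = 1) -> str:
    """Format a tag as release/yyyy-mm-dd/<seq>."""
    return f"release/{day.isoformat()}/{seq}"


def next_release_tag(day: date, existing: Iterable[str]) -> str:
    """Return the next unused tag for a day, given the tags that already exist."""
    prefix = f"release/{day.isoformat()}/"
    # Collect the sequence numbers already used on this day.
    used = [
        int(tag[len(prefix):])
        for tag in existing
        if tag.startswith(prefix) and tag[len(prefix):].isdigit()
    ]
    return f"{prefix}{max(used, default=0) + 1}"


if __name__ == "__main__":
    print(next_release_tag(date(2015, 1, 12), ["release/2015-01-12/1"]))
```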

Also, when you have changes that are intended to go into production, please submit a change to gerrit with /refs/for/production/ and CC myself (username: 20after4 in gerrit) and Chase (username: rush in gerrit). I will gladly provide +2 reviews in a timely manner, but we can't put code into production if it hasn't at least received a cursory review by someone on the wikimedia phabricator team.

Do I still trigger the request for deployment with a change to operations/puppet/manifests/role/phabricator.pp? If we are synching changes with tags, only one change, probably yours, to puppet will get merged.

I noted in T78243 that the tag and branch have been created as suggested. From my end, I think that Sprint is ready to go now, but I welcome a code review prior to merging.

@Christopher: We will deploy this as soon as chase has the time to do so. You don't need to make the change to puppet, chase or I can take care of that part.

Another aspect: Testing whether bots (Gerritbot etc) break: T89967

chasemp lowered the priority of this task from High to Medium. Mar 11 2015, 9:11 PM

Change 183094 abandoned by Hashar:
phabricator job to run arc lint on all repo

https://gerrit.wikimedia.org/r/183094

@Aklapper Title mentions "Create a plan" - and plans should be documented so they can be followed... but DWYW...

Aklapper removed a project: Zuul.

@Paladox: Reverting latest edits as Zuul might be one outcome but not necessarily the outcome. Personal opinions should be expressed in a comment to keep the task description a summary.

Oh sorry. Here's my personal opinion

We should develop a Zuul like interface and system for phabricator.

The reasons are that Zuul made it easy to add multiple jobs to one project and to see which tests are running for which project. It also supported spreading jobs across the nodes. It also allowed us to have a repo where users could submit patches for the tests they wanted and someone from releng reviewed them, which made everything easy and open source. In phabricator none of this is possible. Zuul also had an easy view in the web browser whereas phabricator's builds did not.

Because it will definitely be hard to look at running builds.

greg lowered the priority of this task from Medium to Low. Aug 1 2019, 11:10 PM

This task hasn't had any activity in years, its purpose is unclear to me, and I have no idea what should be achieved to complete it. If there is still any interest, please reopen and amend the task description with something we can act on.