
[EPIC] (WIP) End-to-end tests and deploys
Closed, Invalid (Public)

Description

This is a Work In Progress proposal that the Release-Engineering-Team is working through. It is not done and should not be considered an official plan at this time.

End-to-end tests and deploys

General hope

That we can find a way to leverage the browser (end-to-end) tests to increase confidence in what we are deploying, beyond what they provide now.

Current status

We have some end-to-end tests that run on patch submission thanks to Dan's work on the https://integration.wikimedia.org/ci/job/mwext-mw-selenium/ job. These are generally voting and provide useful/actionable feedback to developers.

But we also have our "daily" browser tests (which are, I believe, a superset of the ones that run on patch submission). These run against the Beta Cluster, which itself introduces problems (code updating during a test run, for one). They are also impossible to tie directly to a specific problematic commit.

Proposals

Generally: use the browser tests as targeted runs against the group0 wikis after the new branch has been cut. Test failures for core or any extension during the run will be met with escalation to project owners and potentially, barring resolution of the failure by the owner, some sort of rollback action (details of this depend on how we refactor the branching process).

All proposals have these pre-reqs:

  • Increase adoption of end-to-end tests pre-merge (voting, mwext-mw-selenium)
  • Keep daily browser tests
    • Continue the work on developer self-maintenance of the tests
  • Set up enough testwikis (iow: expand group0 with more testwikis) to thoroughly test centralauth/sessionmanager/authmanager interwiki behavior.
  • Get to a place where we can consistently run all e2e tests against group0 without failure
  • Move group0 deploy to Monday. Keep group1/group2 where they are (Wed/Thurs).
    • A sub-proposal here is to move group0 to Thursday the week before. Main downside is it would increase the age of code being deployed to production by about 5 days.

Proposal 1

The "Freeze until Green" proposal. This is the simplest proposal wrt where we are now.

  1. Cut new branch as we do now
  2. Deploy to group0 on Monday
  3. Run all e2e tests
  4. If any test fails, freeze until all are green
    1. Fixes are driven by code owners, they can revert or fix if needed
  5. After green, go to group1 and group2

Positive: this puts the onus on the developers to fix this fast. They have roughly 48 hours (if we deploy Monday morning to group0 and then have group1 scheduled for Wednesday).
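The steps of Proposal 1 can be sketched roughly as below. This is only an illustration of the gating logic, not an existing tool: `TrainState`, `advance_train`, and the `run_e2e_tests` callback are all hypothetical names, and the sketch collapses group1 and group2 into a single promotion step even though they are deployed on different days.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TrainState:
    """Hypothetical record of where the new branch is currently deployed."""
    deployed_groups: List[str]
    frozen: bool = False


def advance_train(
    state: TrainState,
    run_e2e_tests: Callable[[], Dict[str, bool]],
) -> List[str]:
    """Run the e2e tests against group0 and promote only if all are green.

    `run_e2e_tests` is an assumed callback returning {test_name: passed}.
    Returns the sorted names of failing tests (empty if the train advanced).
    """
    results = run_e2e_tests()
    failing = sorted(name for name, passed in results.items() if not passed)

    if failing:
        # Step 4: any failure freezes the train; code owners revert or fix.
        state.frozen = True
        return failing

    # Step 5: everything is green, so continue to group1 and group2.
    state.frozen = False
    for group in ("group1", "group2"):
        if group not in state.deployed_groups:
            state.deployed_groups.append(group)
    return []
```

Calling `advance_train` again after the code owners have landed a fix or revert re-runs the tests, so the freeze lifts only once a full run is green.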

Proposal 2

The long-lived branch version of the above.

Main difference is that we need to figure out a way of not pulling problematic code from the group0 branch to group1. This is hard because of cross-repo dependencies (common ones are MobileFrontend and Flow, VE and Flow, etc). I've yet to hear a sane proposal here (not that one doesn't exist; please do share if you have one!).

Proposal 3

  • all of the same pre-reqs as above but,
  • we don't pull new updates into group0 from repos whose tests are failing; to even get that far, they have to have their basic tests passing.
    • This is basically the requirement that we need if we start to allow some projects to be post-merge reviewed. So, really, the requirement is: all your tests passing AND all code is reviewed.
  • Then we pull in those green repos into group0
  • Run the end-to-end tests as before
  • Figure out a way of pulling only those repos that don't break the end-to-end tests into group1 and group2
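The entry requirement for group0 in Proposal 3 (all tests passing AND all code reviewed) could look something like the sketch below. `RepoStatus` and `repos_eligible_for_group0` are hypothetical names made up for illustration, and the sketch deliberately ignores cross-repo dependencies, which Proposal 2 already flags as the hard part.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RepoStatus:
    """Hypothetical per-repo summary; the field names are assumptions."""
    name: str
    tests_passing: bool   # the repo's own basic tests are green
    fully_reviewed: bool  # no unreviewed (post-merge review) code pending


def repos_eligible_for_group0(repos: Iterable[RepoStatus]) -> List[str]:
    """Return names of repos whose new updates may be pulled into group0.

    A repo qualifies only if its tests pass AND all of its code is reviewed.
    """
    return [r.name for r in repos if r.tests_passing and r.fully_reviewed]
```

For example, given a core repo that is green and reviewed, one extension with failing tests, and another with unreviewed code, only the core repo would be returned. The later step of pulling only non-breaking repos into group1 and group2 is not covered here.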