QA Support for Deployment Train

The Deployments page sets the working framework for the QA workflow around deployment. The details and the amount of QA work differ from team to team, depending on a team's workflow, its projects and even the phase of a project; what follows is an outline of typical QA activities for a weekly deploy cadence.

Beta cluster - the start

A patch's journey to production starts with the beta cluster (see more details on what the beta cluster is here). For a (simplified, of course) example of a patch lifecycle, let's take a look at the following patch - GrowthExperiments patch (the corresponding phab task is T287636).
At a high level, the sequence of events that moves any patch to production is the following:
(1) patch is not merged
(2) patch is merged
(3) patch is deployed to production

Some details on the steps:
(1) patch is not merged. The first screenshot shows that the word "merged" is absent. It means that the patch cannot be tested in betalabs yet. The options to test the patch are to set up your local environment and check out the patch, or to use patch demo.
(2) patch is merged. Now the patch is merged - the word "merged" is present in a section of the phab task. At this point (or some time after) the related phab task may display the production version to which the patch will be deployed. The gerrit link shows that the patch was "included" (merged) in "master", i.e. the fix or feature is now on the beta cluster and can be checked there.
(3) patch is deployed to production. Finally, the patch is deployed to production - fixes or features can be checked on the production version (the tag shows the exact version). For this specific example, it also shows that the patch was "cherry-picked", i.e. it was deployed during one of the Backport windows.
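
The three stages above can also be written down as a small lookup, which may help when deciding where a given fix can be checked. This is only an illustrative sketch: the names PatchStage, WHERE_TO_TEST and patch_stage, and the two boolean flags, are hypothetical and are not read from Gerrit or Phabricator.

```python
from enum import Enum


class PatchStage(Enum):
    """The three stages of the simplified patch lifecycle."""
    NOT_MERGED = "patch is not merged"
    MERGED = "patch is merged"
    DEPLOYED = "patch is deployed to production"


# Where a fix or feature can be checked at each stage, as described in the steps.
WHERE_TO_TEST = {
    PatchStage.NOT_MERGED: ["local environment", "patch demo"],
    PatchStage.MERGED: ["beta cluster"],
    PatchStage.DEPLOYED: ["production"],
}


def patch_stage(merged: bool, deployed_to_production: bool) -> PatchStage:
    """Map two hypothetical flags onto the three stages."""
    if not merged:
        return PatchStage.NOT_MERGED
    if deployed_to_production:
        return PatchStage.DEPLOYED
    return PatchStage.MERGED
```

For example, WHERE_TO_TEST[patch_stage(merged=True, deployed_to_production=False)] gives the beta cluster as the place to check.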

(1) patch is not merged
Screen Shot 2021-07-31 at 8.59.17 AM.png (464×2 px, 105 KB)
(2) patch is merged
Screen Shot 2021-07-31 at 1.53.58 PM.png (426×2 px, 94 KB)
Screen Shot 2021-07-31 at 9.00.07 AM.png (1×3 px, 414 KB)
Screen Shot 2021-07-31 at 2.28.54 PM.png (762×2 px, 270 KB)
(3) patch is deployed to production
Screen Shot 2021-07-31 at 8.59.41 AM.png (1×3 px, 465 KB)

There might be times when it's useful to see all patches that will be (or were) deployed. Adding the deployment and project tag(s) to the Phabricator Advanced Search form displays all phab tasks that have patches associated with those tags.

Screen Shot 2021-07-31 at 3.19.58 PM.png (611×1 px, 140 KB)

Another option is to filter gerrit for a production version and an extension, e.g. GrowthExperiments patches for wmf.16.
One more note on testing in betalabs - there is a cutoff time before production deployment. In many cases it's worth checking features and bug fixes after the cutoff time to ensure that the deployment to production won't bring surprises.
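
As a rough illustration of the gerrit filter mentioned above, the sketch below builds a search string and a REST query URL for one extension and one train version. It rests on several assumptions: that the extension lives in a project named mediawiki/extensions/<name>, that the train branch is named wmf/<release>-<version> (here wmf/1.37.0-wmf.16), and that the Gerrit base URL is the one shown. The helper names are hypothetical, and the filter behind the link in the original post may differ.

```python
from urllib.parse import urlencode

# Assumption: base URL of the Gerrit instance.
GERRIT_BASE = "https://gerrit.wikimedia.org/r"


def gerrit_query(extension: str, train_version: str, release: str = "1.37.0") -> str:
    """Build a Gerrit search string for one extension on one train branch.

    The project and branch naming patterns are assumptions, not taken
    from the original post.
    """
    project = f"mediawiki/extensions/{extension}"
    branch = f"wmf/{release}-{train_version}"
    return f"project:{project} branch:{branch}"


def gerrit_changes_url(query: str) -> str:
    """Wrap the search string in a Gerrit REST 'changes' query URL."""
    return f"{GERRIT_BASE}/changes/?{urlencode({'q': query})}"


query = gerrit_query("GrowthExperiments", "wmf.16")
url = gerrit_changes_url(query)
```

The same search string can be pasted into the Gerrit search box; no network request is made by this sketch.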

Deployment cadence

The weekly deployment cadence, i.e. the weekly deployment schedule (see Deployments), presents detailed information on what will be deployed and when. The schedule lists the times for the deployment train events and backport windows:

Screen Shot 2021-07-31 at 2.58.39 PM.png (1×2 px, 342 KB)

The first of the two links marked "Imported links" on the screenshot above points to the list of all patches (grouped by extension) included in MediaWiki 1.37/wmf.17. The second link points to the phab task Blockers: task T281158, where the issues blocking the deployment train are reported. Another helpful page for following deployment updates is Wikimedia MediaWiki versions. The screenshot below shows that right now all groups are on the wmf.16 version:
Screen Shot 2021-07-31 at 4.29.27 PM.png (1×3 px, 420 KB)

Based on the above, my own QA deployment schedule would look like this:

Monday - Checking betalabs
Tuesday - group 0. Checking testwiki
Wednesday - group 1. Checking specific wikis - Commons, Hebrew, Catalan
Thursday - group 2. Checking what needs to be checked
Deployment days - Monitor logstash (see the note below)

Note on monitoring logstash: the Growth team has a workflow - Growth Team chores. If one of the deployment days is my chore day, then on that day I'll do more thorough monitoring of logstash (and other activities).
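
The weekly schedule above can be kept as a small lookup table, for instance to print the day's plan at the start of a shift. This is a sketch only: DayPlan, QA_SCHEDULE and plan_for are hypothetical names, and the entries simply restate the schedule.

```python
from typing import NamedTuple, Optional, Tuple


class DayPlan(NamedTuple):
    group: Optional[str]     # train group deployed that day, None for Monday
    checks: Tuple[str, ...]  # what gets checked


# The personal QA schedule from the list above.
QA_SCHEDULE = {
    "Monday": DayPlan(None, ("betalabs",)),
    "Tuesday": DayPlan("group 0", ("testwiki",)),
    "Wednesday": DayPlan("group 1", ("Commons", "Hebrew", "Catalan")),
    "Thursday": DayPlan("group 2", ("what needs to be checked",)),
}


def plan_for(day: str) -> Optional[DayPlan]:
    """Return the plan for a weekday, or None if nothing is scheduled."""
    return QA_SCHEDULE.get(day)
```

Calling plan_for("Wednesday") returns the group 1 entry with the three wikis to check; a day outside the schedule, such as "Friday", returns None.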

Monday - Checking betalabs

  • check after a cutoff time (if needed).
  • watch for regression and review new features/fixes that will be deployed to testwiki
  • check that all features/fixes that are planned for deployment are included in the train (the phabricator tag is in place)

Tuesday - group 0. Checking testwiki
Why is it important to test testwiki?

  • the obvious: it's the first point for production deployment; good to check for feature deployments and for any regression issues
  • more user rights can be assigned to my test users than on usual wikis
  • I can create/modify content on testwiki without impact on actual users.
  • sometimes features get deployed to testwiki a week or more before being deployed to actual wikis to give more time for production-like testing or getting feedback from ambassadors

Wednesday - group 1. Check specific wikis - Commons, Hebrew, Catalan
The testing on group 1 wikis is limited. I check:

  • Commons - I work with the Structured Data team, and checking in production is a vital part of their team workflow.
  • Catalan wikipedia - because it has Flow features.
  • Hebrew wikipedia - for two reasons: 1) it is the first RTL (right-to-left language) wikipedia to be deployed and 2) it has Growth Experiments features deployed on it.

Thursday - group 2. All remaining wikipedias are deployed. Check whatever needs to be checked.

Each team (and QA, of course) should evaluate the risks of deploying features and implement specific monitoring measures. The issues found in production fall into the category of "unplanned work", as it was dramatically defined in The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win (by Gene Kim et al.):

Like matter and antimatter, in the presence of unplanned work, all planned work ignites with incandescent fury, incinerating everything around it. [...]
Unlike the other categories of work, unplanned work is recovery work, which almost always takes you away from your goals. That’s why it’s so important to know where your unplanned work is coming from.

With good QA processes on planned work (feature testing, regression testing, and testing in production), the scorching effects of unplanned work, i.e. finding bugs late in production, will be kept to a minimum.

Happy testing!

Written by Etonkovidova on Aug 1 2021, 2:55 AM.