
Make visual regression tests run in CI (non-blocking) for the Vector repo
Open, Medium, Public

Description

The easiest way to use visual regression tests is to have a computer run them for you automatically. This task proposes adding a non-blocking CI job to the Vector repo that runs on each patch set. If that goes well, other repos could be added.

It would run something like the following, which compares master against the change applied on top of master:

./pixel.js reference && ./pixel.js -c <Change-Id of Gerrit Patch>

The preceding command launches several Docker containers (e.g. visual regression test container, a container with MediaWiki, a container with the database) and runs the visual regression tests. If that command exits with 0, the job passes. If it exits with something else, it fails. However, failures shouldn't block merging and should only be seen as flags that need discretionary review.
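As a rough sketch of the pass/fail handling described above, a small shell wrapper could record the exit status and report it without ever failing the job. The function name run_non_blocking and the messages it prints are made up for illustration; if the CI system offers its own non-voting setting, that would likely be the better place to do this.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command, report its result, never fail the job.
run_non_blocking() {
    # Run the given command and branch on its exit status.
    if "$@"; then
        echo "visual regression: PASS"
    else
        status=$?
        # A non-zero exit is only a flag for discretionary review.
        echo "visual regression: NEEDS REVIEW (exit $status)"
    fi
    # Always return 0 so the result stays advisory.
    return 0
}
```

It could wrap the command above, for example run_non_blocking sh -c './pixel.js reference && ./pixel.js -c "$CHANGE_ID"', where CHANGE_ID is a hypothetical environment variable holding the Change-Id.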

Pixel already runs a similar command for commits made to its own repo.

Acceptance Criteria

  • For each Gerrit patch set made to Vector, CI runs the command above, which takes a visual diff of master against the change(s) on top of master.
  • The job should provide a publicly accessible link to the generated report (e.g. see https://pixel.wmcloud.org/desktop/) where one can review the screenshots taken.
  • If the change includes dependencies (Depends-On) that Pixel doesn't support, the job can be skipped.
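A minimal sketch of how a job might detect a Depends-On footer before deciding to skip. The helper name has_depends_on is hypothetical, and this only checks whether any dependency is declared; which dependencies Pixel actually supports is not specified here, so a real check would need to be narrower.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the commit message on stdin
# contains a non-empty Depends-On footer line.
has_depends_on() {
    grep -q '^Depends-On:[[:space:]]*[^[:space:]]'
}
```

A job could then start with something like: if git log -1 --format=%B | has_depends_on; then echo "Depends-On found, skipping"; exit 0; fi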

Event Timeline

nray updated the task description.
hashar added a subscriber: hashar.

For each Gerrit patch set made to Vector, CI runs a command that takes a visual diff of master against that change/changes on top of master.

Although our current CI is terribly dated and legacy, we have all the logic to do that. The flow is roughly:

  • Zuul listens for Gerrit events and triggers jobs
  • The jobs are defined in Jenkins
  • A job runs a Docker image which has the whole execution environment (MariaDB, php, composer, nodejs, npm, etc.)
  • The image runs Quibble, the MediaWiki test runner (https://doc.wikimedia.org/quibble/), which holds the logic to run things

It is easy for us to add a new job that would run npm run pixel, for example, so most of the logic would be delegated to the repo / the npm package.

We have a few use cases that compare against the parent of the change being tested, though with different ad hoc implementations. Usually it boils down to something like:

  • run the command against the patch, saving the result to an artifacts directory
  • check out the parent commit (HEAD^1)
  • run the command again
  • diff the two results

Order might vary ;)
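The steps above could be sketched roughly like this in shell. This is not one of the existing implementations; compare_with_parent, its arguments, and the patch.txt / parent.txt file names are all invented for illustration, and it assumes the command writes its result to standard output.

```shell
#!/bin/sh
# Sketch: run a command on the patch and on its parent, then diff the outputs.
# Usage: compare_with_parent <artifacts-dir> <command> [args...]
compare_with_parent() {
    artifacts=$1
    shift
    mkdir -p "$artifacts"

    # Remember where we started (branch name, or commit hash if detached).
    orig=$(git symbolic-ref -q --short HEAD || git rev-parse HEAD)

    # 1. Run the command against the patch and save the result.
    "$@" > "$artifacts/patch.txt" 2>&1

    # 2. Check out the parent commit.
    git checkout -q HEAD^1 || return 2

    # 3. Run the same command against the parent.
    "$@" > "$artifacts/parent.txt" 2>&1

    # Go back to the patch before diffing.
    git checkout -q "$orig"

    # 4. Diff the results: diff exits 0 if identical, 1 if they differ.
    diff -u "$artifacts/parent.txt" "$artifacts/patch.txt"
}
```

The artifacts directory should live outside the tracked tree (or be untracked) so that checking out the parent does not disturb it.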

The job should provide a publicly accessible link to the report generated where one can review the screenshots taken.

Upon completion of the job, the CI system will report back to Gerrit with a link to the job :)

If the change includes dependencies that Pixel doesn't support, the job can be skipped.

Not quite sure about this one.

Thank you for your response @hashar. I have a follow-up question below (I'm also happy to meet with you if that's easier):

Although our current CI is terribly dated and legacy, we have all the logic to do that. The flow is roughly:

  • Zuul listens for Gerrit events and triggers jobs
  • The jobs are defined in Jenkins
  • A job runs a Docker image which has the whole execution environment (MariaDB, php, composer, nodejs, npm, etc.)
  • The image runs Quibble, the MediaWiki test runner (https://doc.wikimedia.org/quibble/), which holds the logic to run things

It is easy for us to add a new job that would run npm run pixel, for example, so most of the logic would be delegated to the repo / the npm package.

Our visual regression tool is currently a CLI that launches a set of Docker containers extended from MediaWiki-Docker (e.g. there is a Docker container to run MediaWiki with the LocalSettings.php we want, a Docker container for MariaDB with seed data that the tests use, etc.). Is it possible to launch these Docker containers within our CI job? The process to use the tool would look something like this:

  1. npm install -g pixel
  2. pixel reference && pixel test --change <Change-Id of Gerrit Patch> --output <path-to-report>. These commands launch the Docker containers, take screenshots of the page before and after the change, and output a static report directory containing an index.html and screenshots of the results (e.g. see https://pixel.wmcloud.org/desktop/index.html). I was envisioning this report folder would be the artifact produced from the job.