[GOAL] Leverage Automated Visual Regression Testing
Closed, Resolved · Public
Assigned To

Authored By

	SCherukuwada
	Feb 21 2022, 8:51 PM

Description

Goal definition

Specific: What do we want to achieve?

Automated UI regression testing alerting us when regressions have occurred.

Measurable: How will we know when we've reached our goal?

We will know we're done when the team is notified about our first UI regression (either an error introduced intentionally to prove that the system works, or an accidental one).

Achievable: What support will we need to achieve our goal?

Should be answered by T304634.

Relevant: Is this goal worthwhile?

Yes. Automated UI regression testing will catch many more important issues than manual testing. This will reduce regressions and reduce friction between teams due to bugs.

Background

There is a complex interplay between MediaWiki core code, skins, gadgets, extensions, and other customizations that can make it difficult to anticipate the full effect of a change to code maintained by the Foundation. Some gadgets and CommonJS snippets live outside of our codebases but are tightly coupled to the behaviour of code in MediaWiki. As a result, seemingly innocuous changes sometimes cause visible problems once they are rolled out to all the wikis, in spite of due diligence and manual testing before release.

Hypothesis

At some point, given the number of people contributing code to MediaWiki and the complexity of all the mechanisms that allow customization, we need mechanisms other than due diligence and manual testing to ensure that we aren’t unknowingly breaking any significant functionality. The proposal in this document is predicated on this hypothesis being true. One method that might help us detect visual regressions earlier and more efficiently is automated visual regression testing.

Technical Problems to Solve

Visual regression testing, at its very minimum, compares a specified version (likely a release candidate, or a git client with a feature under development; let's call it the "test version") with another version as a baseline (likely a previous release candidate or whatever is running in production), runs through certain scenarios, and reports on any visual differences spotted between the baseline and the test version. While this basic premise is simple enough, some details deserve more attention.
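The basic premise above can be sketched in a few lines of Python. This is only an illustration, not the tool the team adopted: screenshots are modelled as plain nested lists of RGB tuples so that the example runs without any imaging library, and the names `ScenarioResult`, `compare`, and `report` are hypothetical.

```python
from dataclasses import dataclass

Pixel = tuple[int, int, int]   # (R, G, B)
Image = list[list[Pixel]]      # rows of pixels


@dataclass
class ScenarioResult:
    scenario: str
    changed_pixels: int
    total_pixels: int

    @property
    def passed(self) -> bool:
        return self.changed_pixels == 0


def pixel_count(image: Image) -> int:
    return sum(len(row) for row in image)


def compare(scenario: str, baseline: Image, test: Image) -> ScenarioResult:
    """Compare the baseline screenshot of one scenario with the test version."""
    same_shape = len(baseline) == len(test) and all(
        len(b_row) == len(t_row) for b_row, t_row in zip(baseline, test)
    )
    if not same_shape:
        # Assumption: a change in dimensions counts as a full difference.
        total = max(pixel_count(baseline), pixel_count(test))
        return ScenarioResult(scenario, total, total)
    changed = sum(
        b != t
        for b_row, t_row in zip(baseline, test)
        for b, t in zip(b_row, t_row)
    )
    return ScenarioResult(scenario, changed, pixel_count(baseline))


def report(results: list[ScenarioResult]) -> list[str]:
    """List only the scenarios in which a visual difference was spotted."""
    return [
        f"{r.scenario}: {r.changed_pixels}/{r.total_pixels} pixels differ"
        for r in results
        if not r.passed
    ]


if __name__ == "__main__":
    WHITE, BLACK = (255, 255, 255), (0, 0, 0)
    baseline = [[WHITE, WHITE], [WHITE, WHITE]]
    test = [[WHITE, BLACK], [WHITE, WHITE]]
    results = [
        compare("Main_Page at 1280px", baseline, test),
        compare("Special:Search at 1280px", baseline, baseline),
    ]
    print(report(results))
```

An exact comparison like this one flags every changed pixel, which is why the details discussed in the following sections matter.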

Running at Scale

The most important objective of a visual regression test is to assure the initiator of the test that their release is unlikely to break any features when it gets launched. To this end, the test needs to run at scale, on potentially hundreds of pages, depending on the desired scale and the modality of the test. A simple way of getting started here would be to use Chrome and Selenium WebDriver.
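As a rough sketch of what running at scale with Chrome and Selenium WebDriver could look like, the Python example below builds one capture job per page and viewport and runs the jobs in a thread pool. The viewport list, the `/wiki/` URL layout, and the function names are assumptions made for illustration; `capture_with_chrome` additionally assumes that the `selenium` package and a local Chrome are installed, so the capture function is passed in as a parameter and can be replaced with a stub.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import Callable
from urllib.parse import quote

Job = tuple[str, int, int]  # (url, viewport width, viewport height)

# Hypothetical viewports: one desktop and one mobile size.
VIEWPORTS = [(1280, 800), (375, 667)]


def build_jobs(base_url: str, titles: list[str], viewports=VIEWPORTS) -> list[Job]:
    """Create one job per (page, viewport) combination."""
    return [
        (f"{base_url}/wiki/{quote(title.replace(' ', '_'))}", width, height)
        for title, (width, height) in product(titles, viewports)
    ]


def capture_with_chrome(job: Job) -> bytes:
    """Take a PNG screenshot with headless Chrome (requires selenium)."""
    from selenium import webdriver  # imported lazily so the sketch loads without it

    url, width, height = job
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.set_window_size(width, height)
        driver.get(url)
        return driver.get_screenshot_as_png()
    finally:
        driver.quit()


def capture_all(
    jobs: list[Job], capture: Callable[[Job], bytes], workers: int = 4
) -> dict[Job, bytes]:
    """Run all capture jobs concurrently and map each job to its screenshot."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(jobs, pool.map(capture, jobs)))


if __name__ == "__main__":
    jobs = build_jobs("http://localhost:8080", ["Main Page", "Test"])
    # A stub stands in for capture_with_chrome so this runs without a browser.
    screenshots = capture_all(jobs, lambda job: b"fake-png")
    print(len(screenshots), "screenshots captured")
```

Starting a new browser per job keeps the jobs independent of each other at the cost of start-up time; reusing one driver per worker would be a reasonable optimisation once the page list grows.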

Reducing Noise

One problem often encountered while running visual regression tests at scale is noise. This could be due to many different reasons, the most common of which are (a) low tolerances and (b) the lack of de-duplication of reported visual regressions. (a) occurs when small, almost imperceptible differences, such as those caused by antialiasing or font kerning settings, end up being reported as visual regressions. Setting up tolerances is usually helpful in this case, and using an image comparison library would be a first step towards that. (b) occurs when a single change causes a large number of regressions, sometimes producing hundreds of failures and making it tiresome to sift through them all. Having a mechanism to de-duplicate and group these regressions by possible cause will make it significantly easier for a test runner to understand the report and take action.
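Both ideas can be illustrated with a small Python sketch. It assumes that screenshots are nested lists of RGB tuples with identical dimensions, uses a hypothetical per-channel tolerance of 16, and groups failures by the bounding box of the changed region. That grouping key is only a crude stand-in for "possible cause"; a real setup would more likely rely on an image comparison library and a more robust signature.

```python
from collections import defaultdict
from typing import Optional

Pixel = tuple[int, int, int]
Image = list[list[Pixel]]
Box = tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def pixel_differs(a: Pixel, b: Pixel, channel_tolerance: int = 16) -> bool:
    """Ignore differences that stay within the tolerance on every channel."""
    return any(abs(x - y) > channel_tolerance for x, y in zip(a, b))


def changed_box(
    baseline: Image, test: Image, channel_tolerance: int = 16
) -> Optional[Box]:
    """Return the bounding box of all differing pixels, or None if none differ."""
    xs, ys = [], []
    for y, (b_row, t_row) in enumerate(zip(baseline, test)):
        for x, (b, t) in enumerate(zip(b_row, t_row)):
            if pixel_differs(b, t, channel_tolerance):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def group_failures(boxes: dict[str, Optional[Box]]) -> dict[Box, list[str]]:
    """Group failing pages whose changed region is identical."""
    groups: dict[Box, list[str]] = defaultdict(list)
    for page, box in boxes.items():
        if box is not None:
            groups[box].append(page)
    return dict(groups)


if __name__ == "__main__":
    WHITE, NEAR_WHITE, BLACK = (255, 255, 255), (250, 255, 255), (0, 0, 0)
    baseline = [[WHITE] * 3 for _ in range(2)]
    noisy = [[NEAR_WHITE, WHITE, WHITE], [WHITE, WHITE, WHITE]]
    broken = [[WHITE, WHITE, WHITE], [WHITE, WHITE, BLACK]]
    boxes = {
        "Page_A": changed_box(baseline, broken),
        "Page_B": changed_box(baseline, broken),
        "Page_C": changed_box(baseline, noisy),  # within tolerance, not reported
    }
    for box, pages in group_failures(boxes).items():
        print(box, pages)
```

With grouping in place, a single change that breaks the same region on many pages shows up as one entry with a list of affected pages rather than as many separate failures.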

Hermetic Tests

To ensure that the tests are hermetic, a sufficiently complete environment, including production data and configuration, needs to be spun up for both the test and the baseline instances of MediaWiki and its extensions. Patchdemo might have ideas worth learning from.

Production Data

We need to ensure that a sufficiently large corpus of production data is copied out into the database used for running the regression. This will need to include complete pages as well as the images and other media shown within.

Production Config

To ensure that the production environment is replicated as accurately as possible, LocalSettings.php and other configuration files might also need to be replicated somehow.

Related Objects

Status	Subtype	Assigned	Task
Resolved		• nray	T302246 [GOAL] Leverage Automated Visual Regression Testing
Resolved	Spike	Jdlrobson	T304634 [Spike] Determine a general path of least resistance, work, and effort necessary to test the viability of Visual Regression Testing
Resolved		Jdrewniak	T305563 Deploy visual regression MVP to Wikimedia Cloud VPS
Declined	Spike	None	T305662 [Spike] Can we use the mediawiki-config repo for our Visual Regression test environment?
Declined	Spike	None	T305664 [Spike] Can we use production data for our Visual Regression test environment?
Resolved		ovasileva	T306229 Prevent the edit, history, more tabs from dancing at lower resolutions
Declined		Jdrewniak	T307732 Show Edit, History, and Watch tabs at narrow widths
Resolved		• nray	T306401 Run visual regression tests on every push to Pixel's main branch or pull request
Resolved		• nray	T306405 Add unit tests to visual regression repo
Resolved		• nray	T306404 Revise the installed/enabled extensions that our visual regression tests use
Resolved		Jdlrobson	T306846 Add visual regression tests for the typeahead search component
Declined		• nray	T307936 Add versioning/packaging to visual regression tests
Resolved		• nray	T307940 Remove previous test screenshots folder when running `./pixel.js test`
Resolved		• nray	T306731 Change visual regression test repo ownership to wikimedia
Duplicate		None	T308194 Make visual regression tests run in CI (non-blocking) for the Vector repo

Event Timeline

SCherukuwada created this task. · Feb 21 2022, 8:51 PM

Restricted Application added a subscriber: Aklapper. · Feb 21 2022, 8:51 PM

SCherukuwada added a project: Web-Team-Backlog. · Feb 21 2022, 8:52 PM

Jdlrobson moved this task from Incoming to Needs Prioritization (Tech) on the Web-Team-Backlog board. · Feb 22 2022, 4:16 PM

Jdlrobson edited projects, added Web-Team-Backlog (Needs Prioritization (Tech)); removed Web-Team-Backlog.

Jdlrobson moved this task from Backlog to UI Regression Testing on the Web-Team-Backlog (Needs Prioritization (Tech)) board.

ovasileva subscribed. · Mar 22 2022, 10:53 AM

I've asked @nray to lead this technical initiative. I am hoping that we'll be able to work on it between now and end of May.

Jdlrobson assigned this task to • nray. · Mar 23 2022, 12:02 AM

• nray mentioned this in T304634: [Spike] Determine a general path of least resistance, work, and effort necessary to test the viability of Visual Regression Testing. · Mar 24 2022, 5:47 PM

• nray added a subtask: T304634: [Spike] Determine a general path of least resistance, work, and effort necessary to test the viability of Visual Regression Testing. · Mar 24 2022, 5:50 PM

Jdlrobson updated the task description. · Mar 24 2022, 8:10 PM

Jdlrobson renamed this task from Leverage Automated Visual Regression Testing to [GOAL] Leverage Automated Visual Regression Testing. · Apr 5 2022, 7:13 PM

Jdlrobson edited projects, added Web-Team-Backlog (Kanbanana-FY-2021-22); removed Web-Team-Backlog (Needs Prioritization (Tech)).

Jdlrobson moved this task from Needs Analysis to Quarterly Goals on the Web-Team-Backlog (Kanbanana-FY-2021-22) board.

• nray added a subtask: T305563: Deploy visual regression MVP to Wikimedia Cloud VPS. · Apr 7 2022, 5:23 PM

• nray added a subtask: T305662: [Spike] Can we use the mediawiki-config repo for our Visual Regression test environment? · Apr 7 2022, 10:53 PM

• nray added a subtask: T305664: [Spike] Can we use production data for our Visual Regression test environment? · Apr 7 2022, 11:36 PM

Jdlrobson closed subtask T304634: [Spike] Determine a general path of least resistance, work, and effort necessary to test the viability of Visual Regression Testing as Resolved. · Apr 8 2022, 7:48 PM

• nray added a subtask: T306229: Prevent the edit, history, more tabs from dancing at lower resolutions. · Apr 15 2022, 12:20 AM

• nray closed subtask T305662: [Spike] Can we use the mediawiki-config repo for our Visual Regression test environment? as Declined. · Apr 18 2022, 5:12 PM

• nray closed subtask T305664: [Spike] Can we use production data for our Visual Regression test environment? as Declined.

• nray added a subtask: T306401: Run visual regression tests on every push to Pixel's main branch or pull request. · Apr 18 2022, 10:54 PM

• nray added a subtask: T306405: Add unit tests to visual regression repo. · Apr 18 2022, 11:50 PM

• nray added a subtask: T306404: Revise the installed/enabled extensions that our visual regression tests use. · Apr 18 2022, 11:56 PM

• nray closed subtask T306405: Add unit tests to visual regression repo as Resolved. · Apr 25 2022, 9:20 PM

• nray closed subtask T306401: Run visual regression tests on every push to Pixel's main branch or pull request as Resolved.

• nray changed the status of subtask T306404: Revise the installed/enabled extensions that our visual regression tests use from Open to Stalled. · Apr 25 2022, 9:27 PM

• nray added a subtask: T306846: Add visual regression tests for the typeahead search component. · Apr 25 2022, 10:36 PM

• nray added a subtask: T307936: Add versioning/packaging to visual regression tests. · May 9 2022, 3:53 PM

• nray closed subtask T306404: Revise the installed/enabled extensions that our visual regression tests use as Resolved.

• nray added a subtask: T307940: Remove previous test screenshots folder when running `./pixel.js test`. · May 9 2022, 4:48 PM

• nray added a subtask: T306731: Change visual regression test repo ownership to wikimedia. · May 9 2022, 6:09 PM

Jdlrobson closed subtask T305563: Deploy visual regression MVP to Wikimedia Cloud VPS as Resolved. · May 11 2022, 12:21 AM

Jdlrobson mentioned this in T64633: Jenkins: Set up perceptual diffs (visual regression testing). · May 11 2022, 12:23 AM

Jdlrobson mentioned this in T291525: Consider automated visual regression testing in the new Vue component library. · May 11 2022, 12:27 AM

ovasileva added a project: Web Team Visual Regression Framework. · May 11 2022, 7:43 AM

kostajh subscribed. · May 11 2022, 12:16 PM

• nray added a subtask: T308194: Make visual regression tests run in CI (non-blocking) for the Vector repo. · May 11 2022, 10:12 PM

pwangai subscribed. · May 12 2022, 5:57 PM

• nray closed subtask T307940: Remove previous test screenshots folder when running `./pixel.js test` as Resolved. · May 24 2022, 8:00 PM

zeljkofilipin added a project: User-zeljkofilipin. · May 27 2022, 2:46 PM

zeljkofilipin moved this task from Backlog 🪒 to Q2 👔 on the User-zeljkofilipin board.

zeljkofilipin subscribed.

ovasileva closed subtask T306229: Prevent the edit, history, more tabs from dancing at lower resolutions as Resolved. · May 31 2022, 10:55 AM

Jdlrobson moved this task from Quarterly Goals to Ready for Signoff on the Web-Team-Backlog (Kanbanana-FY-2021-22) board. · May 31 2022, 5:41 PM

We chatted about this goal today and realized we've met the "measurable" section. @nray will create a new goal card capturing remaining work and what we want to achieve.