Jenkins: Set up perceptual diffs (visual regression testing)
Open, Stalled, Lowest, Public
Assigned To

None

Authored By

	Krinkle
	Mar 14 2014, 4:05 AM

Description

This is something I've been experimenting with in my spare time for a while.

The idea is: Render a visual diff for one or more pages and states thereof (comparing the result of the current change to the result of the latest master, or whatever target branch the commit has).

The screenshots would be created using PhantomJS' render() API. Alternatively, while we don't need cross-browser coverage per se, having a browser more representative than PhantomJS would be nice: perhaps Chromium under Xvfb with a large enough window (we don't need it to be very tall), capturing the Xvfb output with ImageMagick's import.
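As a rough sketch of the Chromium-under-Xvfb variant, the helper below only builds the three command lines involved; it does not start anything. The function name, the `chromium` binary name (some distributions call it `chromium-browser`), the display number and the window size are all assumptions for illustration, not something decided in this task.

```python
from typing import List


def capture_commands(url: str, out_png: str, display: str = ":99",
                     width: int = 1280, height: int = 800) -> List[List[str]]:
    """Build (but do not run) the commands for one screenshot.

    Hypothetical helper: binary names and defaults are assumptions.
    """
    return [
        # Virtual X server with a single 24-bit screen of the requested size.
        ["Xvfb", display, "-screen", "0", f"{width}x{height}x24"],
        # Browser window filling that screen; it has to be started with
        # DISPLAY set to the same display as Xvfb.
        ["chromium", f"--window-size={width},{height}",
         "--window-position=0,0", url],
        # ImageMagick's import grabs the whole root window of the display.
        ["import", "-display", display, "-window", "root", out_png],
    ]


for cmd in capture_commands("http://localhost/index.php?title=Main_Page",
                            "main-page.png"):
    print(" ".join(cmd))
```

A real job would also need to wait for the page to finish loading before running the import step, which this sketch leaves out.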

Rough idea for the Jenkins job:

Run project setup (e.g. build script for projects like OOjs UI and VisualEditor; installing MediaWiki for core/extensions). Then expose the workspace to the local web server (we've got re-usable macros for this already).

Run the scenarios or urls for the current project and capture the screen after each scenario.

Compare them against the ones from the last run (e.g. for a commit to master, compare them to the latest master build). TODO: Will need to be stored somewhere. Shared NFS maybe? store/{project}/{branch}.
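The store/{project}/{branch} layout could be computed along these lines. This is a minimal sketch: the `/srv/store` root and the choice to flatten slashes in names such as `mediawiki/core` are assumptions, since where the store lives is still a TODO.

```python
from pathlib import PurePosixPath


def store_dir(root: str, project: str, branch: str) -> PurePosixPath:
    """Return the baseline directory for a project/branch pair.

    Slashes are flattened to dashes so that each project and branch maps
    to exactly one directory level (an assumption, not a decided layout).
    """
    safe_project = project.replace("/", "-")
    safe_branch = branch.replace("/", "-")
    return PurePosixPath(root) / safe_project / safe_branch


print(store_dir("/srv/store", "mediawiki/core", "master"))
```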

In test pipeline:
- If different, make sure the latest.png/change-after.png/change-diff.png for that url is kept and stored as build artefacts in Jenkins. Otherwise delete the image.
In the post-merge pipeline:
- Replace the images in the store with those of this build.
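The two pipelines above might look roughly like this. It is only a sketch under assumptions: the function names are made up, a byte-for-byte file comparison stands in for the real perceptual diff, and only the changed screenshot itself is kept (not the separate latest/after/diff images).

```python
import filecmp
import shutil
from pathlib import Path
from typing import List


def check_change(store: Path, build: Path, artifacts: Path) -> List[str]:
    """Test pipeline: keep screenshots that differ from the store, delete the rest."""
    changed = []
    for shot in sorted(build.glob("*.png")):
        baseline = store / shot.name
        # Byte comparison is a placeholder for a perceptual comparison.
        if baseline.exists() and filecmp.cmp(baseline, shot, shallow=False):
            shot.unlink()  # unchanged: nothing worth archiving
            continue
        artifacts.mkdir(parents=True, exist_ok=True)
        shutil.copy2(shot, artifacts / shot.name)  # changed or new: keep it
        changed.append(shot.name)
    return changed


def post_merge(store: Path, build: Path) -> None:
    """Post-merge pipeline: replace the stored images with this build's."""
    store.mkdir(parents=True, exist_ok=True)
    for shot in build.glob("*.png"):
        shutil.copy2(shot, store / shot.name)
```

A screenshot with no stored baseline is treated as changed here, so new scenarios show up in the build artefacts the first time they run.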

Scenarios:

I imagine we'll need to support two kinds of scenarios:

Plain url.
Web driver steps (for large interfaces not accessible by url). This should *not* be used to trigger every possible dialog and component; that slows the test matrix and only adds tests for no reason. More useful would be to capture individual components via e.g. the OOjs UI demo page. Use these to assert the composition instead.

A few urls we might want:

mediawiki-core:
- /index.php?title=Main_Page
- /index.php?title=Main_Page&useskin=monobook
- /index.php?title=Main_Page&action=edit
- /index.php?title=Main_Page&action=history
- /index.php?title=Special:UserLogin
- /index.php?title=Special:UserLogin/signup
- /index.php?title=Special:Search&search=wiki
VisualEditor:
- /demos/ve/#!/src/pages/empty.html
- /demos/ve/#!/src/pages/simple.html
- /demos/ve/#!/src/pages/complex.html
oojs-ui:
- /demos/icons.html
- /demos/widgets.html
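For the plain-url kind of scenario, the list above could live in a small per-project table. The sketch below is one possible shape; the `SCENARIOS` name, the helper and the base url in the example are hypothetical, and only a few of the paths are repeated.

```python
from typing import Dict, List

# Paths copied from the list above; the structure itself is an assumption.
SCENARIOS: Dict[str, List[str]] = {
    "mediawiki-core": [
        "/index.php?title=Main_Page",
        "/index.php?title=Main_Page&useskin=monobook",
        "/index.php?title=Special:UserLogin",
    ],
    "VisualEditor": [
        "/demos/ve/#!/src/pages/empty.html",
    ],
    "oojs-ui": [
        "/demos/icons.html",
        "/demos/widgets.html",
    ],
}


def scenario_urls(project: str, base_url: str) -> List[str]:
    """Prefix each scenario path with the web server root for this build."""
    root = base_url.rstrip("/")
    return [root + path for path in SCENARIOS[project]]


for url in scenario_urls("oojs-ui", "http://localhost:9412/"):
    print(url)
```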

A few implementations that exist:

https://github.com/uber/image-diff/
https://github.com/bslatkin/dpxdt/
Talk (Velocity 2013): https://www.youtube.com/watch?v=1wHr-O6gEfc
Talk (Google Developers): https://www.youtube.com/watch?v=UMnZiTL0tUc

Behind these is basically just an ImageMagick compare command between two PNGs.

```
compare \
      -verbose \
      -metric RMSE \
      -highlight-color RED \
      -compose Src \
      mytest-latest.png \
      mytest-build.png \
      mytest-diff.png
```
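A job would probably wrap that command and read the metric back. The sketch below assumes that, without -verbose, compare prints the RMSE to stderr in the form `1769.16 (0.0269957)` and exits with 2 on error; both should be checked against the installed ImageMagick version. The function names are made up for illustration.

```python
import re
import subprocess
from typing import List, Tuple

# Assumed stderr shape: "<absolute> (<normalized>)".
_RMSE = re.compile(r"([0-9.eE+-]+)\s+\(([0-9.eE+-]+)\)")


def compare_command(latest: str, build: str, diff: str) -> List[str]:
    """Same options as above, minus -verbose so the output stays one line."""
    return ["compare", "-metric", "RMSE", "-highlight-color", "RED",
            "-compose", "Src", latest, build, diff]


def parse_rmse(stderr_text: str) -> Tuple[float, float]:
    """Extract (absolute, normalized) RMSE from compare's stderr."""
    match = _RMSE.search(stderr_text)
    if match is None:
        raise ValueError(f"unexpected compare output: {stderr_text!r}")
    return float(match.group(1)), float(match.group(2))


def images_differ(latest: str, build: str, diff: str) -> bool:
    """Run compare and report whether the two PNGs differ at all."""
    proc = subprocess.run(compare_command(latest, build, diff),
                          capture_output=True, text=True)
    if proc.returncode == 2:
        raise RuntimeError(proc.stderr.strip())
    _, normalized = parse_rmse(proc.stderr)
    return normalized > 0.0
```

Treating any non-zero RMSE as a change is the strictest choice; a small threshold may be needed if rendering turns out to be noisy.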

Details

Reference: bz62633

Related Objects

Status	Assigned	Task
Declined	None	T101542 [EPIC] Provide pre-merge reports on patchsets (tracking)
Stalled	None	T64633 Jenkins: Set up perceptual diffs (visual regression testing)
Declined	None	T101545 Provide infrastructure to store files by project/branch post-merge to compare with pre-merge
Declined	None	T114998 Provide Swift object store(s) for the labs projects
Declined	zeljkofilipin	T90884 Investigate using the sikuli-like Applitools framework for visual testing
Resolved	ssastry	T110715 Visual diff testing should not use parsoid-lb.eqiad

Event Timeline

• bzimport raised the priority of this task from to Medium. Nov 22 2014, 3:09 AM

• bzimport added a project: Continuous-Integration-Infrastructure.

• bzimport set Reference to bz62633.

• bzimport added a subscriber: Unknown Object (MLST).

Krinkle created this task. Mar 14 2014, 4:05 AM

The job would be non-voting of course, and we'd change the jenkins-bot comment to Gerrit to CHANGED/UNCHANGED instead of SUCCESS/FAILURE.
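The CHANGED/UNCHANGED wording could be derived from the list of differing screenshots, for example as below. Only the two status words come from the comment above; the function name and the bullet list of file names are assumptions.

```python
from typing import List


def gerrit_comment(changed: List[str]) -> str:
    """Build the non-voting status line, listing changed screenshots if any."""
    if not changed:
        return "UNCHANGED"
    lines = ["CHANGED"] + [f"- {name}" for name in sorted(changed)]
    return "\n".join(lines)


print(gerrit_comment(["widgets.png", "icons.png"]))
```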

I assume that it's easier to roll our own than to re-use wraith or something similar, due to trying to integrate it into jenkins? Multi-platform screenshot regression testing is probably a secondary-level target, but…

jgonera wrote:

Mentioned by Subbu on IRC: an example of what visual diffs can bring us: http://diplograph.net/posts/visual_diffs

See https://github.com/subbuss/parsoid_visual_diffs for a version that is being used to compare Parsoid and PHP parser HTML output. That works quite well and has already exposed a few css issues and other non-css rendering/html diffs.

Something similar could perhaps be adapted for this purpose as well?

Lowering priority from high to normal since apparently nobody is actively pushing for this change. Whenever the feature teams figure out a good utility or way to produce such visual diffs, we can work on integrating it into Jenkins/Zuul.

Krinkle lowered the priority of this task from Medium to Lowest. Jan 8 2015, 1:26 PM

Krinkle set Security to None.

Krinkle removed a subscriber: Unknown Object (MLST).

Krinkle updated the task description. Mar 6 2015, 7:14 PM

Krinkle updated the task description. Mar 6 2015, 7:26 PM

Krinkle moved this task from Untriaged to Backlog on the Continuous-Integration-Infrastructure board. Apr 10 2015, 12:54 PM

greg mentioned this in T101154: Evaluate the value of 'Visual Diff' tools for CI / Deployment smoke testing. Jun 2 2015, 8:50 PM

• mmodell merged a task: T101154: Evaluate the value of 'Visual Diff' tools for CI / Deployment smoke testing. Jun 2 2015, 9:35 PM

• mmodell added subscribers: Aklapper, • mmodell.

Jdforrester-WMF added a subtask: T101545: Provide infrastructure to store files by project/branch post-merge to compare with pre-merge. Jun 8 2015, 10:56 PM

Jdforrester-WMF added a parent task: T101542: [EPIC] Provide pre-merge reports on patchsets (tracking).

hashar added a subtask: T90884: Investigate using the sikuli-like Applitools framework for visual testing. Jun 9 2015, 12:48 PM