Run browser tests in parallel
Open, Needs Triage, Public

Description

(Apologies if this is a duplicate, I know there's been some discussion already but couldn't find it in Phabricator.)

Browser tests are slow, and there are a few opportunities for parallelization. The easiest step would be to run our QUnit and node-selenium tests in separate threads. After T199116 we could also run selenium tests for each repo in parallel, but this introduces the additional challenge of potential interaction between tests.

However, there are some questions to work through first:

  • Can our test webserver handle multiple clients? (See also T225218.)
  • Is it reasonable to run multiple chromedrivers on one machine? How much more memory will this require?
  • How many of our tests will become fragile, for example because they depend on a constant rather than randomly-generated title?

Event Timeline

Restricted Application added a subscriber: Aklapper. (Jun 28 2019, 8:58 PM)

Is it reasonable to run multiple chromedrivers on one machine? How much more memory will this require?

I didn't test this, but I'm pretty sure we could reuse one chromedriver for all tests.

How many of our tests will become fragile, for example because they depend on a constant rather than randomly-generated title?

Every test should make sure its dependencies (users, pages...) are randomly generated and not reused.
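As a minimal sketch of that idea, a test could derive each title or user name from random bytes, so that two tests running at the same time never share a dependency. The helper name `uniqueName` and the prefixes are hypothetical and not taken from any existing test library.

```javascript
'use strict';
const nodeCrypto = require( 'crypto' );

// Hypothetical helper: returns a name that differs on every call, so that
// concurrently running tests never share a page title or user name.
function uniqueName( prefix ) {
	// 8 random bytes are rendered as 16 hex characters.
	return `${ prefix }-${ nodeCrypto.randomBytes( 8 ).toString( 'hex' ) }`;
}

const pageTitle = uniqueName( 'TestPage' );
const username = uniqueName( 'TestUser' );
console.log( pageTitle, username );
```

Calling the helper inside each test (or in a beforeEach hook) rather than once per file keeps retried tests from reusing a name as well.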

WebdriverIO comes with built-in support for running spec files in parallel, up to a configured concurrency. This is tuned with the wdio.conf.js maxInstances setting, which we currently have set to 1. Considering all the things that can go wrong when introducing concurrency here, we should expose the tuning knob to allow for experimentation. For example, we check an environment variable which can be poked through from the Jenkins job.

Meanwhile, I'll experiment with concurrent browser tests in a few repos, to see what obstacles we might run up against.
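The environment-variable knob mentioned above might look roughly like this. It is only a sketch: the variable name `WDIO_MAX_INSTANCES` and the helper `resolveMaxInstances` are assumptions, and a real wdio.conf.js would assign the object to `exports.config` instead of a local constant.

```javascript
'use strict';

// Hypothetical variable name; the thread does not say which one to use.
const ENV_VAR = 'WDIO_MAX_INSTANCES';

// Parse the concurrency limit from the environment, falling back to the
// current default of 1 when the variable is unset or not a positive integer.
function resolveMaxInstances( env ) {
	const parsed = Number.parseInt( env[ ENV_VAR ], 10 );
	return Number.isInteger( parsed ) && parsed > 0 ? parsed : 1;
}

// Fragment of a wdio.conf.js-style configuration object.
const config = {
	maxInstances: resolveMaxInstances( process.env )
};

console.log( config.maxInstances );
```

Falling back to 1 keeps today's serial behaviour unless a job opts in explicitly.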

Change 545650 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Wikibase@master] [DNM] Experiment with concurrent browser tests

https://gerrit.wikimedia.org/r/545650

Change 545653 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/core@master] [DNM] Experiment with concurrent browser tests

https://gerrit.wikimedia.org/r/545653

This came surprisingly close: only a handful of test cases failed, and we might be able to code around the fragility. For example:

22:11:22 1) WikibaseReferenceOnProtectedPage can expand collapsed references on a protected page as unprivileged user:
22:11:22 cantedit: You cannot change the protection levels of this page because you do not have permission to edit it.

... probably due to test cases presuming that side-effects from previous tests will have taken place.

There were no signs of memory exhaustion.

The equivalent patch for mediawiki-core passes :-)

The bigger gain will come when parallelizing at the next level up, however: running suites in parallel.

We should be able to use a single backend DevWebServer, and a single Chrome driver. But since we've isolated each repo's tests in order to decouple dependencies, we can't use WebdriverIO's built-in concurrency. I'll play with Quibble parallelism, executing some configurable maximum number of suites at once.
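To illustrate "some configurable maximum number of suites at once", here is a generic bounded-concurrency sketch. Quibble itself is a Python tool, so this is not its implementation; `runWithLimit` and the stand-in suites are hypothetical and only show the scheduling idea.

```javascript
'use strict';

// Run async tasks with at most `limit` in flight at once.
// `tasks` is an array of functions that each return a promise.
async function runWithLimit( tasks, limit ) {
	const results = new Array( tasks.length );
	let next = 0;

	// Each worker claims the next unstarted task until none remain.
	async function worker() {
		while ( next < tasks.length ) {
			const index = next++;
			results[ index ] = await tasks[ index ]();
		}
	}

	const workerCount = Math.min( limit, tasks.length );
	await Promise.all( Array.from( { length: workerCount }, () => worker() ) );
	return results;
}

// Usage: three stand-in "suites", at most two running at a time.
const suites = [ 'core', 'Echo', 'Wikibase' ].map(
	( name ) => () => new Promise(
		( resolve ) => setTimeout( () => resolve( `${ name }: done` ), 10 )
	)
);
runWithLimit( suites, 2 ).then( ( results ) => console.log( results ) );
```

Results come back in the order the suites were listed, regardless of which one finishes first.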

Change 545661 had a related patch set uploaded (by Awight; owner: Awight):
[integration/quibble@master] [WIP] Run browser suites in parallel

https://gerrit.wikimedia.org/r/545661

Change 545661 abandoned by Awight:
Prepare to run browser suites in parallel

Reason:
Please see Ib2dc728980ce95 instead.

https://gerrit.wikimedia.org/r/545661

Change 545653 abandoned by Thiemo Kreuz (WMDE):

[mediawiki/core@master] [DNM] Experiment with concurrent browser tests

Reason:

This is in conflict right now. It's a very trivial change that can be redone any time, if needed.

https://gerrit.wikimedia.org/r/545653

Change 545650 abandoned by Thiemo Kreuz (WMDE):

[mediawiki/extensions/Wikibase@master] [DNM] Experiment with concurrent browser tests

Reason:

This is in conflict right now. It's a very trivial change that can be redone any time, if needed.

https://gerrit.wikimedia.org/r/545650

Change 738061 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/quibble@master] Browser tests: run npm install in parallel

https://gerrit.wikimedia.org/r/738061

Change 738061 merged by jenkins-bot:

[integration/quibble@master] BrowserTests: Option to run npm install in parallel

https://gerrit.wikimedia.org/r/738061

Change 751761 had a related patch set uploaded (by Hashar; author: Hashar):

[mediawiki/core@master] selenium: run 4 tests in parallel

https://gerrit.wikimedia.org/r/751761

Change 545650 restored by Hashar:

[mediawiki/extensions/Wikibase@master] [DNM] Experiment with concurrent browser tests

https://gerrit.wikimedia.org/r/545650

Change 751761 abandoned by Hashar:

[mediawiki/core@master] selenium: run 4 tests in parallel

Reason:

I will restore https://gerrit.wikimedia.org/r/c/mediawiki/core/+/545653

https://gerrit.wikimedia.org/r/751761

Change 545653 restored by Hashar:

[mediawiki/core@master] [DNM] Experiment with concurrent browser tests

https://gerrit.wikimedia.org/r/545653

Change 751767 had a related patch set uploaded (by Hashar; author: Hashar):

[mediawiki/skins/MinervaNeue@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/751767

As of this morning, CI exposes MediaWiki through Apache, which allows concurrent requests (T285649), so we should be able to get webdriver.io to run tests in parallel. I have restored and rebased some patches made by @awight and added one for MinervaNeue, which has more than a few tests.

Change 747904 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/quibble@master] BrowserTests: Support splitting projects into two groups

https://gerrit.wikimedia.org/r/747904

Change 751767 merged by jenkins-bot:

[mediawiki/skins/MinervaNeue@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/751767

@kostajh If I understand correctly, this last patch would split at the job/quibble level, and thus not share a common db or web server. That's fine, I think, but that approach wouldn't have been blocked on using Apache I think. Can you confirm that?

Are we (also) considering letting wdio run Chromium sessions concurrently? I imagine that may be tricky to some extent, as tests could interfere with each other on the same server/db, especially when they involve a shared concept like the RC feed, user prefs, or page titles. But maybe we have alternate strategies for tests that rely on that, e.g. mandating new titles/users for everything, always, and for the RC feed perhaps checking lower items and not just the latest one, hoping there aren't too many. This is a somewhat different direction from using more stable/deterministic titles in tests together with setup/teardown scripts, which is also in our backlog. That could potentially be addressed differently if we departed from testing an existing test/dev wiki and instead created an ad-hoc db + LocalSettings within the test runner, and/or formalised what we do with Quibble in a lighter and more reusable way and discouraged local bare running. Food for thought :-)
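The "check lower items" idea for the RC feed could be sketched as follows. The helper `appearsInRecentChanges`, the window size, and the sample titles are all hypothetical; the point is only that the assertion looks through the first few entries instead of requiring the expected title to be first.

```javascript
'use strict';

// Returns true if `expected` appears among the first `windowSize` titles.
// `titles` is assumed to hold recent-changes titles, newest first.
function appearsInRecentChanges( titles, expected, windowSize = 10 ) {
	return titles.slice( 0, windowSize ).includes( expected );
}

// A concurrent test created a page after ours, so our title is no longer first.
const feed = [ 'OtherTest-page-1', 'RecentChanges-test-1', 'Main Page' ];
console.log( feed[ 0 ] === 'RecentChanges-test-1' ); // false
console.log( appearsInRecentChanges( feed, 'RecentChanges-test-1' ) ); // true
```

The window has to be large enough for the expected number of concurrent edits, which is the "hope there's not too many" caveat.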

@kostajh If I understand correctly, this last patch would split at the job/quibble level, and thus not share a common db or web server. That's fine, I think, but that approach wouldn't have been blocked on using Apache I think. Can you confirm that?

Yes, that's right: https://gerrit.wikimedia.org/r/747904 is not dependent on Apache. Maybe it should be tagged with a different task to clear up confusion. There are a few complementary approaches being proposed here:

  • Within the existing Selenium job (e.g. https://integration.wikimedia.org/ci/job/wmf-quibble-selenium-php72-docker/129061/console), run core + skin + extension Selenium tests in parallel, using Quibble's ParallelCommand utility. If this "just worked", that would be nice, but I suspect it won't be easy for any number of reasons: there's a single chromedriver instance, tests messing up state from other tests, etc.
  • Within the existing Selenium job, for each core/skin/extension test suite, parallelize the execution of tests in the suite (i.e. each it() statement). That will also surface some problems where test methods interfere with one another when querying the RC feed, prefs, page titles, etc., as you noted. It also probably doesn't bring a big performance improvement, but it would be nice if it did :)
  • Create two new Selenium jobs, group A and group B, where e.g. group A runs AbuseFilter -> GrowthExperiments tests, and group B runs MobileFrontend -> Wikibase. That has the downside of requiring more VM resources to run an additional job, and some work is done twice (Zuul clone, npm install), but it has the benefit of cutting the overall time spent on Selenium for any given patch by several minutes. In theory we could have a job that does the initial Zuul clone and npm install, which the group A and group B jobs then both reuse to save more time.
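For the third approach, an alphabetical split into two groups might be sketched like this. It is an illustration only, not the logic of the Quibble patch: `splitIntoTwoGroups` is a hypothetical helper, and the project list is just the four names mentioned above.

```javascript
'use strict';

// Sort project names alphabetically (case-insensitive) and cut the list in
// half. Group A receives the extra project when the count is odd.
function splitIntoTwoGroups( projects ) {
	const sorted = [ ...projects ].sort(
		( a, b ) => a.toLowerCase().localeCompare( b.toLowerCase() )
	);
	const middle = Math.ceil( sorted.length / 2 );
	return { groupA: sorted.slice( 0, middle ), groupB: sorted.slice( middle ) };
}

const { groupA, groupB } = splitIntoTwoGroups(
	[ 'Wikibase', 'AbuseFilter', 'MobileFrontend', 'GrowthExperiments' ]
);
console.log( groupA ); // [ 'AbuseFilter', 'GrowthExperiments' ]
console.log( groupB ); // [ 'MobileFrontend', 'Wikibase' ]
```

Splitting by count ignores how long each project's suite takes; balancing by measured duration would be a possible refinement.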

FYI:

Change by Awight merged:

[mediawiki/core] selenium: run 4 tests in parallel

https://gerrit.wikimedia.org/r/545653

Change 751940 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/AbuseFilter@master] selenium: Run tests in each suite concurrently

https://gerrit.wikimedia.org/r/751940

Change 751942 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/WikibaseLexeme@master] selenium: Run tests in each suite concurrently

https://gerrit.wikimedia.org/r/751942

Change 751957 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/GrowthExperiments@master] selenium: Run test suites concurrently

https://gerrit.wikimedia.org/r/751957

Change 751959 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/core@master] selenium: Run test suites concurrently

https://gerrit.wikimedia.org/r/751959

Change 751961 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/core@master] api-testing: Run tests in parallel

https://gerrit.wikimedia.org/r/751961

Change 751940 merged by jenkins-bot:

[mediawiki/extensions/AbuseFilter@master] selenium: Run test suites concurrently

https://gerrit.wikimedia.org/r/751940

Change 751959 abandoned by Kosta Harlan:

[mediawiki/core@master] selenium: Run test suites concurrently

Reason:

https://gerrit.wikimedia.org/r/751959

Change 754524 had a related patch set uploaded (by Hashar; author: Hashar):

[integration/quibble@master] Release Quibble 1.3.0

https://gerrit.wikimedia.org/r/754524

Change 754524 merged by jenkins-bot:

[integration/quibble@master] Release Quibble 1.3.0

https://gerrit.wikimedia.org/r/754524

Change 768354 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/quibble@master] [WIP] Run Selenium tests for each project in parallel

https://gerrit.wikimedia.org/r/768354

This is not a million miles off. Some observations:

  1. The Apache config needs tuning, as the "Server unavailable" error shows up a couple of times:

image.png (360×1 px, 103 KB)

  2. Some tests fail on first attempt due to interference of other tests (or due to the 503s noted above), but then they succeed due to the automatic retries in wdio. Not ideal, but it mostly seems to work out.
  3. Some tests need to be reworked slightly to be more robust. E.g. one of the Echo tests assumes the account isn't logged-in, but it is.
  4. Some tests simulate blocking a user and then other tests which use the same user fail, because the user is blocked (https://integration.wikimedia.org/ci/job/integration-quibble-fullrun-extensions/75/artifact/log/Lexeme%253AForms-can-edit-statements-on-a-new-Form-2022-03-06T21-57-01-255Z.mp4).

@zeljkofilipin I was wondering if we could consider using a single wdio.conf.js in mediawiki core that each extension references. That way, we'd get WDIO's parallelization for free, as it already supports concurrent test running, if it knows about all the tests that need to be run.

Core's wdio.conf.js file would need to be adjusted to look in extensions/skins:

	specs: [
		'./tests/selenium/specs/**/*.js',
		'./extensions/**/tests/selenium/specs/**/*.js'
	],

and each extension/skin would have to migrate the contents of its wdio.conf.js into before/beforeEach hooks in its spec files.

Having a single configuration file for MW core + extensions/skins seems more akin to how we do things for PHPUnit.

Note that if we do this, some Quibble changes would be necessary (cc @hashar) because Quibble currently assumes a wdio specific config for each extension/skin that contains a list of tests to run for just that extension/skin. And that's how Quibble knows, for example, to first run tests for the extension/skin under test. Although, if we are running everything together in parallel, I guess it's not so important anymore to run the extension/skin browser tests ahead of the other ones.

Although PHPUnit integration tests and QUnit are centrally configured in mediawiki/core, webdriver.io tests are split across repositories. That is the same model we use for the linters (which are run via composer test and npm test), and it lets us use different versions of webdriver.io. It is too challenging, if not impossible, to force-migrate all repositories at the same time.

From the list of attached patches, there are a few repositories for which tests are broken when run concurrently (see the patches associated with this task: bug:T226869 is:open). I imagine if we ran tests from different extensions in parallel we would encounter even more issues.

@zeljkofilipin I was wondering if we could consider using a single wdio.conf.js in mediawiki core that each extension references.

Sorry, looks like I didn't reply to your question. 🤦‍♂️

As far as I remember, that's how we started with the webdriver tests. All configuration and dependencies were in core. That simplified some things (like having to update configuration in just one place, core) but vastly complicated some other things (like making a breaking change in core causing a lot of tests in a lot of repositories to fail). I forgot all the details, but I think the consensus was that having to do more work (like updating configuration/dependencies in each repository) was much better long-term than having everything centralized in core.

@hashar and @Krinkle might remember more details.

Change 751957 abandoned by Kosta Harlan:

[mediawiki/extensions/GrowthExperiments@master] selenium: Run test suites concurrently

Reason:

This is not going to be easy to implement, and not worthwhile at the moment given the relatively limited number of tests.

https://gerrit.wikimedia.org/r/751957

Change 934329 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/VisualEditor@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/934329

Change 934331 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/Echo@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/934331

Change 934333 had a related patch set uploaded (by WMDE-Fisch; author: WMDE-Fisch):

[mediawiki/extensions/CheckUser@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/934333

Change 934333 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/934333

Change 934331 merged by jenkins-bot:

[mediawiki/extensions/Echo@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/934331

Change 934329 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/934329

Change 935162 had a related patch set uploaded (by Dreamy Jazz; author: WMDE-Fisch):

[mediawiki/extensions/CheckUser@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/935162

While crafting a small change to a Selenium test (https://gerrit.wikimedia.org/r/c/mediawiki/core/+/842391), specs/recentchanges.js failed. It grabs the list of titles from recent changes, picks the first title, and fails because another test created some other page, so the expected title is no longer the first entry. When looking at the captured output, I see the browser window creating pages with the title prefix BeforeEach-name-; however, that does not come from specs/recentchanges.js but from specs/page.js!

In MediaWiki core I have set maxInstances: 4. What I suspect is that all four browsers share the same X display, and the video capture for a given test ends up containing frames from another test running concurrently! I have no idea how to fix that though :)

Edit: I have filed the above as T344754: Browser tests video capture is shared between tests

Change 935162 merged by jenkins-bot:

[mediawiki/extensions/CheckUser@master] selenium: run tests concurrently

https://gerrit.wikimedia.org/r/935162