Page MenuHomePhabricator

Selenium and PHPUnit: Stop execution on failure
Open, NormalPublic

Description

Follow up from T225248, this is a proposal to use PHPUnit's --stop-on-failure and Mocha's bail configuration option to end PHPUnit and Selenium test runs early if there's a failure.

The advantages are:

  • reduced CI resources as we don't need to wait for a 20-30 minute build if we know it's going to fail anyway
  • a little less impact of flaky tests as "recheck" can be started sooner

Event Timeline

kostajh created this task.Jun 15 2019, 7:16 PM
Restricted Application added a subscriber: Aklapper. · View Herald TranscriptJun 15 2019, 7:16 PM

Change 517188 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[integration/quibble@master] Stop PHPUnit execution on failure

https://gerrit.wikimedia.org/r/517188

Change 517189 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/core@master] Stop Selenium execution after first test failure

https://gerrit.wikimedia.org/r/517189

Change 517191 had a related patch set uploaded (by Kosta Harlan; owner: Kosta Harlan):
[mediawiki/core@master] Stop PHPUnit execution after first failure

https://gerrit.wikimedia.org/r/517191

Here's an example of Selenium failing early (note I think we'd need to update wdio.conf.js for extensions too?)

Main test build failed.

   wmf-quibble-core-vendor-mysql-hhvm-docker https://integration.wikimedia.org/ci/job/wmf-quibble-core-vendor-mysql-hhvm-docker/17066/console : FAILURE in 6m 33s
   mwgate-node10-docker https://integration.wikimedia.org/ci/job/mwgate-node10-docker/4436/console : SUCCESS in 53s
   mediawiki-quibble-composer-mysql-php70-docker https://integration.wikimedia.org/ci/job/mediawiki-quibble-composer-mysql-php70-docker/20833/console : FAILURE in 3m 24s
   mediawiki-quibble-vendor-mysql-php70-docker https://integration.wikimedia.org/ci/job/mediawiki-quibble-vendor-mysql-php70-docker/21309/console : FAILURE in 3m 20s
   mediawiki-quibble-composertest-php70-docker https://integration.wikimedia.org/ci/job/mediawiki-quibble-composertest-php70-docker/20782/console : SUCCESS in 1m 54s
   mediawiki-core-jsduck-docker https://integration.wikimedia.org/ci/job/mediawiki-core-jsduck-docker/11036/console : SUCCESS in 18s
   mediawiki-core-php70-phan-docker https://integration.wikimedia.org/ci/job/mediawiki-core-php70-phan-docker/31749/console : SUCCESS in 1m 31s
MaxSem added a subscriber: MaxSem.Jun 16 2019, 5:21 AM

I would actually object to this: imagine your change has caused multiple test failures that you weren't able to predict in your dev environment (because you didn't have all extensions installed or your environment is otherwise different from our CI). You'll have to amend your PR with one fix at a time and push it just to see what explodes next.

I would actually object to this: imagine your change has caused multiple test failures that you weren't able to predict in your dev environment (because you didn't have all extensions installed or your environment is otherwise different from our CI). You'll have to amend your PR with one fix at a time and push it just to see what explodes next.

For this use case, what would you think about a "check all" command which forces a full run of all suites, gated extensions, browser tests, etc., not stopping on failure? We have to optimize for one or the other case, it seems.

Another variation would be that we drop the running job's priority after the first failure, but still let it continue to run. That might be difficult technically and not save us many resources, however.

I would actually object to this: imagine your change has caused multiple test failures that you weren't able to predict in your dev environment (because you didn't have all extensions installed or your environment is otherwise different from our CI).

I don’t doubt that this could be a problem and that it would be really annoying, but I’m wondering if you could share a patch or two with this kind of situation so we could look at a specific example?

And yeah I like the idea of a “check all” command to bypass the stop on failure.

Relatedly, IMO the stop on failure work would be much more worthwhile if a failure in one job in the test pipeline could immediately halt all the other jobs too.

From the point of view that has been commited to the patchs, this will limit the ability to run all tests in my development environment. Is it only possible to change the WMF CI configuration to do this?

Tgr added a subscriber: Tgr.Jun 16 2019, 6:46 PM

I would actually object to this: imagine your change has caused multiple test failures that you weren't able to predict in your dev environment (because you didn't have all extensions installed or your environment is otherwise different from our CI). You'll have to amend your PR with one fix at a time and push it just to see what explodes next.

Yeah, this would make fixing multiple issues very annoying and time-consuming, and would make the life of new developers especially hard (so far we didn't force them set up the tests locally just to have a non-aggravating experience). I don't think it's worth it.

Ideally, the tests would report failure as soon as they find it but continue to run to the end. I doubt that's realistic with Jenkins though. (This post describes a possible way to do it via the cacheResult option, but it would require a much more recent PHPUnit version, and even so it seems very hacky.)

One thing that might be worthwhile is starting with "local" test (core tests in case of core, the extensions' own tests in case of an extension) and only running tests belonging to other repos if the local tests succeed. Since it's almost always the local tests that fail, that would speed things up a lot without forcing too many re-runs on the developer.

Change 517191 abandoned by Kosta Harlan:
Stop PHPUnit execution after first failure

Reason:
Let's handle this in Quibble instead.

https://gerrit.wikimedia.org/r/517191

Change 517189 abandoned by Kosta Harlan:
Stop Selenium execution after first test failure

Reason:
As Awight noted, should be done in Quibble not here.

https://gerrit.wikimedia.org/r/517189

Change 517188 abandoned by Kosta Harlan:
Stop PHPUnit execution on failure

Reason:
We would need consensus on this before pursuing it further.

https://gerrit.wikimedia.org/r/517188