Auto retry failed browser tests to reduce false negatives
Closed, Duplicate
Public

Assigned To

None

Authored By

	• Cmcmahon
	May 26 2014, 3:03 PM

Description

Many browser tests fail for reasons other than an issue with the code or the test environment: a flaky timeout, a momentary network glitch, and so on.

It would be convenient if the command that calls Cucumber to run the tests had the ability to:

track which test is being run
evaluate the return status of each test, either per Feature or possibly even per Scenario
if the return value for an individual test is non-zero, do not record the test result; instead, re-run only that single failed test
if the second run of that test also returns a non-zero result, record that result as usual
if the test passes on the second run, record that result and continue running the tests in the build
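
The steps above could be sketched roughly as follows. This is only an illustration, not the harness mentioned below: the method name `run_with_single_retry` and the `cucumber_runner` lambda are hypothetical, and the sketch assumes each test can be identified by a feature file path and run on its own.

```ruby
# Hypothetical helper: runs each test once and re-runs it a single time on failure.
# `runner` is any callable that takes a test identifier and returns an exit
# status (0 for pass, non-zero for failure).
def run_with_single_retry(tests, runner)
  results = {}
  tests.each do |test|
    status = runner.call(test)
    # A first failure is not recorded; only the failed test is run again.
    status = runner.call(test) unless status.zero?
    # Record whatever the last run returned, then continue with the build.
    results[test] = status
  end
  results
end

# Example runner (assumption): shell out to Cucumber for one feature file.
# Kernel#system returns true only when the command exits with status 0.
cucumber_runner = lambda do |feature|
  system('bundle', 'exec', 'cucumber', feature) ? 0 : 1
end
```

A build script could then call `run_with_single_retry(feature_files, cucumber_runner)` and fail the build only if any recorded status is non-zero.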

Some years ago I wrote a harness like this in Ruby. If it is not convenient to do this directly in a shell script, it might be possible to use Ruby and shell out for the appropriate commands.

A low-hanging fruit was T98968: retry the Cucumber test on Net::ReadTimeout.

Details

Reference: bz65773

	Subject	Repo	Branch	Lines +/-
	Rerun failed browsertests once	integration/config	master	+21 -10

Related Objects

Mentioned In: T225248: Consider moving browser based tests (Selenium and QUnit) to a non-voting pipeline
T74722: Jenkins tests shouldn't go red when it's not its fault
T94212: Accommodate flaky tests flapping
Mentioned Here: T152963: Increase in failures caused by Saucelabs
T98968: Net::ReadTimeout shouldn't mark a test as failed
T94212: Accommodate flaky tests flapping

Event Timeline

• bzimport raised the priority of this task from to Medium. Nov 22 2014, 3:23 AM

• bzimport added a project: Quality-Assurance.

• bzimport set Reference to bz65773.

• bzimport added a subscriber: Unknown Object (MLST).

• Cmcmahon created this task. May 26 2014, 3:03 PM

Pushing this to at least "normal", though "high" would be preferred. :-)

zeljkofilipin unsubscribed. Dec 3 2014, 12:44 PM

zeljkofilipin edited projects, added Browser-Tests-Infrastructure; removed Quality-Assurance. Mar 25 2015, 1:53 PM

zeljkofilipin set Security to None.

greg renamed this task from "investigate re-trying each failed test within a build" to "Re-try each failed test within a build". Mar 27 2015, 6:37 PM

greg added a project: Continuous-Integration-Infrastructure.

greg removed a subscriber: • Cmcmahon.

greg mentioned this in T94212: Accommodate flaky tests flapping. Mar 27 2015, 6:45 PM

greg mentioned this in T74722: Jenkins tests shouldn't go red when it's not its fault.

Krinkle renamed this task from "Re-try each failed test within a build" to "Auto retry failed browser tests to reduce false negatives". Mar 29 2015, 7:49 AM

Krinkle removed a subscriber: Unknown Object (MLST).

zeljkofilipin moved this task from Inbox to Ruby on the Browser-Tests-Infrastructure board. Mar 30 2015, 12:08 PM

Krinkle moved this task from Untriaged to Backlog on the Continuous-Integration-Infrastructure board. Apr 14 2015, 4:11 PM

As in T94212: Accommodate flaky tests flapping, I vote -1. Tests should be either made stable or deleted.

Based on recent and growing concerns over labs instability degrading the overall trust in test suites (and browser testing in general), I'd like to reexamine this. Even with isolated CI instances and pre-merge smoke tests, we're always going to have some level of full system integration testing and it's better that it throw up as few false positives as possible.

There are methods of doing this via Cucumber's rerun formatter that don't look too difficult. We could also look at ways of further limiting the retry to only execute if all failures match exceptions that indicate very general or transient failure (Net::ReadTimeout and such).
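
As a rough illustration of the second idea, the check below allows a retry only when every recorded failure is a transient error. It is a sketch under stated assumptions: the `TRANSIENT_ERRORS` list and the `retry_warranted?` method are hypothetical names, and how the exception objects are collected from a Cucumber run is left open.

```ruby
require 'net/http' # loads Net::ReadTimeout and Net::OpenTimeout

# Assumed list of exception classes treated as transient; adjust as needed.
TRANSIENT_ERRORS = [Net::ReadTimeout, Net::OpenTimeout].freeze

# Returns true only when there is at least one failure and every failure
# is an instance of one of the transient error classes.
def retry_warranted?(failures)
  return false if failures.empty?

  failures.all? do |error|
    TRANSIENT_ERRORS.any? { |klass| error.is_a?(klass) }
  end
end
```

With a check like this, a single genuine assertion failure among the errors would skip the retry, so real regressions would still be reported after the first run.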

dduvall merged a task: T98968: Net::ReadTimeout shouldn't mark a test as failed. Jun 10 2015, 12:38 AM

dduvall moved this task from Ruby to Next on the Browser-Tests-Infrastructure board.

dduvall added subscribers: Tgr, greg, Aklapper, Jdlrobson.

Jdlrobson awarded a token. Jun 10 2015, 2:54 AM

hashar updated the task description. Jun 10 2015, 7:59 AM

Change 230715 had a related patch set uploaded (by Dduvall):
Rerun failed browsertests once

https://gerrit.wikimedia.org/r/230715

gerritbot added a project: Patch-For-Review. Aug 11 2015, 12:52 AM

dduvall claimed this task. Aug 11 2015, 12:53 AM

dduvall moved this task from Next to In Progress on the Browser-Tests-Infrastructure board.

dduvall moved this task from In Progress to Next on the Browser-Tests-Infrastructure board. Oct 13 2015, 3:51 PM

hashar removed a project: Continuous-Integration-Infrastructure. Oct 28 2015, 2:15 PM

zeljkofilipin moved this task from Next to Ruby on the Browser-Tests-Infrastructure board. Nov 5 2015, 11:46 AM

Change 230715 abandoned by Hashar:
Rerun failed browsertests once

Reason:
Not much happened since August. Feel free to reopen if there is interest in driving it forward.

https://gerrit.wikimedia.org/r/230715

greg added a project: OKR-Work. Jan 13 2016, 8:32 PM

zeljkofilipin awarded a token. Mar 9 2016, 10:21 AM

greg removed dduvall as the assignee of this task. May 16 2017, 3:48 PM

Almost done with this: T152963.

zeljkofilipin added projects: User-zeljkofilipin, Ruby. May 20 2017, 3:36 PM

greg added a project: Release-Engineering-Team (Backlog). May 20 2017, 4:39 PM

zeljkofilipin lowered the priority of this task from Medium to Low. May 29 2017, 10:29 AM

zeljkofilipin closed this task as a duplicate of T152963: Increase in failures caused by Saucelabs. May 29 2017, 11:06 AM

kostajh mentioned this in T225248: Consider moving browser based tests (Selenium and QUnit) to a non-voting pipeline. Jun 7 2019, 2:35 AM
