
Auto retry failed browser tests to reduce false negatives
Closed, Duplicate (Public)

Description

Many browser tests fail for reasons other than an issue with the code or the test environment: a flaky timeout, a momentary network glitch, and so on.

It would be convenient if the command that calls Cucumber to run the tests could:

  • track which test is being run
  • evaluate the return status of each test, either per Feature or possibly even per Scenario
  • if the return value for the individual test is non-zero, do not record the test result. Instead, rerun only the single test that failed.
  • if the second run of the single test returns a non-zero result, record that result as usual
  • if the test passes on the second run, record that result and continue running the tests in the build.
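The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the harness mentioned below: `run_feature` and `run_with_retry` are hypothetical names, the retry is done per Feature file, and `cucumber` is assumed to be runnable through `bundle exec`.

```ruby
require "open3"

# Hypothetical helper: runs one feature file through Cucumber and
# returns true when the process exits with status 0.
# Assumes `bundle exec cucumber` works in the current directory.
def run_feature(path)
  _output, status = Open3.capture2e("bundle", "exec", "cucumber", path)
  status.success?
end

# Runs each feature once. A failing feature is rerun a single time,
# and only the outcome of that rerun is recorded.
# `runner` is injectable so the retry logic can be exercised without Cucumber.
def run_with_retry(features, runner: method(:run_feature))
  features.each_with_object({}) do |feature, results|
    passed = runner.call(feature)
    passed = runner.call(feature) unless passed # second and final attempt
    results[feature] = passed ? :passed : :failed
  end
end
```

Calling `run_with_retry(Dir["features/**/*.feature"])` would return a hash mapping each feature file to `:passed` or `:failed`. The `features/` path is also an assumption about the project layout.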

Some years ago I wrote a harness like this in Ruby. If it is not convenient to do this directly in a shell script, it might be possible to use Ruby and shell out for the appropriate commands.


A low-hanging fruit was T98968, or retrying the Cucumber test on Net::ReadTimeout.

Details

Reference
bz65773

Event Timeline

bzimport raised the priority of this task from to Normal. Nov 22 2014, 3:23 AM
bzimport added a project: Quality-Assurance.
bzimport set Reference to bz65773.
bzimport added a subscriber: Unknown Object (MLST).

Pushing this to at least "normal", though "high" would be preferred. :-)

greg renamed this task from investigate re-trying each failed test within a build to Re-try each failed test within a build. Mar 27 2015, 6:37 PM
greg removed a subscriber: Cmcmahon.
Krinkle renamed this task from Re-try each failed test within a build to Auto retry failed browser tests to reduce false negatives. Mar 29 2015, 7:49 AM
Krinkle removed a subscriber: Unknown Object (MLST).

As in T94212: Accommodate flaky tests flapping, I vote -1. Tests should be either made stable or deleted.

dduvall added a subscriber: dduvall. Jun 10 2015, 12:33 AM

Based on recent and growing concerns over labs instability degrading the overall trust in test suites (and browser testing in general), I'd like to reexamine this. Even with isolated CI instances and pre-merge smoke tests, we're always going to have some level of full system integration testing and it's better that it throw up as few false positives as possible.

There are methods of doing this via Cucumber's rerun formatter that don't look too difficult. We could also look at ways of further limiting the retry to only execute if all failures match exceptions that indicate very general or transient failure (Net::ReadTimeout and such).
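For illustration, the rerun formatter can write failing scenarios to a file with `cucumber --format rerun --out rerun.txt`, and a later `cucumber @rerun.txt` runs only those scenarios. The exception filter could then look something like the sketch below. It assumes the exceptions raised by failed scenarios have already been collected somehow (for example in a hook or a custom formatter, which is not shown). `TRANSIENT_ERRORS`, `transient?` and `rerun_worthwhile?` are hypothetical names, and the list of "transient" classes is only a guess at what would qualify.

```ruby
require "net/http" # loads Net::ReadTimeout and Net::OpenTimeout

# Assumed list of exception classes that indicate a transient failure.
TRANSIENT_ERRORS = [Net::ReadTimeout, Net::OpenTimeout, Errno::ECONNRESET].freeze

# True when the given exception is an instance of one of the transient classes.
def transient?(error)
  TRANSIENT_ERRORS.any? { |klass| error.is_a?(klass) }
end

# A rerun is attempted only when there is at least one failure and
# every failure looks transient; one "real" failure vetoes the rerun.
def rerun_worthwhile?(errors)
  !errors.empty? && errors.all? { |error| transient?(error) }
end
```

With this rule, a build whose only failures are Net::ReadTimeout would be rerun, while a build that also contains an assertion failure would be reported as failed straight away.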

hashar updated the task description. Jun 10 2015, 7:59 AM

Change 230715 had a related patch set uploaded (by Dduvall):
Rerun failed browsertests once

https://gerrit.wikimedia.org/r/230715

dduvall claimed this task. Aug 11 2015, 12:53 AM
dduvall moved this task from Next to In Progress on the Browser-Tests-Infrastructure board.

Change 230715 abandoned by Hashar:
Rerun failed browsertests once

Reason:
Not much happened since August. Feel free to reopen if there is interest in driving it forward.

https://gerrit.wikimedia.org/r/230715

greg removed dduvall as the assignee of this task. May 16 2017, 3:48 PM

Almost done with this: T152963.

zeljkofilipin lowered the priority of this task from Normal to Low. May 29 2017, 10:29 AM