
Auto retry failed browser tests to reduce false negatives
Closed, Duplicate · Public

Description

Many browser tests fail for reasons other than an issue with the code or the test environment: a flaky timeout, a momentary network glitch, etc.

It would be convenient to build the following abilities into the command that calls Cucumber to run the tests:

  • track which test is being run
  • evaluate the return status of each test, either per Feature or possibly even per Scenario
  • if the return value for the individual test is non-zero, do not record the test result. Instead, rerun only the single test that failed.
  • if the second run of the single test returns a non-zero result, record that result as usual
  • if the test passes on the second run, record that result and continue running the tests in the build.

Some years ago I wrote a harness like this in Ruby. If it is not convenient to do this directly in a shell script, it might be possible to use Ruby and shell out for the appropriate commands.
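
The steps listed above could be sketched roughly as follows. This is not the harness mentioned here, only a minimal illustration: `run_with_retry` and `CUCUMBER_RUNNER` are hypothetical names, and the `bundle exec cucumber <feature>` invocation is an assumption about how the build calls Cucumber. The runner is passed in as a callable so that the retry logic stays independent of how the test is launched.

```ruby
# Hypothetical harness: run each feature and retry it once if it fails.
# `runner` is any callable that takes a feature path and returns an
# integer exit status (0 means the test passed).
def run_with_retry(features, runner)
  results = {}
  features.each do |feature|
    status = runner.call(feature)
    # Do not record a first failure; rerun only the single failing test.
    status = runner.call(feature) unless status.zero?
    # Record whatever the last run returned, then continue with the build.
    results[feature] = status
  end
  results
end

# Assumed runner: shell out to Cucumber for a single feature file.
# `system` returns true only when the command exits with status 0.
CUCUMBER_RUNNER = lambda do |feature|
  system('bundle', 'exec', 'cucumber', feature) ? 0 : 1
end
```

A build script could then call `run_with_retry(Dir['features/**/*.feature'], CUCUMBER_RUNNER)` and fail the build if any recorded status is non-zero. Retrying per Scenario rather than per Feature would need a finer-grained runner.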


A low-hanging fruit was T98968, or retrying the Cucumber test on Net::ReadTimeout.

Details

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 3:23 AM
bzimport set Reference to bz65773.
bzimport added a subscriber: Unknown Object (MLST).

Pushing this to at least "normal", though "high" would be preferred. :-)

greg renamed this task from investigate re-trying each failed test within a build to Re-try each failed test within a build. Mar 27 2015, 6:37 PM
greg removed a subscriber: Cmcmahon.
Krinkle renamed this task from Re-try each failed test within a build to Auto retry failed browser tests to reduce false negatives. Mar 29 2015, 7:49 AM
Krinkle removed a subscriber: Unknown Object (MLST).

As in T94212: Accommodate flaky tests flapping, I vote -1. Tests should be either made stable or deleted.

Based on recent and growing concerns over labs instability degrading the overall trust in test suites (and browser testing in general), I'd like to reexamine this. Even with isolated CI instances and pre-merge smoke tests, we're always going to have some level of full system integration testing and it's better that it throw up as few false positives as possible.

There are methods of doing this via Cucumber's rerun formatter that don't look too difficult. We could also look at ways of further limiting the retry to only execute if all failures match exceptions that indicate very general or transient failure (Net::ReadTimeout and such).
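
A rough sketch of how these two ideas might combine, under stated assumptions: the first Cucumber run writes its failures with the rerun formatter (`--format rerun --out <file>`), and the second run reads them back with `@<file>`. The helper names `all_transient?` and `rerun_commands`, the file name `rerun.txt`, and the idea of matching exception names inside collected failure messages are all hypothetical. How the failure messages are gathered is left open.

```ruby
# Exceptions treated as transient. Net::ReadTimeout comes from the
# discussion; anything else added here would be a project decision.
TRANSIENT_ERRORS = ['Net::ReadTimeout'].freeze

# True only when there is at least one failure and every failure message
# mentions one of the transient exception names.
def all_transient?(failure_messages, transient = TRANSIENT_ERRORS)
  return false if failure_messages.empty?

  failure_messages.all? do |message|
    transient.any? { |name| message.include?(name) }
  end
end

# Build the two Cucumber invocations as argument arrays: the first run
# records failing scenarios in `rerun_file`, the second reruns only those.
def rerun_commands(rerun_file = 'rerun.txt')
  first_run  = ['cucumber', '--format', 'pretty',
                '--format', 'rerun', '--out', rerun_file]
  second_run = ['cucumber', "@#{rerun_file}"]
  [first_run, second_run]
end
```

With this shape, a wrapper would run the first command, and only if it fails and `all_transient?` holds for the collected failures would it run the second. A single failure of any other kind leaves the build red without a retry.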

Change 230715 had a related patch set uploaded (by Dduvall):
Rerun failed browsertests once

https://gerrit.wikimedia.org/r/230715

Change 230715 abandoned by Hashar:
Rerun failed browsertests once

Reason:
Not much happened since August. Feel free to reopen if there is interest in driving it forward.

https://gerrit.wikimedia.org/r/230715

zeljkofilipin lowered the priority of this task from Medium to Low. May 29 2017, 10:29 AM