
Wikidata sauce browser tests time out
Closed, Resolved, Public


Over the last week, the Wikidata Sauce browser test job has been aborted on every run after reaching the 4 hour timeout:

Event Timeline


#523    Feb 18th 01:22 UTC    2 hours 44 min
#524    Feb 19th 01:22 UTC    1 day 19 hours (stuck by IRC plugin)
#525    Feb 20th 20:56 UTC    4 hours (killed by timeout)

Build #524 references a single git change compared to #523:

Maybe that causes the tests to take too long.

I manually hacked the job configuration to check out other commits. The tests definitely run (I used strace on cucumber to confirm it was hitting Sauce Labs).

A major annoyance is that cucumber emits NO output whatsoever until it is complete.

Change 275435 had a related patch set uploaded (by Zfilipin):
WIP Fix various problems with browsertests-Wikidata* jobs

The job got stuck. I went on the machine and ran strace -s 11024 -ewrite,read -f -p <PID OF cucumber>, and the test suite repeatedly spams:

   "args" : [],
   "script" : "return"

This indicates that something repeatedly polls the browser and iterates indefinitely, when it should give up after X seconds / iterations.

The stuck scenario was "Select a property":

Scenario: Select a property                                       # features/reference.feature:71

When I click the statement edit button                          # features/step_definitions/statement_steps.rb:23
And I click the reference add button                            # features/step_definitions/reference_steps.rb:9
And I select the snak property stringprop                       # features/step_definitions/statement_steps.rb:46
<Build was aborted by hashar>

de66fffab3964f1cb7c9af607dc543fa is a Sauce job from browsertests-Wikidata-WikidataTests-linux-chrome-sauce/325 Jenkins job.

The Jenkins error message is ERROR Job is not in progress (Selenium::WebDriver::Error::WebDriverError) and the Sauce error message is: Test exceeded maximum duration after 1800 seconds. That is 30 minutes, for just one scenario.

Looks like the job gets stuck while executing script: "return". It is called from EntityPage#ajax_wait.

The problem is not reproducible with a local browser, but it is reproducible with Sauce.

That EntityPage#ajax_wait should time out after X iterations at least :)
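One way to bound the wait by iterations could look like the sketch below. It assumes nothing about the real EntityPage class: `poll_until`, `max_attempts`, and `interval` are hypothetical names, and the block passed in stands in for whatever browser check `ajax_wait` performs.

```ruby
# Hypothetical helper: poll a condition a bounded number of times.
# Returns true as soon as the block is truthy, false once attempts run out.
def poll_until(max_attempts: 15, interval: 1.0 / 3)
  max_attempts.times do
    return true if yield
    sleep(interval)
  end
  false
end

# Stand-in for a browser call that never reports "done".
attempts = 0
finished = poll_until(max_attempts: 3, interval: 0.01) do
  attempts += 1
  false
end
puts "finished=#{finished} after #{attempts} attempts"
```

Returning false rather than looping forever lets the caller decide whether to fail the step or retry, so a stuck browser costs a few seconds instead of the whole Sauce session.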

Example code on how to do timeouts, as used in VisualEditorPage#visual_editor_element (c930ac2779a1877ab2b5fa93ff9a56a2676f652e$203), for example:

require 'timeout'
def ajax_wait
  Timeout.timeout(5) do # raises Timeout::Error after 5 seconds instead of polling forever
    sleep(1.0 / 3) while execute_script('return') != 0
  end
  sleep 1
end
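To illustrate what a Timeout-based wait would do when the browser never settles, here is a small self-contained sketch. `FakePage` and its always-busy `execute_script` are hypothetical stand-ins, not the real page object, and the short limits are chosen only so the example finishes quickly.

```ruby
require 'timeout'

# Hypothetical stand-in for a page object whose browser never goes idle.
class FakePage
  def execute_script(_script)
    1 # always "busy", never 0
  end

  # Same shape as the timeout-based ajax_wait, with a configurable limit.
  def ajax_wait(limit = 5)
    Timeout.timeout(limit) do
      sleep(0.05) while execute_script('return') != 0
    end
  end
end

begin
  FakePage.new.ajax_wait(0.2)
  result = :completed
rescue Timeout::Error
  result = :timed_out
end
puts result
```

With a bound like this, a stuck scenario would presumably fail with Timeout::Error within seconds rather than running until Sauce's 1800 second limit.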

Change 275435 merged by jenkins-bot:
Disable Raita and enable Cucumber pretty formatter for browsertests-Wikidata* jobs.