Wikidata sauce browser tests time out
Closed, Resolved · Public

Description

Over the last week, the Wikidata Sauce browser test job has always been aborted after reaching the 4-hour timeout:

https://integration.wikimedia.org/ci/job/browsertests-Wikidata-WikidataTests-linux-firefox-sauce/buildTimeTrend

Event Timeline

Restricted Application added subscribers: StudiesWorld, Aklapper. · Feb 25 2016, 12:02 PM

Namely:

Build  Start                Duration
#523   Feb 18th 01:22 UTC   2 hours 44 min
#524   Feb 19th 01:22 UTC   1 day 19 hours (stuck by IRC plugin)
#525   Feb 20th 20:56 UTC   4 hours (killed by timeout)

Build #524 references a single git change compared to #523: https://github.com/wmde/WikidataBrowserTests/commit/3bf7ebf00b0803942d4c5f573e5be8a87683180c

Maybe that causes the tests to take too long.

I manually hacked the job configuration to check out other commits. The tests definitely run (I used strace on cucumber to confirm it was hitting Sauce Labs).

A super annoyance is that cucumber emits NO output whatsoever until the run is complete.

zeljkofilipin moved this task from Inbox to In Progress on the Browser-Tests-Infrastructure board.
zeljkofilipin added a subscriber: hashar.

Change 275435 had a related patch set uploaded (by Zfilipin):
WIP Fix various problems with browsertests-Wikidata* jobs

https://gerrit.wikimedia.org/r/275435

hashar added a comment. · Mar 7 2016, 2:47 PM

The job https://integration.wikimedia.org/ci/view/BrowserTests/view/Wikidata/job/browsertests-Wikidata-WikidataTests-linux-firefox/127/consoleFull got stuck. I logged in to the machine and ran strace -s 11024 -ewrite,read -f -p <PID OF cucumber>, which showed the test suite repeatedly sending:

{
   "args" : [],
   "script" : "return jQuery.active"
}

This indicates that something repeatedly polls the browser until jQuery.active reaches 0, but loops forever when it should give up after X seconds / iterations.

The stuck scenario was "Select a property":

Scenario: Select a property                                       # features/reference.feature:71

When I click the statement edit button                          # features/step_definitions/statement_steps.rb:23
And I click the reference add button                            # features/step_definitions/reference_steps.rb:9
And I select the snak property stringprop                       # features/step_definitions/statement_steps.rb:46
<Build was aborted by hashar>

zeljkofilipin added a comment. · Edited · Mar 7 2016, 4:42 PM

de66fffab3964f1cb7c9af607dc543fa is a Sauce job started by the Jenkins job browsertests-Wikidata-WikidataTests-linux-chrome-sauce/325.

The Jenkins error message is ERROR Job is not in progress (Selenium::WebDriver::Error::WebDriverError) and the Sauce error message is: Test exceeded maximum duration after 1800 seconds. That is 30 minutes, for just one scenario.

It looks like the job gets stuck while executing the script "return jQuery.active", which is called from EntityPage#ajax_wait.

The problem is not reproducible with a local browser, but it is reproducible with Sauce.

hashar added a comment. · Mar 7 2016, 5:29 PM

That EntityPage#ajax_wait should time out after X iterations at least :)

Example code showing how to add a timeout (the same approach is used in VisualEditorPage#visual_editor_element, for example):

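require 'timeout'

# Wait for pending jQuery AJAX requests to finish.
# Raises Timeout::Error if they are still pending after 5 seconds.
def ajax_wait
  Timeout.timeout(5) do
    # Poll three times per second until no requests are in flight.
    sleep(1.0 / 3) while execute_script('return jQuery.active') != 0
  end
  sleep 1
  true
end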
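
Since the comment above also mentions giving up after "X iterations", here is a minimal sketch of an iteration-bounded variant. It is only an illustration, not the code that was merged: the names wait_for_ajax and AjaxWaitTimeout and the default of 15 polls are hypothetical, and the block stands in for execute_script('return jQuery.active') so the example runs without a browser.

```ruby
# Hypothetical error class for this sketch; not part of the Wikidata test suite.
class AjaxWaitTimeout < StandardError; end

# Poll the given block until it returns 0, giving up after max_attempts polls.
# In a page object the block would wrap execute_script('return jQuery.active').
# Returns the number of polls that were needed.
def wait_for_ajax(max_attempts: 15, interval: 1.0 / 3)
  max_attempts.times do |attempt|
    return attempt + 1 if yield == 0
    sleep interval
  end
  raise AjaxWaitTimeout, "jQuery.active still non-zero after #{max_attempts} polls"
end

# Simulated browser: two requests in flight, then one, then idle.
active_counts = [2, 1, 0]
polls = wait_for_ajax(interval: 0) { active_counts.shift }
puts "idle after #{polls} polls"
```

With a bound like this, a browser that never reports jQuery.active as 0 fails the scenario with a descriptive error after a few seconds instead of running until Sauce or Jenkins kills the job.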

Change 275435 merged by jenkins-bot:
Disable Raita and enable Cucumber pretty formatter for browsertests-Wikidata* jobs.

https://gerrit.wikimedia.org/r/275435

zeljkofilipin closed this task as Resolved. · Mar 10 2016, 1:24 PM