Firefox CI tests keep failing in VE with Firefox 68
Open, Low, Public

Description

For example, see the first run here: https://gerrit.wikimedia.org/r/#/c/VisualEditor/VisualEditor/+/558531/ (https://integration.wikimedia.org/ci/job/visualeditor-node10-browser-docker/1196/console)

I've seen this at least 3 times in the past week.

14:32:16 Running "karma:firefox" (karma) task
14:32:16 17 12 2019 14:32:16.097:WARN [watcher]: All files matched by "/src/node_modules/qunit/qunit/qunit.js" were excluded or matched by prior matchers.
14:32:16 17 12 2019 14:32:16.179:INFO [karma-server]: Karma v3.1.3 server started at http://0.0.0.0:9876/
14:32:16 17 12 2019 14:32:16.179:INFO [launcher]: Launching browsers FirefoxHeadless with concurrency unlimited
14:32:16 17 12 2019 14:32:16.182:INFO [launcher]: Starting browser FirefoxHeadless
14:33:16 17 12 2019 14:33:16.186:WARN [launcher]: FirefoxHeadless have not captured in 60000 ms, killing.
14:33:16 17 12 2019 14:33:16.267:INFO [launcher]: Trying to start FirefoxHeadless again (1/2).
14:34:02 17 12 2019 14:34:02.255:INFO [Firefox 68.0.0 (Linux 0.0.0)]: Connected on socket C2YBo2DbLQsFFg3PAAAB with id 94570126
14:34:37 17 12 2019 14:34:37.260:WARN [Firefox 68.0.0 (Linux 0.0.0)]: Disconnected (0 times)reconnect failed before timeout of 5000ms (ping timeout)
14:34:37 Firefox 68.0.0 (Linux 0.0.0) ERROR
14:34:37   Disconnectedreconnect failed before timeout of 5000ms (ping timeout)
14:34:37 Firefox 68.0.0 (Linux 0.0.0): Executed 0 of 0 DISCONNECTED (35.005 secs / 0 secs)
14:34:37 17 12 2019 14:34:37.263:INFO [karma-server]: Restarting Firefox 68.0.0 (Linux 0.0.0) (1 of 2 attempts)
14:35:37 17 12 2019 14:35:37.490:WARN [launcher]: FirefoxHeadless have not captured in 60000 ms, killing.
14:35:37 17 12 2019 14:35:37.620:INFO [launcher]: Trying to start FirefoxHeadless again (2/2).
14:36:37 17 12 2019 14:36:37.626:WARN [launcher]: FirefoxHeadless have not captured in 60000 ms, killing.
14:36:37 17 12 2019 14:36:37.769:ERROR [launcher]: FirefoxHeadless failed 2 times (timeout). Giving up.
14:36:37 Warning: Task "karma:firefox" failed. Use --force to continue.

This happened again when the image got updated by mistake (T259925). It seems to be related to Firefox 68.

| Image | Firefox package |
| docker-registry.wikimedia.org/releng/node10-test-browser:0.6.0-s1 | 60.8.0esr-1~deb9u1 |
| docker-registry.wikimedia.org/releng/node10-test-browser:0.6.2 | 68.11.0esr-1~deb9u1 |

Event Timeline

Esanders created this task. Dec 17 2019, 3:01 PM
Restricted Application added a subscriber: Aklapper. Dec 17 2019, 3:01 PM
Esanders triaged this task as High priority. Dec 19 2019, 4:19 PM
Esanders added a subscriber: Krinkle.

This is failing quite regularly now.

Change 559545 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[integration/config@master] jjb: Temporarily roll back node10-test-browser uses to 0.6.0-s1

https://gerrit.wikimedia.org/r/559545

Change 559545 merged by jenkins-bot:
[integration/config@master] jjb: Temporarily roll back node10-test-browser uses to 0.6.0-s1

https://gerrit.wikimedia.org/r/559545

OK, I've bumped the jobs back down to the old image (which has Firefox 60 not 68), and all seems to now pass. We'll need to work out what's broken here and fix it before rolling them forward again.

dchan added a subscriber: dchan. Dec 19 2019, 5:14 PM

Ooh, thanks!

thcipriani lowered the priority of this task from High to Medium. Jan 7 2020, 2:47 PM
thcipriani added a subscriber: thcipriani.

> OK, I've bumped the jobs back down to the old image (which has Firefox 60 not 68), and all seems to now pass. We'll need to work out what's broken here and fix it before rolling them forward again.

Lowering priority since there is a workaround in place -- feel free to override if I've misunderstood the current situation.

Jdforrester-WMF lowered the priority of this task from Medium to Low. Feb 24 2020, 5:38 PM
JTannerWMF moved this task from To Triage to Triaged on the VisualEditor board.
JTannerWMF added a subscriber: JTannerWMF.

Looks like the Release Engineering team is thinking about this task. If that is incorrect and there is an action for Editing please let me know.

Change 619288 had a related patch set uploaded (by Hashar; owner: Hashar):
[integration/config@master] jjb: rollback node10-test-browser to 0.6.0-s1 [2]

https://gerrit.wikimedia.org/r/619288

Change 619288 merged by jenkins-bot:
[integration/config@master] jjb: rollback node10-test-browser to 0.6.0-s1 [2]

https://gerrit.wikimedia.org/r/619288

hashar renamed this task from "Firefox CI tests keep failing in VE" to "Firefox CI tests keep failing in VE with Firefox 68". Aug 10 2020, 12:35 PM
hashar updated the task description.
hashar added a subscriber: hashar. Aug 10 2020, 1:16 PM

We would need a way to reproduce the issue. I tried but it passes just fine on my machine:

$ docker run --rm -it -v "$(pwd):/src" --entrypoint=/src/node_modules/.bin/grunt docker-registry.wikimedia.org/releng/node10-test-browser:0.6.2 karma:firefox
Running "karma:firefox" (karma) task
10 08 2020 13:10:46.828:WARN [filelist]: All files matched by "/src/node_modules/qunit/qunit/qunit.js" were excluded or matched by prior matchers.
10 08 2020 13:10:46.978:INFO [karma-server]: Karma v5.0.9 server started at http://0.0.0.0:9876/
10 08 2020 13:10:46.979:INFO [launcher]: Launching browsers FirefoxHeadless with concurrency unlimited
10 08 2020 13:10:46.982:INFO [launcher]: Starting browser FirefoxHeadless
10 08 2020 13:10:48.902:INFO [Firefox 68.0 (Linux x86_64)]: Connected on socket 4-nCmjjFxaAvcTwBAAAA with id 28414039
................................................................................
................................................................................
................................................................................
......................................................................
Firefox 68.0 (Linux x86_64): Executed 310 of 310 SUCCESS (13.886 secs / 13.776 secs)

Done.
$
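
Since the CI failures are intermittent, a single passing local run may not say much. Below is a minimal sketch of a helper that repeats a command and counts failed runs, keeping one log per run. The function name `count_failures` and the `LOG_DIR` variable are made up for illustration and are not part of any existing tooling.

```shell
# Run a command N times, keep one log file per run, and report how many runs failed.
# Usage: count_failures N command [args...]
# LOG_DIR is a hypothetical variable naming where logs go (defaults to ./repro-logs).
count_failures() {
    runs=$1
    shift
    log_dir=${LOG_DIR:-./repro-logs}
    mkdir -p "$log_dir"
    failures=0
    i=1
    while [ "$i" -le "$runs" ]; do
        # A non-zero exit status (e.g. grunt reporting a karma failure) counts as one failed run.
        if ! "$@" >"$log_dir/run-$i.log" 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures/$runs runs failed"
    # Return success only if every run passed.
    [ "$failures" -eq 0 ]
}

# Example (not executed here): the docker command from above, without -it
# because the output is redirected to a log file:
# count_failures 20 docker run --rm -v "$(pwd):/src" \
#     --entrypoint=/src/node_modules/.bin/grunt \
#     docker-registry.wikimedia.org/releng/node10-test-browser:0.6.2 karma:firefox
```

If any run fails, the matching `run-N.log` can be compared against the CI console output. Whether a local loop hits the same timeout at all is an open question, since the CI hosts may behave differently from a developer machine.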
Krinkle added a comment. Edited Aug 12 2020, 3:25 AM

This seems to be affecting jquery-client as well (a commit of mine was failing with a timeout, but now passes after a recheck).

https://gerrit.wikimedia.org/r/c/jquery-client/+/619134
https://integration.wikimedia.org/ci/job/generic-node10-browser-docker/1480/console (Build kept indefinitely)

… docker-registry.wikimedia.org/releng/node10-test-browser:0.6.3
…
08 08 2020 18:51:13.560:WARN [launcher]: FirefoxHeadless have not captured in 60000 ms, killing.
08 08 2020 18:51:13.712:INFO [launcher]: Trying to start FirefoxHeadless again (1/2).
08 08 2020 18:52:13.716:WARN [launcher]: FirefoxHeadless have not captured in 60000 ms, killing.
08 08 2020 18:52:13.823:INFO [launcher]: Trying to start FirefoxHeadless again (2/2).
08 08 2020 18:53:05.578:INFO [Firefox 68.0 (Linux x86_64)]: Connected on socket 2xwjbntcMwh2Z2-8AAAA with id 79031389
08 08 2020 18:53:37.556:WARN [Firefox 68.0 (Linux x86_64)]: Disconnected (0 times)reconnect failed before timeout of 2000ms (ping timeout)
Firefox 68.0 (Linux x86_64) ERROR

And after: https://integration.wikimedia.org/ci/job/generic-node10-browser-docker/1487/console

… docker-registry.wikimedia.org/releng/node10-test-browser:0.6.0-s1

Firefox 60.0 (Linux x86_64): Executed 5 of 5 SUCCESS (0.051 secs / 0.025 secs)

This is a tiny test and might make for an easier repro case locally. In any event, CI was (usually) passing on jquery-client with the newer Docker image as well for a while, so it's most likely a race condition of some kind.

Upstream:

Upstream upstream: