
Docker headless Chrome run sometimes silently terminates on start(?), making it impossible to merge code in VE
Closed, Resolved (Public)

Description

E.g. on https://gerrit.wikimedia.org/r/c/VisualEditor/VisualEditor/+/447144 -> visualeditor-npm-browser-node-6-docker -> https://integration.wikimedia.org/ci/job/visualeditor-npm-browser-node-6-docker/1413/console :

15:10:26 npm info lifecycle mmmagic@0.4.6~install: mmmagic@0.4.6
15:10:26 
15:10:26 > mmmagic@0.4.6 install /src/node_modules/mmmagic
15:10:26 > node-gyp rebuild
15:10:26 
15:10:26 
15:10:26 
15:10:27 + terminate_bg_process
15:10:27 + set +x
15:10:27 Terminating Xvfb
15:10:27 Done
15:10:27 Terminating Chromedriver
15:10:27 Done
15:10:27 Build step 'Execute shell' marked build as failure

But it sometimes works, e.g. https://integration.wikimedia.org/ci/job/visualeditor-npm-browser-node-6-docker/1412/console

Event Timeline

This is happening far too often. Here it is timing out 4 times in a row:

[Screenshot: image.png]

( https://gerrit.wikimedia.org/r/#/c/VisualEditor/VisualEditor/+/314032/4 )

mmmagic failed to compile:

14:57:01 ImportError: No module named compiler.ast
14:57:01 gyp ERR! configure error 
14:57:01 gyp ERR! stack Error: `gyp` failed with exit code: 1
14:57:01 gyp ERR! stack     at ChildProcess.onCpExit (/usr/local/lib/node_modules/npm/node_modules/node-gyp/lib/configure.js:305:16)
14:57:01 gyp ERR! stack     at emitTwo (events.js:106:13)
14:57:01 gyp ERR! stack     at ChildProcess.emit (events.js:191:7)
14:57:01 gyp ERR! stack     at Process.ChildProcess._handle.onexit (internal/child_process.js:215:12)
14:57:01 gyp ERR! System Linux 4.9.0-0.bpo.6-amd64
14:57:01 gyp ERR! command "/usr/bin/nodejs" "/usr/local/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js" "rebuild"
14:57:01 gyp ERR! cwd /src/node_modules/mmmagic
14:57:01 gyp ERR! node -v v6.11.0
14:57:01 gyp ERR! node-gyp -v v3.3.1
14:57:01 gyp ERR! not ok 
14:57:01 npm info lifecycle mmmagic@0.4.6~install: Failed to exec install script
14:57:01 npm WARN install:mmmagic@0.4.6 mmmagic@0.4.6 install: `node-gyp rebuild`
14:57:01 npm WARN install:mmmagic@0.4.6 Exit status 1

So I think that the Python import error is a red herring. It looks like Firefox is hanging in ChromeDriver tests:

13:17:22   ve.dm.Node
13:17:22     ✔ canHaveChildren
13:17:22     ✔ canHaveChildrenNotContent
13:17:22     ✔ getLength
13:17:22     ✔ getOuterLength
13:17:22     ✔ setLength
13:17:22     ✔ adjustLength
13:17:22     ✔ getAttribute
13:17:22     ✔ setRoot
13:17:22     ✔ attach
13:17:22     ✔ detach
13:17:22     ✔ canBeMergedWith
13:17:22     ✔ getClonedElement
13:17:30 20 08 2018 13:17:30.913:WARN [web-server]: 404: /null
13:17:53 20 08 2018 13:17:52.941:WARN [Firefox 52.0.0 (Linux 0.0.0)]: Disconnected (1 times), because no message in 30000 ms.
13:17:53 Firefox 52.0.0 (Linux 0.0.0) ERROR
13:17:53   Disconnected, because no message in 30000 ms.
13:17:53 20 08 2018 13:17:53.357:INFO [karma]: Restarting Firefox 52.0.0 (Linux 0.0.0) (1 of 2 attempts)
13:17:57 20 08 2018 13:17:57.721:INFO [Firefox 52.0.0 (Linux 0.0.0)]: Connected on socket BkqKrKnctq7jeFdRAAAC with id 2508817
13:18:00 20 08 2018 13:18:00.050:WARN [web-server]: 404: /null
13:18:27 20 08 2018 13:18:27.728:WARN [Firefox 52.0.0 (Linux 0.0.0)]: Disconnected (2 times), because no message in 30000 ms.
13:18:27 Firefox 52.0.0 (Linux 0.0.0) ERROR
13:18:27   Disconnected, because no message in 30000 ms.
13:18:27 20 08 2018 13:18:27.736:INFO [karma]: Restarting Firefox 52.0.0 (Linux 0.0.0) (2 of 2 attempts)
13:18:31 20 08 2018 13:18:31.326:INFO [Firefox 52.0.0 (Linux 0.0.0)]: Connected on socket mgbjJsyoDN0qYTC8AAAD with id 2508817
13:19:01 20 08 2018 13:19:01.335:WARN [Firefox 52.0.0 (Linux 0.0.0)]: Disconnected (3 times), because no message in 30000 ms.
13:19:01 Firefox 52.0.0 (Linux 0.0.0) ERROR
13:19:01   Disconnected, because no message in 30000 ms.

Yes, I always assumed this 30s test timeout was the issue, but don't know what's causing it.
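For reference, the "no message in 30000 ms" and "(1 of 2 attempts)" lines in the log above correspond to Karma's `browserNoActivityTimeout` and `browserDisconnectTolerance` settings. A minimal sketch of where these live in a Karma config follows; the values are illustrative assumptions, not the actual VE configuration, and raising the timeout would only mask a genuine hang rather than fix it.

```javascript
// karma.conf.js (sketch only; values are assumptions, not taken from the VE repo)
function karmaConfig( config ) {
	config.set( {
		// How long Karma waits for any message from a captured browser before
		// reporting "Disconnected, because no message in N ms".
		browserNoActivityTimeout: 60000,
		// How many disconnects Karma tolerates before failing the run.
		// The "(1 of 2 attempts)" lines in the log suggest a value of 2.
		browserDisconnectTolerance: 2,
		// How long Karma waits for a disconnected browser to reconnect.
		browserDisconnectTimeout: 10000,
		// How long Karma waits for a browser to start up and connect.
		captureTimeout: 60000
	} );
}

module.exports = karmaConfig;
```

If the browser is truly hung, as the repeated disconnects suggest, a longer timeout would just make each failed build take longer.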

NB: we only enabled FF testing a few months ago as a nice-to-have. Only a few edge-case tests are likely to pass in Chrome but fail in FF, so we could probably get away with disabling the FF tests if that makes CI more reliable.

That would probably be wise, at least for now, since it seems that only Firefox is failing regularly.
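One way to make such a switch easy to reverse is to build the Karma browser list from a flag rather than hard-coding it. The sketch below is hypothetical: the `VE_TEST_FIREFOX` environment variable and the `selectBrowsers` helper are invented for illustration, and the launcher names `Chrome` and `Firefox` assume karma-chrome-launcher and karma-firefox-launcher are installed. It is not how the patch in this task is implemented.

```javascript
// Hypothetical helper: choose which browsers Karma should launch.
// VE_TEST_FIREFOX is an invented variable name, used for illustration only.
function selectBrowsers( env ) {
	// Chrome always runs; it is the browser CI relies on.
	const browsers = [ 'Chrome' ];
	// Firefox is opt-in, so a flaky Firefox cannot block merges by default.
	if ( env.VE_TEST_FIREFOX === '1' ) {
		browsers.push( 'Firefox' );
	}
	return browsers;
}

// Example use inside a Karma config:
//   config.set( { browsers: selectBrowsers( process.env ) } );
```

With this shape, re-enabling Firefox later is a matter of setting the flag in the CI job instead of changing the test configuration again.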

Change 454072 had a related patch set uploaded (by Jforrester; owner: Jforrester):
[VisualEditor/VisualEditor@master] build: Temporarily disable qunit testing in Firefox due to CI issues

https://gerrit.wikimedia.org/r/454072

Change 454072 merged by jenkins-bot:
[VisualEditor/VisualEditor@master] build: Temporarily disable qunit testing in Firefox due to CI issues

https://gerrit.wikimedia.org/r/454072

Esanders lowered the priority of this task from High to Medium. (Aug 20 2018, 7:25 PM)

Lowering priority to reflect that this bug is now essentially about getting FF testing back online.

Change 454168 had a related patch set uploaded (by Bartosz Dziewoński; owner: Bartosz Dziewoński):
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (510506739)

https://gerrit.wikimedia.org/r/454168

Change 454168 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (510506739)

https://gerrit.wikimedia.org/r/454168

Krinkle claimed this task.
Krinkle subscribed.

Fixed by 4142f9c7f41d, T211784, and other VE changes; we no longer use Xvfb for these jobs.

Change 524816 had a related patch set uploaded (by Divec; owner: Divec):
[VisualEditor/VisualEditor@master] WIP: Re-enable karma Firefox testing

https://gerrit.wikimedia.org/r/524816

Change 524816 merged by jenkins-bot:
[VisualEditor/VisualEditor@master] Re-enable karma Firefox testing

https://gerrit.wikimedia.org/r/524816

Change 525355 had a related patch set uploaded (by DLynch; owner: DLynch):
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (788d449ae)

https://gerrit.wikimedia.org/r/525355

Change 525355 merged by jenkins-bot:
[mediawiki/extensions/VisualEditor@master] Update VE core submodule to master (788d449ae)

https://gerrit.wikimedia.org/r/525355