
mwgate-node10-docker takes a very long time to run for Newsletter
Closed, Resolved, Public

Description

In https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Newsletter/+/446642/ the mwgate-node10-docker job takes a very long time and fails after 30 minutes with a build timeout.

I made an empty patch, https://gerrit.wikimedia.org/r/#/c/mediawiki/extensions/Newsletter/+/555762/, to try to reproduce it, and the problem also occurs on the empty patch.

This needs to be investigated. The console output shows the build stalling at npm ci until the timeout:

00:00:12.312 + cd /src
00:00:12.312 + '[' '!' -f package.json ']'
00:00:12.312 + node --version
00:00:12.333 v10.15.2
00:00:12.334 + npm --version
00:00:12.850 6.5.0
00:00:12.854 + '[' -e npm-shrinkwrap.json ']'
00:00:12.854 + '[' -e package-lock.json ']'
00:00:12.854 + npm ci
00:30:00.015 Build timed out (after 30 minutes). Marking the build as failed.
00:30:00.022 Build was aborted

Event Timeline

This also happens on the Wikibase and WikibaseLexeme repositories. I rebuilt a random build and looked at the list of processes:

25458 ?        Ss     0:00          \_ /dev/init -- /run.sh
25506 ?        S      0:00              \_ bash /run.sh
25521 ?        Sl     0:02                  \_ npm
25538 ?        Sl     0:03                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci
25539 ?        Sl     0:02                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci
25540 ?        Sl     0:02                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci
25541 ?        Sl     0:02                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci
25542 ?        Sl     0:03                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci
25543 ?        Sl     0:04                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci
25555 ?        Sl     0:02                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci
25561 ?        Sl     0:03                      \_ /usr/bin/node /srv/npm/node_modules/worker-farm/lib/child/index.js /usr/bin/node /usr/bin/npm ci

I killed those processes, and npm eventually just failed.

Mentioned in SAL (#wikimedia-releng) [2019-12-09T10:22:24Z] <hashar> castor: nuked cache /srv/jenkins-workspace/caches/castor-mw-ext-and-skins/master/mwgate-node10-docker/ # T240174

hashar claimed this task.

I have nuked the npm cache. I guess something in it was confusing npm ci. It should be good now.
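
For anyone debugging a similar hang, putting a time limit around the install step makes the failure show up quickly instead of after the 30-minute build timeout. The following is a minimal sketch, not what the CI job actually runs: run_with_limit is a hypothetical helper name, the 10-minute limit in the usage comment is an assumed value, and the helper relies on the GNU coreutils timeout command, which exits with status 124 when the limit is reached.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command under a time limit so a hang fails fast.
# Assumes GNU coreutils `timeout`, which exits with status 124 on timeout.
run_with_limit() {
    local limit="$1"
    shift
    local status=0
    # Capture the exit status without tripping `set -e` in the calling script.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "command timed out after ${limit}: $*" >&2
    fi
    return "$status"
}

# Usage (the 10m limit is an assumption, not taken from the job configuration):
# run_with_limit 10m npm ci
```

The helper passes through the exit status of the wrapped command, so a caller can still tell a timeout (124) apart from an ordinary npm failure.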