kartotherian package repo fails to build
Closed, Resolved · Public

Description

When running ./server.js build --deploy-repo --force I get this error:

npm ERR! Linux 4.9.0-4-amd64
npm ERR! argv "/usr/local/nvm/versions/node/v6.9.1/bin/node" "/usr/local/nvm/versions/node/v6.9.1/bin/npm" "install" "--production" "--build-from-source=mapnik" "--fallback-to-build=false"
npm ERR! node v6.9.1
npm ERR! npm  v3.10.8
npm ERR! path /opt/service/node_modules/.mapnik.DELETE/node_modules/abbrev
npm ERR! code ENOTEMPTY
npm ERR! errno -39
npm ERR! syscall rename

npm ERR! ENOTEMPTY: directory not empty, rename '/opt/service/node_modules/.mapnik.DELETE/node_modules/abbrev' -> '/opt/service/node_modules/mapnik/node_modules/abbrev'
npm ERR!
npm ERR! If you need help, you may report this error at:
npm ERR!     <https://github.com/npm/npm/issues>

npm ERR! Please include the following file with any support request:
npm ERR!     /opt/service/npm-debug.log
ERROR: docker run exited with code 217

Make sure to do docker pull debian:jessie so that you have the latest version and we're building the same thing.

Before the big set of npm dependency trees, the last docker messages were

Step 11/11 : CMD . $NVM_DIR/nvm.sh && nvm use 6.9.1 && npm install --production --build-from-source=mapnik --fallback-to-build=false  && npm install heapdump gc-stats&& if ! [ -e npm-shrinkwrap.json ] ; then npm dedupe ; fi
 ---> Using cache
 ---> 2a00b4aee5a3
Successfully built 2a00b4aee5a3

This looks to be service-runner related, rather than Kartotherian related.

Edit:

Package repo is 29a680f, deploy repo is 6e223dfb

Event Timeline

Pnorman triaged this task as Normal priority. · Dec 14 2017, 1:39 AM
Pnorman created this task.
Restricted Application added a subscriber: Aklapper. · Dec 14 2017, 1:39 AM
Pnorman updated the task description. · Dec 14 2017, 1:48 AM
Gehel added a subscriber: MaxSem. · Dec 14 2017, 8:16 AM

@MaxSem would you have any idea what's going on there?

mobrovac added a subscriber: mobrovac.

This seems to be a permissions problem, oddly enough. I did manage to build it. The current work-around is to run the build script as root. We should look into this problem more carefully to see why that is the case.

Pnorman added a comment. · Edited · Dec 14 2017, 6:00 PM

The current work-around is to run the build script as root.

Within docker? Which files do I need to change to do that?

It's also worth noting this used to work.

I've successfully built the deploy repo: https://gerrit.wikimedia.org/r/398315

Most of the time these kinds of FS errors are related to how docker mounts the volumes. Which OS and Docker versions do you have?

Pnorman moved this task from Backlog to Stalled/Waiting on the Maps-Sprint board. · Dec 14 2017, 8:16 PM
Gehel added a comment. · Jan 10 2018, 4:56 PM

I'm on Ubuntu 17.10, with docker 1.5 (the one provided by Ubuntu, fairly old). I can try to install latest. @mobrovac seemed to have reproduced the issue and I expect him to be running something less ancient than me. Which leads me to think that upgrading might not be sufficient.

Gehel added a comment. · Jan 10 2018, 5:05 PM

Yep, same issue with docker 17.12.0-ce

Gehel assigned this task to mobrovac. · Jan 17 2018, 8:20 PM

I tried to set up docker on a labs VM to run the build process as root (neither @Pnorman nor I am very keen on launching a big pile of code that we don't understand with a sudo in front). The build still fails with the same error.

@mobrovac: could you find some time to look into this? Or point us to someone who might be able to help?

Pchelolo added a comment. · Edited · Jan 18 2018, 7:30 PM

I've finally managed to reproduce the issue with an Ubuntu VM and the latest version of docker-ce (v17.12). However, I still have no idea what's happening here.

Previously we've seen similar issues with vagrant with certain mounting drivers. I'll check if we're seeing a similar issue here.

My initial theory that this is somehow related to the docker storage driver was incorrect. I've tried the overlay, overlay2 and vfs drivers, and all of them end with the same result: the build fails.

Another piece of info: after unsuccessfully trying several other ideas, I tried building it with node 8, and that build succeeded. That indicates that the bug is in the particular version of npm we're using.

Ok, I've come up with a dirty workaround. We need to downgrade npm to version 2 in order to make it work.

To downgrade npm, apply the following patch to service-runner inside the node_modules of the sources repo:

diff --git a/lib/docker.js b/lib/docker.js
index fe442c2..495ef4d 100644
--- a/lib/docker.js
+++ b/lib/docker.js
@@ -217,7 +217,7 @@ function createDockerFile() {
     contents += `${envCommand}\n`;
 
     if (opts.deploy) {
-        let beforeInstall = '';
+        let beforeInstall = `${npmCommand} i npm@2 && `;
         let afterInstall = '';
         if (opts.reshrinkwrap) {
             beforeInstall = 'rm -f npm-shrinkwrap.json && ';
@@ -227,7 +227,7 @@ function createDockerFile() {
         if (pkg.deploy.install_opts) {
             installOpts += `${pkg.deploy.install_opts.join(' ')} `;
         }
-        contents += `CMD ${beforeInstall}${npmCommand} install${installOpts}${afterInstall} && npm install heapdump gc-stats`;
+        contents += `CMD ${beforeInstall} ./node_modules/.bin/npm install${installOpts}${afterInstall} && npm install heapdump gc-stats`;
         if (!opts.reshrinkwrap) {
             contents += '&& if ! [ -e npm-shrinkwrap.json ] ; then npm dedupe ; fi';
         }

If that works I will formalize it a bit to allow clients to select the npm version along with the node version.
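
To make the effect of the patch easier to see, here is a minimal standalone sketch of the string assembly it changes. It is not the service-runner code: `buildInstallCmd` is a made-up helper, the `npmCommand` value in the usage lines is inferred from the "Step 11/11" output in the task description, and the `reshrinkwrap` branch is left out because the diff shows it only partially.

```javascript
'use strict';

// Hypothetical stand-in for the patched part of createDockerFile().
// Only the default (non-reshrinkwrap) path from the diff is mirrored here.
function buildInstallCmd(npmCommand, installOpts) {
    // Patched line 1: install npm@2 locally with the bundled npm first.
    const beforeInstall = `${npmCommand} i npm@2 && `;
    const afterInstall = '';

    // Patched line 2: run the real install with the locally installed npm@2.
    let contents = `CMD ${beforeInstall} ./node_modules/.bin/npm install${installOpts}${afterInstall}`;
    contents += ' && npm install heapdump gc-stats';

    // Unchanged tail from the diff context.
    contents += '&& if ! [ -e npm-shrinkwrap.json ] ; then npm dedupe ; fi';
    return contents;
}

// Assumed inputs, reconstructed from the docker output quoted in the description.
const npmCommand = '. $NVM_DIR/nvm.sh && nvm use 6.9.1 && npm';
const installOpts = ' --production --build-from-source=mapnik --fallback-to-build=false ';
console.log(buildInstallCmd(npmCommand, installOpts));
```

The printed CMD line first installs npm@2 into node_modules using the npm bundled with node, then uses ./node_modules/.bin/npm for the main install, while the later heapdump, gc-stats and dedupe steps still go through the bundled npm.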

Gehel added a comment. · Jan 19 2018, 9:11 AM

I just tested the patch, and it works for me as well. My guess is something was failing in the build and leaving a non-empty directory, which could then not be removed. But honestly, I don't know what I'm talking about.

@Pchelolo thanks a lot for the help!

Gehel awarded a token. · Jan 19 2018, 9:11 AM

My guess is something was failing in the build and leaving a nonempty directory, which could then not be removed. But honestly, I don't know what I'm talking about.

I think that's correct, but it's not a build fault, it's an npm bug. We've seen issues like this before.
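
For anyone trying to pin down the symptom, here is a minimal Node.js sketch of the underlying filesystem behaviour, independent of npm: renaming a directory onto an existing non-empty directory fails. The directory names only mimic the paths in the error log and are created inside a temporary scratch directory; the helper name is made up for illustration. It assumes a Node.js version with recursive `fs.mkdirSync` (10.12 or later), which is newer than the 6.9.1 used in the build.

```javascript
'use strict';
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical demo helper: returns the error code of the failed rename,
// or null if the rename unexpectedly succeeds.
function renameOntoNonEmptyDir() {
    const root = fs.mkdtempSync(path.join(os.tmpdir(), 'enotempty-demo-'));
    const src = path.join(root, '.mapnik.DELETE', 'node_modules', 'abbrev');
    const dest = path.join(root, 'mapnik', 'node_modules', 'abbrev');

    fs.mkdirSync(src, { recursive: true });
    fs.mkdirSync(dest, { recursive: true });
    fs.writeFileSync(path.join(src, 'package.json'), '{}');
    // The leftover file is what makes the destination non-empty.
    fs.writeFileSync(path.join(dest, 'leftover.txt'), 'stale file');

    try {
        fs.renameSync(src, dest);
        return null;
    } catch (err) {
        return err.code;
    } finally {
        // Best-effort cleanup; fs.rmSync only exists on newer Node.js versions.
        if (typeof fs.rmSync === 'function') {
            fs.rmSync(root, { recursive: true, force: true });
        }
    }
}

console.log(renameOntoNonEmptyDir());
```

On Linux the reported code should be ENOTEMPTY, matching the log above. POSIX also permits EEXIST for this case, so other platforms may differ.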

I will add more generic support for the workaround in service-runner, which will allow us to specify the npm version in the deploy section of package.json.

I've created a PR for service-runner that allows specifying the npm version in the deploy section of the package.json file. At least with that, the hack looks a little less hacky. It also removes the custom-installed npm afterwards, so that we don't actually deploy npm to production.
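
As a rough illustration of what such an option could look like, here is a sketch under stated assumptions and not the code from the PR. The `deploy.npm` key name, the `npmInstallSteps` helper and the cleanup command are all hypothetical; only the general idea (pin npm from package.json, then remove the pinned copy after the install) comes from the comment above.

```javascript
'use strict';

// Hypothetical helper: derives the install steps from an assumed
// `deploy.npm` key in package.json. Not the actual service-runner API.
function npmInstallSteps(pkg, npmCommand) {
    const version = pkg.deploy && pkg.deploy.npm;
    if (!version) {
        // No pin requested: keep using the npm bundled with node.
        return { before: '', installer: npmCommand, after: '' };
    }
    // Refuse anything that does not look like a version or range,
    // since the value ends up inside a shell command.
    if (!/^[0-9A-Za-z.^~<>=*-]+$/.test(String(version))) {
        throw new Error(`Unexpected npm version: ${version}`);
    }
    return {
        // Install the pinned npm locally with the bundled npm.
        before: `${npmCommand} install npm@${version} && `,
        // Use the pinned copy for the real install.
        installer: './node_modules/.bin/npm',
        // Assumed cleanup step so the pinned npm is not shipped.
        after: ` && ${npmCommand} uninstall npm`
    };
}

const steps = npmInstallSteps({ deploy: { node: '6.9.1', npm: '2' } }, 'npm');
console.log(`${steps.before}${steps.installer} install --production${steps.after}`);
```

Keeping the pin optional means services that do not set it would continue to use whatever npm ships with their selected node version.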

Gehel moved this task from Stalled/Waiting to Done on the Maps-Sprint board. · Jan 22 2018, 10:44 AM

@Pchelolo: thanks for the fix! The PR looks good to me (but again, what do I know...). Ping me when it is merged, and I can verify that it works for maps. In the meantime, we are now truly unblocked on that task and can resume deploying kartotherian (as soon as we have a version to deploy).

Thanks!

Fjalapeno closed this task as Resolved. · Feb 1 2018, 8:08 PM