Can't install R package Boom (& bsts) on stat1002 (but can on stat1003)
Closed, Resolved | Public | 13 Estimated Story Points

Description

Howdy! I've been trying to install the R package Boom (and other packages that depend on it, like bsts) on stat1002, but I've been running into problems at the step where the package's C++ code needs to be compiled. I've documented the problem and my troubleshooting steps at https://gist.github.com/bearloga/98fb72b57c71477c6b13b395e4c0d9ea. This package installed everywhere else, including stat1003 (without any modifications) :\ I need these packages for statistical analysis and forecasting directly on stat1002.

Erik B. pointed out that there's a file (/usr/share/R/debian/r-cran.mk) that is on stat1002 but not stat1003. But the author of that file (Dirk) told me that "r-cran.mk is a red herring; not used here -- only used for building r-cran-*.deb package for the distro."

I was recommended 2 solutions:

(If the file I linked above is the correct one, could we please also add r-base-dev and r-recommended to the list?)

Event Timeline

r-base and r-cran-rmysql are installed by puppet via class statistics::compute on stat1003

ensure_packages([
    ..
    'r-base',
    'r-cran-rmysql',

and installing it on stat1002 should probably be a puppet change.

debt triaged this task as Medium priority. Oct 11 2016, 8:13 PM

I haven't fully grokked the description of this problem, but if the solution is what @Dzahn mentioned, I'm totally fine with that!

Hey @Gehel - can you do a bit of pair programming with @mpopov on this? We're thinking that he can do the actual change, you can verify he's done it well and then @Ottomata can plus 2 it.

Thoughts? :)

Change 315885 had a related patch set uploaded (by Bearloga):
Update R and C -related stats puppet configs

https://gerrit.wikimedia.org/r/315885

@Gehel: I added you as a reviewer to that puppet patch (https://gerrit.wikimedia.org/r/#/c/315885/).

How would we add https://launchpad.net/~marutter/+archive/ubuntu/c2d4u as a repo? I tried looking at operations/puppet/modules/apt/manifests/repository.pp to figure it out but couldn't. It's a super useful repo of pre-built R package binaries for Ubuntu (see solution 2 in task description), but it is an untrusted PPA, so maybe we're not allowed to use it?

@mpopov: as I understand it, we are not very keen on adding PPAs to our apt repo. But Debian packaging is not an area where I am comfortable at all.

I'm also really bad at anything that looks like C. I'll try looking around...

Thanks for trying to look at repository.pp! reprepro is confusing as hell!

@mpopov: Can you provide the command you used to install boom which failed on stat1002, but worked on stat1003? I'd like to debug this a little further. (Also to rule out that some file in your /home directory might be causing this)

We generally don't use external PPAs, since they for all practical purposes allow arbitrary code execution on our hosts by the people controlling the PPA (every package installation can e.g. run postinst scripts which run as root).

If the build works fine on stat1003, I doubt G++ 4.9 is needed. Updating G++ is also not trivial since it also provides libstdc++, so the upgrade affects other packages as well.

# On stat1002:
$> R
R> Sys.setenv("http_proxy" = "http://webproxy.eqiad.wmnet:8080"); Sys.setenv("https_proxy" = "http://webproxy.eqiad.wmnet:8080")
R> install.packages("Boom", repos = "https://cran.rstudio.com/")

If this is your first R package installation, R will tell you that /usr/lib/R (or whatever) isn't writeable, because it isn't being run with sudo, and will ask whether you want to create a personal library in your home dir. Say yes. It will then proceed to try to install the package.

If that succeeds, see if you can install.packages("bsts", repos = "https://cran.rstudio.com/")

G++ 4.9 is apparently not available for Ubuntu 14.04, so nevermind on that! :)

Dzahn claimed this task.

The gerrit patch has been amended to use g++-4.8; lgtm now. @Gehel @Ottomata want to merge?

The status change wasn't intended, sorry.

Dzahn removed Dzahn as the assignee of this task. Oct 19 2016, 2:07 AM

Change 315885 merged by Ottomata:
Update R and C -related stats puppet configs

https://gerrit.wikimedia.org/r/315885

Notice: /Stage[main]/Statistics::Compute/Package[libgsl0-dev]/ensure: ensure changed 'purged' to 'present'
Notice: /Stage[main]/Statistics::Compute/Package[gsl-bin]/ensure: ensure changed 'purged' to 'present'
Notice: /Stage[main]/Statistics::Compute/Package[r-base-dev]/ensure: ensure changed 'purged' to 'present'

That looks like the changes have been deployed? Hm... Still can't install Boom on stat1002. @chelsyx just confirmed that she can't install it either. @MoritzMuehlenhoff, have you had a chance to follow the installation instructions and give it a try?

stat1002 and 1003 are supposed to have nearly identical configuration, right? Or no? Also, if the stat* machines inherit from that compute.pp, then why doesn't stat1004 have R installed?

Yes, I cannot install Boom on stat1002, but can on stat1003. Please let me know if you want my log or anything else. :)

stat1002 and stat1003 are pretty close, as they are both 'compute' stat nodes (i.e. they both include the statistics::role). stat1004 is not. It was created for the purpose of having a less busy place from which to work with the Hadoop cluster. The storage capacity of stat1004 is much less than stat1002 and stat1003. While you can do 'compute' type stuff on stat1004, usually it is used just for Hadoop clients.

We can also install R on stat1004, if you need it.

Looking at stat1002. Hm. I get this error when I try to run your commands to install Boom above:

I/usr/share/R/include -I. -I../inst/include -IBmath -Imath/cephes -DNO_BOOST_THREADS -DNO_BOOST_FILESYSTEM -DADD_ -DRLANGUAGE  -I"/home/otto/R/x86_64-pc-linux-gnu-library/3.2/BH/include"      -c Models/Bart/Bart.cpp -o Models/Bart/Bart.o
/bin/bash: I/usr/share/R/include: No such file or directory
make: [Models/Bart/Bart.o] Error 127 (ignored)

At first glance, that looks like a problem with the Makefiles? I/usr/share/R/include looks like it has lost the - of its -I, so it is being interpreted as the path of a command to run, and that path isn't found.

On stat1003, the same commands get:

g++ -std=c++11 -I/usr/share/R/include -DNDEBUG -I. -I../inst/include -IBmath -Imath/cephes -DNO_BOOST_THREADS -DNO_BOOST_FILESYSTEM -DADD_ -DRLANGUAGE  -I"/home/otto/R/x86_64-pc-linux-gnu-library/3.2/BH/include"   -fpic  -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g -c Models/Bart/Bart.cpp -o Models/Bart/Bart.o

Seems to me the install process is leaving off the g++ part of the command! Hm. Pretty confused, still looking...
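
One way to narrow this down, offered as a guess that this thread does not confirm: on R 3.2 the leading "g++ -std=c++11" seen on stat1003 most likely comes from the CXX1X and CXX1XSTD variables in R's Makeconf. If those were empty on stat1002, the recipe line would begin with -I..., and make treats a leading - on a recipe line as "ignore errors for this command". That would match both the stripped I/usr/share/R/include and the "Error 127 (ignored)" above. The helper below (check_makeconf is a made-up name) prints those variables from a given Makeconf so the two hosts can be compared.

```shell
# Print the C++11 compiler settings found in an R Makeconf file.
# check_makeconf is a hypothetical helper name, not part of R.
check_makeconf() {
    makeconf="$1"
    for var in CXX1X CXX1XSTD CXX1XFLAGS CXX1XPICFLAGS; do
        # Keep only the text after "VAR =", first match only.
        value=$(sed -n "s/^${var}[[:space:]]*=[[:space:]]*//p" "$makeconf" | head -n 1)
        if [ -z "$value" ]; then
            echo "$var is empty or missing"
        else
            echo "$var = $value"
        fi
    done
}

# Example call (assumes R is on the PATH; "R RHOME" prints R's home directory):
# check_makeconf "$(R RHOME)/etc/Makeconf"
```

Running the example call on both stat1002 and stat1003 and comparing the output would show whether the defaults really differ between the two hosts.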

Oh, just re-read earlier parts of this ticket. Hm. There are .deb packages for Boom, etc.? We can import those .debs to our own apt repo, and then install.

I have no idea why the R install + compilation isn't working on stat1002. Do you need this to work now, or would having the .debs installed be all you need? I don't have a lot of time to debug the installation on stat1002, and installing the .debs should be easier.

If the short term solution is to add/install those .debs -- specifically, I am interested in bsts (https://launchpad.net/~marutter/+archive/ubuntu/c2d4u/+sourcepub/6821297/+listing-archive-extra), which depends on Boom, etc. -- then I'm a-OK with that because I'd really like to start using it in production as soon as possible :D

And then maybe later we can figure out what the underlying, more fundamental problem is for a more long-term, sustainable solution :)

Crap crackers. I just added bsts and boom to our apt, but alas, there are more dependencies (I should have checked before adding them). bh is one, and it exists in his ppa, but there isn't an amd64 build.

Hm. Sigh. Adding all of these dependencies is going to be a big pain. Grrrrrr.

No, sorry, I haven't had time yet :/ not sure when I'll get to this. Will ask in analytics standup tomorrow about prioritizing this with other things.

Milimetric set the point value for this task to 13. Oct 27 2016, 4:00 PM
Milimetric moved this task from In Progress to Paused on the Analytics-Kanban board.

Turns out that sorting this out is going to take a bit more than we thought. We will be able to devote more uninterrupted time to it in a week, when some of our ongoing tasks are more or less done.

YESSHHHHH I think I did it. @mpopov try:

CXX=g++-4.8
CXX1X=g++-4.8
CXX1XFLAGS=-std=c++11 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g
CXX1XPICFLAGS=-fPIC
SHLIB_CXX1XLD=g++-4.8
SHLIB_CXX1XLDFLAGS=-std=c++11 -shared
LDFLAGS=-L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro

as your ~/.R/Makevars.

No idea why stat1002 would not have these defaults picked, but setting them manually worked for me. (Probably some of those are unnecessary, but that combo did it for me, and it takes too long to also troubleshoot which ones are extraneous :) )
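
For anyone applying the same workaround, here is a small sketch of putting those lines in place from the shell. The function name and the .bak backup are illustrative choices, not something from this thread; the settings themselves are copied verbatim from the comment above.

```shell
# Write the C++11 overrides from this thread to <dir>/.R/Makevars,
# keeping a backup of any Makevars that is already there.
# write_makevars and the .bak suffix are hypothetical, for illustration only.
write_makevars() {
    target_dir="${1:-$HOME}/.R"
    target="$target_dir/Makevars"
    mkdir -p "$target_dir"
    if [ -f "$target" ]; then
        cp "$target" "$target.bak"
    fi
    # Quoted heredoc delimiter: the contents are written literally, with no expansion.
    cat > "$target" <<'EOF'
CXX=g++-4.8
CXX1X=g++-4.8
CXX1XFLAGS=-std=c++11 -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g
CXX1XPICFLAGS=-fPIC
SHLIB_CXX1XLD=g++-4.8
SHLIB_CXX1XLDFLAGS=-std=c++11 -shared
LDFLAGS=-L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro
EOF
}
```

Calling write_makevars with no argument targets $HOME/.R/Makevars; after that, retry install.packages("Boom", repos = "https://cran.rstudio.com/") in a fresh R session.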

mpopov moved this task from In Code Review to Done on the Analytics-Kanban board.

@Ottomata You are awesome! Thank you! Those settings worked for me! Installed and tested BSTS out and everything worked as it should. Thank you so much! Can't wait to start using BSTS on stat1002 now :D