Update R on the statboxes
Closed, Resolved · Public

Description

The current version of R on our number crunchers (stat100*) is

R version 3.3.3 (2017-03-06) -- "Another Canoe"

The current CRAN release is R version 3.5.2 ("Eggshell Igloo"), released on 2018-12-20.

Could we please have R upgraded on our statboxes?

Rationale: it's not that I insist on having the most recent version available; I have already started giving up on some packages (examples: parallelDist, poweRlaw) because they are no longer available for older versions of R (say, R < 3.4.0).
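To see whether a given box meets a package's minimum R version, a check along these lines could be used. This is only a sketch: the function name `r_version_at_least` is made up for illustration, the 3.4.0 threshold is the example figure from the rationale above rather than a verified requirement of either package, and it assumes GNU `sort` with `-V` (version sort), as shipped on Debian.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not an existing tool): succeeds if the R version found
# in a banner line is at least the given minimum version.
# Usage: r_version_at_least "<banner line>" "<minimum version>"
r_version_at_least() {
  local banner="$1" minimum="$2" current
  # Pull the first x.y.z version number out of the banner text.
  current="$(printf '%s\n' "$banner" | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1)"
  # No version number found: report a distinct error status.
  [ -n "$current" ] || return 2
  # sort -V orders version strings; the check passes when the minimum
  # sorts first, which also covers the case where both are equal.
  [ "$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)" = "$minimum" ]
}

# Example with the banner quoted in this task (stat100* at the time of filing):
if r_version_at_least 'R version 3.3.3 (2017-03-06) -- "Another Canoe"' 3.4.0; then
  echo "R is new enough"
else
  echo "R is too old"
fi
```

On a machine with R installed, the banner can be taken from the first line of `R --version`, for example `r_version_at_least "$(R --version | head -n 1)" 3.4.0`.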

Please don't forget that we also need a new version of R installed across the workers in the WMF Analytics Cluster for people who would like to use SparkR or sparklyr.

Event Timeline

fdans moved this task from Incoming to Operational Excellence on the Analytics board.

Sadly, R updates follow the Debian OS upgrade process, since packaging a more up-to-date stack ourselves is a big burden for us. That said, we are in the process of moving to Debian Buster :)

stat1005 is the first of the stat boxes with the new OS:

elukey@stat1005:~$ R

R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"

The Hadoop workers will follow over the next few months (more info in T231067).

Ottomata claimed this task.

FYI, anaconda + conda envs are installed on all stat boxes. These should provide a more up-to-date R, and also let you install whatever R conda packages you like. :)

See: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Anaconda and https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter#Newpyter