
R execution on stat1005 -> 'stack smashing error'
Closed, ResolvedPublic

Description

Wikistats uses R to generate charts.
After the migration to stat1005 it runs into an error I never saw on stat1002:

*** stack smashing detected ***: /usr/lib/R/bin/exec/R terminated
*** stack smashing detected ***: /usr/lib/R/bin/exec/R terminated
*** stack smashing detected ***: /usr/lib/R/bin/exec/R terminated
Backtrace:

/lib/x86_64-linux-gnu/libc.so.6(+0x70bcb)[0x7fb2efaedbcb]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7fb2efb76227]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x0)[0x7fb2efb761f0]
/lib/x86_64-linux-gnu/libz.so.1(inflate+0x127)[0x7fb2ee4b0007]
/usr/lib/R/lib/libR.so(+0xb3d46)[0x7fb2f0319d46]
/usr/lib/R/lib/libR.so(+0xb4646)[0x7fb2f031a646]
/usr/lib/R/lib/libR.so(+0x1850a3)[0x7fb2f03eb0a3]
/lib/x86_64-linux-gnu/libc.so.6(+0x11038)[0x7fb2efa8e038]

Memory map:
*** stack smashing detected ***: /usr/lib/R/bin/exec/R terminated

Event Timeline

@Erik_Zachte
This is probably because stat1005 is running the new Debian Stretch.

Do you know if this is still happening?

fdans moved this task from Wikistats to Incoming on the Analytics board.
fdans moved this task from Incoming to Backlog (Later) on the Analytics board.
Erik_Zachte raised the priority of this task from Low to Medium. Nov 19 2017, 1:34 PM
Erik_Zachte added a subscriber: fdans.

@fdans Yes it's still happening.

So the summary charts (per wiki and per project, e.g. https://stats.wikimedia.org/EN/ProjectTrendsTotalEdits.html, and thousands more) can't be updated.

@ezachte, this looks to be some deeper R + stretch upgrade bug. I think it will be very difficult to solve.

Q: to unblock your charts, what do you need to generate the charts? We just installed R on stat1004, which is still Jessie. Perhaps your stuff will run there?

Hmm, I could migrate all of Wikistats to stat1004 (I'd prefer to keep everything on one machine, and the charts are part of the overall Wikistats job).
Is the stat1004 machine equivalent to stat1005?
Does 'still Jessie' imply that stat1004 will be upgraded at some point and the same issue will reoccur?

> Hmm, I could migrate all of Wikistats to stat1004
> Is stat1004 machine equivalent to stat1005?

No, it's not. Hm. stat1004 exists more as a place to connect to Hadoop and other things, not for large local computation jobs. stat1005 and stat1006 are for that.

> Does 'still Jessie' imply that stat1004 will be upgraded at some point and the same issue will reoccur?

Probably, yes.

@mpopov, you use R on stat1005, yes? Have you ever had this problem?

@ezachte, this will be hard for us to reproduce and figure out. Can you try and make an isolated R script that reproduces this error? Then we can use that to troubleshoot.

> @mpopov, you use R on stat1005, yes? Have you ever had this problem?

Nope, I've seen some weird errors in general but haven't seen this one. Also I haven't had any issues with R on stat1005 so far.

@Erik_Zachte can you please also post the output of sessionInfo()? Please run this right before the line(s) that crash R.

Might seem like a silly question and I hope this doesn't come off as offensive, but did you re-install all the R packages you were using before? Because even though your R package library was copied over from stat1002 and R will technically see those packages as available, they all need to be re-installed because it's a new machine. Copying libraries only works when it's the same OS & configuration. The packages that include C/C++ code especially require re-compilation.

This script might help: https://gist.github.com/bearloga/7c9078b493e7afb0ca46d5d16bd1aba4

# WMF only:
if (file.exists("/etc/wikimedia-cluster")) {
  message('Detected that this script is being run on a WMF machine ("', Sys.info()["nodename"], '"). Setting proxies...')
  Sys.setenv("http_proxy" = "http://webproxy.eqiad.wmnet:8080")
  Sys.setenv("https_proxy" = "http://webproxy.eqiad.wmnet:8080")
}

# General use:
message("Checking for a personal library...")
if (!dir.exists(Sys.getenv("R_LIBS_USER"))) {
  warning("Personal library not found, creating one...")
  dir.create(Sys.getenv("R_LIBS_USER"), recursive = TRUE)
  message("Registering newly created personal library...")
  .libPaths(Sys.getenv("R_LIBS_USER"))
} else {
  message("Personal library found.")
}

message("Installing devtools for re-installing from GitHub, etc....")
install.packages("devtools", repos = "https://cran.rstudio.com/")
# ^ also ensures that we have a personal library for this version of R

if (! "devtools" %in% installed.packages()[, "Package"]) {
  stop("devtools is/was not installed!")
}

# First, we locate the various personal R libraries the user has:
if (Sys.info()["sysname"] == "Linux") {
  pkg_paths <- dir(paste0("~/R/", R.version$platform, "-library"), full.names = TRUE)
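} else if (Sys.info()["sysname"] == "Darwin") {
  pkg_paths <- file.path(dir("~/Library/R", full.names = TRUE), "library")
} else {
  # Other platforms (e.g. Windows) are not handled by this script:
  stop("Unsupported operating system: ", Sys.info()["sysname"])
}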

# Pick the latest one that is not the current R version's library:
pkg_path <- tail(setdiff(pkg_paths, .libPaths()), 1)
if (length(pkg_path) == 0) {
  stop("Nothing to do.")
} else {
  message("Found library from previous R installation: ", pkg_path)
}

# List of installed packages in the user's personal library:
installed_pkgs <- dir(pkg_path)
if (length(installed_pkgs) == 0) {
  stop("Did not find any packages from previous R installation.")
}
message("Found ", length(installed_pkgs), " packages from previous R installation.")

# A helper function for extracting a parameter
# from an installed package's DESCRIPTION file
extract_param <- function(DESCRIPTION, param) {
  return(sub(paste0(param, ": "), "", DESCRIPTION[grepl(param, DESCRIPTION)], fixed = TRUE))
}

message("Extracting metadata about packages from previous R installation...")
pkgs_info <- do.call(rbind, lapply(installed_pkgs, function(installed_pkg) {
  # message('Checking how "', installed_pkg, '" was installed...')
  pkg_description <- readLines(file.path(find.package(installed_pkg, lib.loc = pkg_path), "DESCRIPTION"))
  if (any(grepl("Repository: CRAN", pkg_description))) {
    # message('"', installed_pkg, '" was installed from CRAN.')
    pkg_source <- "cran"; pkg_url <- NA
  } else if (any(grepl("RemoteType", pkg_description))) {
    # message('"', installed_pkg, '" was installed from a remote source like GitHub.')
    pkg_source <- extract_param(pkg_description, "RemoteType")
    if (pkg_source == "github") {
      pkg_url <- paste(extract_param(pkg_description, "GithubUsername"), extract_param(pkg_description, "GithubRepo"), sep = "/")
    } else if (pkg_source == "git") {
      pkg_url <- extract_param(pkg_description, "RemoteUrl")
    } else {
      pkg_source <- "other remote"; pkg_url <- NA
    }
  } else {
    # message('"', installed_pkg, '" was installed from a local source.')
    pkg_source <- "local"; pkg_url <- NA
  }
  return(data.frame(
    package = installed_pkg,
    source = pkg_source,
    url = pkg_url,
    stringsAsFactors = FALSE
  ))
}))

# Re-install packages:
if (sum(pkgs_info$source == "cran") > 0) {
  message("Re-installing ", sum(pkgs_info$source == "cran"), " R packages from CRAN...")
  install.packages(pkgs_info$package[pkgs_info$source == "cran"], repos = "https://cran.rstudio.com/")
}
if (sum(pkgs_info$source == "github") > 0) {
  message("The following ", sum(pkgs_info$source == "github"), " packages will be re-installed from GitHub: ", paste(pkgs_info$package[pkgs_info$source == "github"], collapse = ", "))
  devtools::install_github(pkgs_info$url[pkgs_info$source == "github"])
}
if (sum(pkgs_info$source == "git") > 0) {
  message("The following ", sum(pkgs_info$source == "git"), " packages will be re-installed from Git repos: ", paste(pkgs_info$package[pkgs_info$source == "git"], collapse = ", "))
  devtools::install_git(pkgs_info$url[pkgs_info$source == "git"])
}
if (sum(pkgs_info$source %in% c("other remote", "local")) > 0) {
  message("The following ", sum(pkgs_info$source %in% c("other remote", "local")), " packages will need to be manually re-installed (sorry): ", paste(pkgs_info$package[pkgs_info$source %in% c("other remote", "local")], collapse = ", "))
}

@mpopov Thanks. I rely entirely on Andrew for this. I don't have root access, which is fine with me, so I can't mess things up ;-) And server migrations are rare anyway.

I'll make a separate bash job and we'll take it from there (later this week).

I see recent R charts again! It was an elusive bug, hard to replicate.

When I ran it from the command line all went well.
The same was true from a bash file with the same command repeated 100 times.
It also worked with the command executed in a loop from a perl one-liner.

But every time I ran the full perl job with hundreds of R invocations, different invocations (for different wikis) caused the stack errors.
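
Because the failures only showed up across many invocations, a small loop that records which runs fail can help narrow things down. The following is a minimal sketch, assuming a POSIX shell; `repeat_and_report` is a hypothetical helper and not part of Wikistats. A process aborted by the stack protector would typically show up with exit status 134 (128 + SIGABRT).

```shell
# Hypothetical helper: run a command N times, report every failing run
# together with its exit status, then print a one-line summary.
repeat_and_report() {
  runs=$1
  shift
  failures=0
  i=1
  while [ "$i" -le "$runs" ]; do
    status=0
    # Discard the command's own output; only the exit status matters here.
    "$@" >/dev/null 2>&1 || status=$?
    if [ "$status" -ne 0 ]; then
      echo "run $i exited with status $status"
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures of $runs runs failed"
}

# Harmless demonstration with a command that always succeeds:
repeat_and_report 3 true
```

In practice the wrapped command would be the R invocation used by the job; its exact command line is not given in this task, so it is not reproduced here.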

I tried doubling the stack size via ulimit -s, and doubling it again, with no effect.
But when I doubled the overall virtual memory limit with ulimit -v to 2000000, that helped.
Yet ulimit -v 1000000 worked well on stat1002.
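
The workaround can be sketched in shell as follows. This is a minimal sketch and not the actual Wikistats job: `run_with_vmem_limit` is a hypothetical helper name. The limit is applied inside a subshell, so the calling shell keeps its own limits, and the value is in KiB, as `ulimit -v` expects.

```shell
# Hypothetical helper: run a command under a given virtual-memory limit.
# Usage: run_with_vmem_limit LIMIT_KIB COMMAND [ARGS...]
run_with_vmem_limit() {
  limit_kb=$1
  shift
  (
    # Applies only to this subshell and its children.
    # Exit with 125 if the limit cannot be set (e.g. above the hard limit).
    ulimit -v "$limit_kb" || exit 125
    "$@"
  )
}

# Demonstration: print the limit as seen by a child process.
run_with_vmem_limit 2000000 sh -c 'ulimit -v'
```

In the real job the wrapped command would be the perl script that invokes R; that script's name is not given in this task, so it is left out here.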

Marking this as resolved for now, fingers crossed. Thanks @mpopov @Ottomata