Discovery Dashboards down due to Vagrant not booting up again
Closed, ResolvedPublic

Description

After the recurring issue of shiny-server escaping the container out into the VM, I tried restarting Vagrant, and then restarting the VM followed by restarting Vagrant, both without success. Vagrant just hangs on "waiting for machine to boot..." until it times out. In the meantime, we can use http://discovery-beta.wmflabs.org/ as a backup. I reached out to Cloud-Services on IRC and @bd808 started looking into this. The repair process is being logged at https://wikitech.wikimedia.org/wiki/Nova_Resource:Shiny-r/SAL

P.S. from IRC: bd808: the error it gave last time was the same as this -- https://github.com/lxc/lxc/issues/345

Event Timeline

Restricted Application added a subscriber: Aklapper.

The Vagrant-managed LXC container is starting now. The instance had a half-installed Linux kernel package, which kept the running kernel from matching the LXC and cgroups packages that had been installed. I found this while running sudo puppet agent --test --verbose manually after a couple of attempts at starting the LXC container by hand.

Here are some of the fun commands for debugging:

$ sudo lxc-ls --fancy
NAME                                    STATE    IPV4        IPV6  AUTOSTART
----------------------------------------------------------------------------
dashboards_default_1440608957973_11431  RUNNING  10.0.3.166  -     NO
$ sudo lxc-start -n dashboards_default_1440608957973_11431 -L /tmp/dashboards.console -o /tmp/dashboards.log -l INFO -d
$ sudo tail -f /tmp/dashboards.*
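
Since the root cause was a half-installed kernel package, a quick check for packages stuck in an incomplete state may save time if this recurs. This is only a rough sketch, assuming a Debian/Ubuntu instance where dpkg is available; the function name find_incomplete_packages is hypothetical and is not something that was run during this repair.

```shell
# Hypothetical helper (not part of the original repair): reads `dpkg -l`
# output on stdin and prints the names of packages whose status column shows
# an incomplete install. The second status character is one of:
#   U = unpacked, F = half-configured, H = half-installed,
#   W = triggers-awaited, t = triggers-pending.
find_incomplete_packages() {
    awk '$1 ~ /^[uihrp][UFHWt]/ { print $2 }'
}

# Assumed usage on the instance:
#   dpkg -l | find_incomplete_packages   # list packages in a broken state
#   uname -r                             # compare with the running kernel
```

If the list includes a linux-image package, the running kernel reported by uname -r is worth comparing against it before trying to start the container again.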