
can't log in to phab-01.eqiad.wmflabs
Closed, ResolvedPublic

Description

phab-01 and 02 are currently inaccessible via ssh. @yuvipanda did some troubleshooting yesterday and was unable to get either one to allow ssh logins. Rebooting the instances hasn't helped and I have no idea how to troubleshoot it any further.

I'm temporarily using phab-scap.eqiad.wmflabs; however, a lot of configuration and test data has accumulated on phab-01, and it'd be a shame to lose it.

As a last resort, assuming it's not a simple fix to get ssh working, is there any way to mount the disk image from phab-01 into another running vm so that I can extract the data?

Event Timeline

mmodell raised the priority of this task from to Needs Triage.
mmodell updated the task description.
mmodell added subscribers: mmodell, yuvipanda.
Restricted Application added subscribers: StudiesWorld, Luke081515, scfc, Aklapper.
Luke081515 moved this task from To Triage to Misc on the Phabricator board.

I logged into phab-02 yesterday (actually, I'm still logged in via tmux). /tmp is a 1 MiB partition that was full when I logged in, with temp files from a mkinitramfs generation (probably triggered by a kernel upgrade), which I cleared out. Now /tmp is full of mkinitramfs files again.
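For anyone who wants to keep an eye on this, here is a rough sketch of how the fill level of a partition such as /tmp could be checked from Python. The helper names `used_fraction` and `is_nearly_full` and the 95% threshold are my own assumptions for illustration, not something that exists on the instance.

```python
import shutil
import tempfile


def used_fraction(path):
    """Return the used share (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def is_nearly_full(fraction, threshold=0.95):
    """Hypothetical cutoff: treat anything at or above threshold as full."""
    return fraction >= threshold


if __name__ == "__main__":
    # gettempdir() is usually /tmp on Linux; pass "/tmp" directly if preferred.
    tmp_dir = tempfile.gettempdir()
    frac = used_fraction(tmp_dir)
    state = "nearly full" if is_nearly_full(frac) else "ok"
    print(f"{tmp_dir}: {frac:.0%} used ({state})")
```

On a 1 MiB /tmp, a single mkinitramfs run would likely push this to 100% right away, so this only helps confirm the symptom, not fix the cause.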

And from auth.log: User twentyafterfour from bastion-01.bastion.eqiad.wmflabs not allowed because not listed in AllowUsers.
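In case it helps to see which accounts are being rejected this way, below is a minimal sketch that pulls the user and source host out of lines like the one above. It only assumes the message text quoted from auth.log; the function name `denied_logins` is made up for this example, and reading the real log file (which needs root) is left out.

```python
import re

# Matches the sshd message quoted above; any syslog prefix before it is ignored.
DENIED = re.compile(
    r"User (?P<user>\S+) from (?P<host>\S+) "
    r"not allowed because not listed in AllowUsers"
)


def denied_logins(lines):
    """Return (user, host) pairs for every AllowUsers rejection in lines."""
    hits = []
    for line in lines:
        match = DENIED.search(line)
        if match:
            hits.append((match.group("user"), match.group("host")))
    return hits


sample = [
    "User twentyafterfour from bastion-01.bastion.eqiad.wmflabs "
    "not allowed because not listed in AllowUsers.",
]
print(denied_logins(sample))
```

If the same user shows up repeatedly, that points at the AllowUsers list in the sshd configuration being incomplete rather than at a key or network problem.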

I got this mail:

Puppet is failing to run on the "phab-03" instance in the Wikimedia Labs
project "phabricator"

when running puppet there:

phab-03 is a Puppet client of deploy.phabricator.eqiad.wmflabs (puppetclient)
The last Puppet run was at Fri Feb 5 21:32:35 UTC 2016 (3245 minutes ago).
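As a sanity check on that figure, 3245 minutes is roughly 54 hours. A small sketch of how the age of the last run could be recomputed from the timestamp in the mail follows; the format string and the `minutes_since` helper are my assumptions based on the text above, not part of any Puppet tooling.

```python
from datetime import datetime, timedelta

# Format of the timestamp as it appears in the mail (assumed from the quote).
STAMP_FORMAT = "%a %b %d %H:%M:%S UTC %Y"


def minutes_since(last_run, now):
    """Whole minutes elapsed between last_run and now."""
    return int((now - last_run).total_seconds() // 60)


last_run = datetime.strptime("Fri Feb 5 21:32:35 UTC 2016", STAMP_FORMAT)

# Hypothetical "now": the moment the mail was generated, 3245 minutes later.
mail_time = last_run + timedelta(minutes=3245)

print(mail_time.isoformat(), minutes_since(last_run, mail_time))
```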

@yuvipanda how were you able to debug, and what did you see? I also can't log in as root on that instance.

Over two days I've gotten four emails: two about phab-03 and two about harbormaster1.

I'm trying to make sense of the Grafana "disk space available" graphs. I don't know specifically how the root partition is fluctuating, but it would be helpful to have all the other partitions available too. (Maybe we could get a Grafana admin to add the * to view all drives.)

mmodell claimed this task.

@mmodell What was the issue? Were you able to get into phab-02 as well?