Can't login to catgraph instance
Closed, Resolved; Public

Description

When trying to ssh to sylvester I get the following:

$ ssh -A jkroll@sylvester.eqiad.wmflabs 

If you are having access problems, please see:https://labsconsole.wikimedia.org/wiki/Access#Accessing_public_and_private_instances
Permission denied (publickey).

The catgraph service which should be running there is not reachable either (connection refused).

Special:NovaInstance shows the instance as ACTIVE. Tried rebooting it via the web interface, no change.

I can ssh to other instances in the same project normally, and the services are running there as well.

Could this be an NFS problem?

Event Timeline

jkroll set the priority of this task to Needs Triage.
jkroll updated the task description.
jkroll added projects: Catgraph, Cloud-VPS.
jkroll subscribed.
Restricted Application added a subscriber: Aklapper.
Andrew claimed this task.
Andrew subscribed.

Logins should be fixed now. Part of this issue was caused by a broken puppet run -- it's very important that you keep puppet running properly on your instances.

Thanks for the quick response, but I still can't ssh there - same output as before :/

I've removed role::labsnfs::client, webserver::apache, and catgraph_hostmap from the puppet config of that instance in order to allow puppet to run properly.

You'll need to restore those classes and rearrange your git branches on the puppet master to get puppet running.

Puppet clearly hadn't run on this instance for many months -- it's a miracle that it's been working this long. Remember, if puppet runs are failing on an instance, that instance may fall off the internet at any moment.

Additionally, I sent an email several weeks ago that specifically described your use case, with instructions on how to avoid this specific problem. Please subscribe to labs-l and read the announcements there in order to avoid future issues such as this one.
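
For anyone who wants to catch a stalled agent before it reaches this state, here is a minimal sketch of a freshness check. It is not the procedure used on this instance. The helper names `file_age_days` and `puppet_run_is_fresh` are hypothetical, the one-day threshold is arbitrary, and the script assumes GNU `stat` and `date` and that `puppet config print lastrunfile` reports the location of the agent's last-run summary on your puppet version.

```shell
#!/usr/bin/env bash
# Sketch: warn when the puppet agent's last-run summary file looks old.
# Helper names and the 1-day threshold are illustrative assumptions.

# Print the age of a file in whole days (GNU stat/date assumed).
file_age_days() {
    local mtime now
    mtime=$(stat -c %Y "$1") || return 1
    now=$(date +%s)
    echo $(( (now - mtime) / 86400 ))
}

# Return 0 if the file was modified within max_days, 1 if it is older,
# 2 if the file is missing or cannot be inspected.
puppet_run_is_fresh() {
    local summary="$1" max_days="${2:-1}" age
    [ -e "$summary" ] || return 2
    age=$(file_age_days "$summary") || return 2
    [ "$age" -le "$max_days" ]
}

# Only query puppet if the command is installed on this host.
if command -v puppet >/dev/null 2>&1; then
    # Assumption: the lastrunfile setting points at the run summary.
    summary=$(puppet config print lastrunfile 2>/dev/null) || summary=""
    if puppet_run_is_fresh "$summary" 1; then
        echo "puppet summary was updated within the last day"
    else
        echo "puppet run looks stale or missing: ${summary:-unknown path}" >&2
    fi
fi
```

A recent timestamp only shows that the agent wrote a summary, not that the run succeeded. Running `puppet agent --test` by hand on the instance prints the actual errors.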