Yet another webgrid instance half-dead. Can't SSH in; many webservices are in the SGE deleting state, some in T state.
Description
| Status | Subtype | Assigned | Task |
|---|---|---|---|
| Resolved | | None | T124133 NFS overload is causing instances to freeze |
| Resolved | | scfc | T124162 tools-webgrid-lighttpd-1209 frozen |
Event Timeline
Comment Actions
Contrary to the other instances, the console log on wikitech shows no indications of a problem:
[…]
cloud-init boot finished at Wed, 30 Dec 2015 03:17:15 +0000. Up 11.03 seconds
Ubuntu 12.04.5 LTS tools-webgrid-lighttpd-1209 ttyS0
tools-webgrid-lighttpd-1209 login:
I'll reboot the instance.
Comment Actions
@scfc did you ever end up rebooting this? It was frozen when I saw it this morning (erroneous time paste removed :)). I ended up rebooting it after some general info grabbing, but top said "top - 13:36:26 up 21 days, 10:19" at that time. I'm going to surface a bit more of what I saw in the main ticket.
Comment Actions
@chasemp: Yes, I rebooted it via Special:NovaInstance a short time after 14:38Z (and before 15:43Z), and the instance became responsive afterwards.