
petscan5 unresponsive
Closed, Resolved · Public · BUG REPORT

Description

The PetScan machine petscan5 was unresponsive, so I tried a soft reboot, but it seems to be stuck. I then tried to stop it, but that didn't work either. Can someone reboot it, please? Thanks!

Event Timeline

Magnus triaged this task as Unbreak Now! priority. Jan 23 2025, 6:54 PM

The Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it on this task with a more specific project tag. Thanks!

taavi lowered the priority of this task from Unbreak Now! to Needs Triage. Jan 23 2025, 7:05 PM

UPDATE: The instance has restarted, but apparently it no longer has the key pair associated. I tried to ssh in from login.toolforge.org and bastion.wmflabs.org, without success.

Mentioned in SAL (#wikimedia-cloud) [2025-01-24T15:37:25Z] <dhinus> openstack server migrate {petscan5_id} T384642

fnegri claimed this task.

I still cannot access the instance via ssh.

@Magnus uh weird, it was working for my user so I assumed that fixed it for everyone. Looking.

My key pair was gone so I just made a new PetScan one, if that helps?

The keys that I would expect to work are the ones you can find listed at https://ldap.toolforge.org/user/magnus

Yes I meant the OpenStack keys

Never mind, I got it to work; for some reason the bastion no longer had my keys.

PetScan's "This web service cannot be reached" error has continued to come and go (T384464) and is occurring again right now, even though that other ticket has been closed. Could it be that the root cause of this issue remains unresolved, @Magnus? Thank you for your help.

Restarted the VM, seems to work now

Hi, this is still an issue, the tool as of today is not reachable :/

Pruem subscribed.

This is unfortunately not resolved.

taavi subscribed.

As the proxy error message states, you need to report this directly to the maintainers of PetScan, not on this task, which is on the Cloud-VPS infrastructure board. My understanding is that PetScan issues are tracked at https://github.com/magnusmanske/petscan_rs/issues.

This does not appear to be a software issue in PetScan itself; the frequent need to reboot the VM may instead be related to T385288.

I have limited PetScan's RAM via systemctl, and the same setup should also restart PetScan after a VM reboot. Please let me know if that doesn't take care of the problem.