
PAWS kills active users servers that are not connected to a user session
Open, Normal | Public | Feature

Description

RileyBot had Terminal 1 and Terminal 2 each actively running one task, throttled by maxlag. At that rate the bot would have completed its tasks in about 9 hours (19,000 edits).

RyanBot had Terminal 1 and Terminal 2 each actively running one task, throttled by maxlag. At that rate the bot would have completed its tasks in about three weeks (145,000 edits).

Throughout yesterday and today, both users have had their servers shut off several times, which stopped their running tasks. This usually happens after a task has been running for a few hours.

There should have been no period of inactivity other than the scripts sleeping between edits due to maxlag.
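For context, the sleeping between edits can be pictured with a minimal sketch like the one below. It is not the code either bot runs. `run_edits` and `attempt_edit` are hypothetical names, and the sketch assumes `attempt_edit` returns `None` on success or a number of seconds to wait when the wiki refuses the edit because of lag. The point is that the script stays alive and busy the whole time, even though nothing in a browser is talking to the server.

```python
import time
from typing import Callable, Iterable, Optional


def run_edits(
    edits: Iterable[str],
    attempt_edit: Callable[[str], Optional[float]],
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Apply each edit in turn, sleeping and retrying while the wiki reports lag.

    attempt_edit is a hypothetical callable: it returns None when the edit
    succeeded, or the number of seconds to wait before retrying.
    Returns the total number of seconds spent sleeping.
    """
    total_slept = 0.0
    for edit in edits:
        while True:
            wait = attempt_edit(edit)
            if wait is None:
                break  # edit went through, move on to the next one
            sleep(wait)  # the script is idle here, but the task is still running
            total_slept += wait
    return total_slept
```

Passing `sleep` as a parameter is only there so the loop can be exercised without actually waiting.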

Event Timeline

Chicocvenancio renamed this task from Actively running servers shut down unexpectedly to PAWS kills active users servers that are not connected to a user session. Mar 2 2018, 12:36 AM
Chicocvenancio triaged this task as Normal priority.
Chicocvenancio moved this task from Backlog to MVP (Most Valuable PAWS) on the PAWS board.
Chicocvenancio added a subscriber: Chicocvenancio.

This is a known behaviour/bug in JupyterHub. Looking at one of the issues debating this, it seems the activity-tracking backbone has already been built, so it might not take a great amount of work to develop a script that uses it to define activity in a different (better) way than cull_idle_servers.py does at the moment.

For reference:
PAWS uses a culler that kills user servers that have not been connected to a browser for more than one hour.

With the new culling behavior in the upcoming 0.9, it will be possible to configure culling of user servers that are disconnected from a network perspective (current behavior) or from a "server busy" perspective. Feedback on possible sane values is welcome.
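To make the two perspectives concrete, here is a minimal sketch of the decision a culler could make under each one. It is an illustration only, not the code PAWS or JupyterHub runs: `ServerState`, `should_cull`, the field names, and the `mode` values are all hypothetical, and the one-hour timeout is simply the value mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ServerState:
    # Hypothetical record; these fields are illustrative, not JupyterHub's API.
    last_network_activity: datetime  # last time a browser talked to the server
    last_busy: Optional[datetime]    # last time a kernel or terminal was seen working


def should_cull(
    state: ServerState,
    now: datetime,
    timeout: timedelta = timedelta(hours=1),
    mode: str = "network",
) -> bool:
    """Return True if the server counts as idle under the chosen perspective."""
    if mode == "network":
        # Current behavior: only browser connections count as activity.
        last_seen = state.last_network_activity
    elif mode == "busy":
        # Assumed alternative: running work also counts as activity.
        candidates = [state.last_network_activity]
        if state.last_busy is not None:
            candidates.append(state.last_busy)
        last_seen = max(candidates)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return now - last_seen > timeout


# A bot left running in a terminal with no browser attached for three hours.
now = datetime(2018, 3, 2, 12, 0)
bot = ServerState(
    last_network_activity=now - timedelta(hours=3),
    last_busy=now - timedelta(seconds=30),
)
print(should_cull(bot, now, mode="network"))  # True
print(should_cull(bot, now, mode="busy"))     # False
```

Under the "network" perspective the bot's server is culled even though its tasks are running, which matches the behavior reported in this task. Under the "busy" perspective it survives as long as the terminals keep doing work within the timeout.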

Chicocvenancio changed the subtype of this task from "Task" to "Feature Request". Mar 2 2019, 2:00 PM
Chicocvenancio edited projects, added PAWS; removed PAWS (JupyterHub 0.9).

When can we use it?

With the new culling behavior in the upcoming 0.9, it will be possible to configure culling of user servers that are disconnected from a network perspective (current behavior) or from a "server busy" perspective. Feedback on possible sane values is welcome.

Toolforge is an alternative for long tasks.