
Spark sessions can provision kerberos tickets in a more predictable manner
Closed, Resolved · Public

Description

Spark sessions can provision kerberos tickets in a more predictable manner.

Right now, Kerberos tickets expire after 24 hours, which leaves Spark sessions hanging after that point, unable to move forward. We should probably extend this timeframe and document the period after which a Kerberos ticket expires.
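For reference, a ticket's lifetime can be inspected and renewed with the standard MIT Kerberos client tools. A sketch (exact output and the maximum renewable lifetime depend on the KDC configuration, and the block is guarded so it degrades gracefully on hosts without a Kerberos client):

```shell
# Inspect the current Kerberos ticket cache, if a client is installed.
# klist prints "Valid starting" / "Expires" columns for each ticket.
if command -v klist >/dev/null 2>&1; then
    klist 2>/dev/null || echo "no valid ticket in the cache; run kinit"
else
    echo "no Kerberos client installed on this host"
fi

# Renew a renewable ticket before it expires (subject to the KDC's
# configured maximum renewable lifetime):
#   kinit -R
#
# Obtain a fresh ticket interactively:
#   kinit
```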

At the same time, there are plenty of Jupyter notebooks that are left idle, unused, and still consuming resources.

We need to:

  • support longer-running jobs in Jupyter (3–4 days)
  • kill Jupyter notebooks that are not being used

Event Timeline

Nuria created this task. · Feb 25 2020, 5:37 PM
Restricted Application added a subscriber: Aklapper. · Feb 25 2020, 5:37 PM
Nuria assigned this task to Ottomata. · Feb 25 2020, 5:39 PM

Let's also clearly document how to restart the kernel.

Milimetric reassigned this task from Ottomata to elukey. · Mar 2 2020, 4:55 PM
Milimetric triaged this task as High priority.
Milimetric moved this task from Incoming to Operational Excellence on the Analytics board.
Milimetric added a project: Analytics-Kanban.

Right now, Kerberos tickets expire after 24 hours, which leaves Spark sessions hanging after that point, unable to move forward. We should probably extend this timeframe and document the period after which a Kerberos ticket expires.

Is this done now that Kerberos tickets expire after 48 hours?

kill Jupyter notebooks that are not being used

Probably best to file a separate ticket about your dark, murderous plans 😉

@Nuria I am wondering what is best for this task. The tickets last 48h now, but we haven't thought about any hard rule for the maximum lifetime of a notebook. Should we keep going in here, or file another task in the future if needed? Might be something to add to Newpyter; @Ottomata, thoughts?

In Newpyter we will likely try some idle-notebook culler, so if that is what you are thinking of, we can do it as part of that task.
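For the record, idle-notebook culling of this kind is what the `jupyterhub-idle-culler` package provides. A minimal `jupyterhub_config.py` fragment, assuming that package is installed alongside JupyterHub (the 3-day timeout here is illustrative, matching the task description, not a decided value):

```python
# jupyterhub_config.py fragment -- a sketch assuming the
# jupyterhub-idle-culler package is installed alongside JupyterHub.
import sys

# Run the culler as a managed JupyterHub service.
c.JupyterHub.services = [
    {
        "name": "jupyterhub-idle-culler-service",
        "command": [
            sys.executable,
            "-m",
            "jupyterhub_idle_culler",
            # Cull single-user servers idle for 3 days (timeout is in
            # seconds); this value is illustrative.
            "--timeout=259200",
        ],
    }
]

# Grant the service the API scopes it needs (JupyterHub >= 2.0).
c.JupyterHub.load_roles = [
    {
        "name": "jupyterhub-idle-culler-role",
        "scopes": [
            "list:users",
            "read:users:activity",
            "read:servers",
            "delete:servers",
        ],
        "services": ["jupyterhub-idle-culler-service"],
    }
]
```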

Nuria added a comment. · May 13 2020, 2:22 PM

I think it is fine to close this knowing that tickets expire at 48h. This in fact means that the notebook 'dies' after 48h, which seems an arbitrary time, but I also do not see how it could be much longer.

I think it is fine to close this knowing that tickets expire at 48h. This in fact means that the notebook 'dies' after 48h, which seems an arbitrary time, but I also do not see how it could be much longer.

Actually, I do regularly need to kinit, but I've never had a running notebook be unable to accept new commands after I do so (and I haven't heard of other members of my team experiencing this either). So, although I haven't been paying close attention, I don't think notebooks actually "die" in this way.

Killing notebooks to save resources is a separate issue and I definitely want to discuss the exact parameters when you're ready to move forward with it. For example, I feel like I would prefer a fixed memory quota over having my notebooks get automatically killed even if they are using little memory, but there are downsides to that too.

elukey closed this task as Resolved. · Thu, Jun 11, 6:18 AM

Let's discuss the killing notebooks part in Newpyter (as it is already happening).

Aklapper removed a project: Analytics. · Sat, Jul 4, 7:59 AM