
Limit ability of a single user/tool to overwhelm job grid
Closed, Duplicate
Public

Description

In T196486 (Concurrent generated jobs from a single user overloaded grid engine), we found that a well-intentioned user had started hundreds of parallel jobs on the grid. We added a specific configuration to keep that particular user from doing this again. It would be nice, however, to have a better circuit breaker so that this cannot easily happen again with another user or tool.

The resource quota system looks like it would require us to list each user individually. There is, however, the maxujobs global scheduler setting:

maxujobs

The maximum number of jobs any user may have running in a Sun Grid Engine cluster at the same time. If set to 0 (default) the users may run an arbitrary number of jobs.

We currently have this set in the grid config to 1000, which coincidentally(?) is also the upper limit on jobs per queue. This means that, in practice, the per-user limit will never take effect.
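
The reasoning above can be sketched as a small check. This is only an illustration, not the grid's actual tooling: the function names and the sample text are hypothetical, and the sample assumes the scheduler configuration is available as plain "name value" lines (for example as dumped by `qconf -ssconf`). It follows the argument in this task that a per-user cap only matters when it is non-zero and lower than the per-queue job cap.

```python
def parse_maxujobs(sconf_text: str) -> int:
    """Return the maxujobs value found in scheduler-config text.

    Falls back to 0, the documented default, which means "no per-user limit".
    """
    for line in sconf_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "maxujobs":
            return int(parts[1])
    return 0


def limit_is_effective(maxujobs: int, queue_job_cap: int) -> bool:
    """True if the per-user cap can trigger before the per-queue cap does.

    Assumption taken from the task description: a per-user limit equal to
    (or above) the per-queue job cap never takes effect.
    """
    return 0 < maxujobs < queue_job_cap


# Illustrative text only, not real output from the grid.
SAMPLE_SCONF = """\
algorithm          default
schedule_interval  0:0:15
maxujobs           1000
"""

QUEUE_JOB_CAP = 1000  # per-queue upper limit mentioned in the task

current = parse_maxujobs(SAMPLE_SCONF)
print(current, limit_is_effective(current, QUEUE_JOB_CAP))
```

With the values described in this task (1000 for both), the check reports that the limit is not effective; any lower non-zero maxujobs value would make it effective under the same assumption.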

Event Timeline

Vvjjkkii renamed this task from Limit ability of a single user/tool to overwhelm job grid to 4kbaaaaaaa. (Jul 1 2018, 1:06 AM)
Vvjjkkii triaged this task as High priority.
Vvjjkkii updated the task description.
Vvjjkkii removed a subscriber: Aklapper.
CommunityTechBot renamed this task from 4kbaaaaaaa to Limit ability of a single user/tool to overwhelm job grid. (Jul 1 2018, 8:31 PM)
CommunityTechBot raised the priority of this task from High to Needs Triage.
CommunityTechBot updated the task description.
CommunityTechBot added a subscriber: Aklapper.