There are two main reasons:
- set more granular per-user limits, so that a single user cannot hammer the cluster
- apply labels to GPU nodes
Current config for fair-scheduler:
elukey@an-master1001:~$ cat /etc/hadoop/conf/fair-scheduler.xml
<?xml version="1.0"?>
<allocations>
  <queue name="nice">
    <!--
      The nice queue is for big long running jobs that don't need to
      finish fast. Having this queue helps smaller requests to finish faster.
    -->
    <weight>1.0</weight>
    <maxRunningApps>50</maxRunningApps>
    <schedulingMode>fair</schedulingMode>
  </queue>

  <queue name="sequential">
    <!--
      Applications submitted to this queue will be run sequentially.
      This is for heavy jobs that might be automatically scheduled
      concurrently and are not concerned with timeliness.
    -->
    <weight>1.0</weight>
    <maxRunningApps>1</maxRunningApps>
    <schedulingMode>fifo</schedulingMode>
  </queue>

  <queue name="default">
    <weight>2.0</weight>
    <maxRunningApps>50</maxRunningApps>
    <schedulingMode>fair</schedulingMode>
  </queue>

  <queue name="priority">
    <!--
      The priority queue is for non-adhoc jobs that should get some priority.
      This queue has a higher weight than default, but will never preempt.
    -->
    <weight>10.0</weight>
    <maxRunningApps>50</maxRunningApps>
    <schedulingMode>fair</schedulingMode>
  </queue>

  <queue name="production">
    <schedulingMode>fair</schedulingMode>
    <aclSubmitApps>hdfs</aclSubmitApps>
    <!--
      The production queue has a higher priority than default, and it will
      start killing (preempting) jobs in other queues if it can't get its
      minimum share within 10 minutes, and fair share within 30 minutes.
    -->
    <weight>10.0</weight>
    <minSharePreemptionTimeout>600</minSharePreemptionTimeout>
    <maxRunningApps>50</maxRunningApps>
    <fairSharePreemptionThreshold>1800</fairSharePreemptionThreshold>
  </queue>

  <!-- essential jobs will aggressively preempt jobs in other queues -->
  <queue name="essential">
    <!--
      Use FIFO for essential queue. We want jobs submitted here to run
      in sequential order.
    -->
    <schedulingMode>fifo</schedulingMode>
    <aclSubmitApps>hdfs</aclSubmitApps>
    <!--
      The essential queue has a much higher priority than production, and it
      will start killing (preempting) jobs in other queues, first after
      60 seconds if it can't get its minimum share, and then more after
      5 minutes if it can't get its fair share.
    -->
    <weight>20.0</weight>
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
    <fairSharePreemptionThreshold>300</fairSharePreemptionThreshold>
    <maxRunningApps>50</maxRunningApps>
  </queue>
</allocations>
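To make the queue layout above easier to reason about before changing schedulers, here is a minimal Python sketch that parses an allocations file and reports, per top-level queue, the scheduling mode and the nominal share of cluster weight. The function name `summarize_queues` is hypothetical, and the defaults it applies (weight 1.0, mode `fair`) are assumptions taken from the upstream Fair Scheduler documentation. It also flags any `fairSharePreemptionThreshold` outside the 0 to 1 range: as far as I know that element is documented as a fraction of fair share, while the value in seconds belongs in `fairSharePreemptionTimeout`, so the 1800 and 300 values may not be doing what the comments describe. Worth double-checking against the Hadoop version in use.

```python
import xml.etree.ElementTree as ET


def summarize_queues(xml_text):
    """Return one dict per top-level <queue> in a fair-scheduler allocations file.

    Hypothetical helper for illustration only; it ignores nested queues,
    <user> elements and queue placement policies.
    """
    root = ET.fromstring(xml_text)
    queues = root.findall("queue")

    # Assumption: a queue without an explicit <weight> counts as 1.0.
    weights = {
        q.get("name"): float(q.findtext("weight", default="1.0"))
        for q in queues
    }
    total = sum(weights.values())

    summary = []
    for q in queues:
        name = q.get("name")
        warnings = []

        # Assumption: the threshold is a fraction of fair share (0 to 1),
        # so a value such as 300 was probably meant as a timeout in seconds.
        threshold = q.findtext("fairSharePreemptionThreshold")
        if threshold is not None and not 0.0 <= float(threshold) <= 1.0:
            warnings.append(
                "fairSharePreemptionThreshold=%s is outside [0, 1]; "
                "was fairSharePreemptionTimeout intended?" % threshold.strip()
            )

        summary.append({
            "name": name,
            "mode": q.findtext("schedulingMode", default="fair"),
            # Nominal share only: it applies when every queue has demand.
            "weight_share": weights[name] / total,
            "warnings": warnings,
        })
    return summary
```

Feeding it the contents of /etc/hadoop/conf/fair-scheduler.xml (for example `summarize_queues(open(path).read())`) gives a total weight of 44 for the six queues above, so essential nominally gets 20/44 of the cluster when all queues are busy, and the production and essential queues are the two that get flagged.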