There are two main reasons:
- more granular limits for users to avoid hammering the cluster
- apply labels to GPU nodes
Current config for fair-scheduler:
elukey@an-master1001:~$ cat /etc/hadoop/conf/fair-scheduler.xml
<?xml version="1.0"?>
<allocations>
<queue name="nice">
<!--
The nice queue is for big long running jobs that don't need to finish
fast. Having this queue helps smaller requests to finish faster.
-->
<weight>1.0</weight>
<maxRunningApps>50</maxRunningApps>
<schedulingMode>fair</schedulingMode>
</queue>
<queue name="sequential">
<!--
Applications submitted to this queue will be run sequentially. This
is for heavy jobs that might be automatically scheduled concurrently
and are not concerned with timeliness.
-->
<weight>1.0</weight>
<maxRunningApps>1</maxRunningApps>
<schedulingMode>fifo</schedulingMode>
</queue>
<queue name="default">
<weight>2.0</weight>
<maxRunningApps>50</maxRunningApps>
<schedulingMode>fair</schedulingMode>
</queue>
<queue name="priority">
<!--
The priority queue is for non-adhoc jobs that should get some priority.
This queue has a higher weight than default, but will never preempt.
-->
<weight>10.0</weight>
<maxRunningApps>50</maxRunningApps>
<schedulingMode>fair</schedulingMode>
</queue>
<queue name="production">
<schedulingMode>fair</schedulingMode>
<aclSubmitApps>hdfs</aclSubmitApps>
<!--
The production queue has a higher priority than default,
and it will start killing (preempting) jobs in other queues
if it can't get its minimum share within 10 minutes, and
fair share within 30 minutes.
-->
<weight>10.0</weight>
<minSharePreemptionTimeout>600</minSharePreemptionTimeout>
<maxRunningApps>50</maxRunningApps>
<fairSharePreemptionThreshold>1800</fairSharePreemptionThreshold>
</queue>
<!-- essential jobs will aggressively preempt jobs in other queues -->
<queue name="essential">
<!--
Use FIFO for essential queue. We want jobs submitted
here to run in sequential order.
-->
<schedulingMode>fifo</schedulingMode>
<aclSubmitApps>hdfs</aclSubmitApps>
<!--
The essential queue has a much higher priority than production,
and it will start killing (preempting) jobs in other queues,
first after 60 seconds if it can't get its minimum share,
and then more after 5 minutes if it can't get its fair share.
-->
<weight>20.0</weight>
<minSharePreemptionTimeout>60</minSharePreemptionTimeout>
<fairSharePreemptionThreshold>300</fairSharePreemptionThreshold>
<maxRunningApps>50</maxRunningApps>
</queue>
</allocations>