Pivotal Knowledge Base


How to configure queues using YARN capacity-scheduler.xml

Environment

Product Version
PHD 1.x, 2.x 

Purpose

This article goes through the process of setting up queues using the YARN Capacity Scheduler.

Procedure

Prerequisite

Before setting up queues, configure the maximum amount of memory that the YARN NodeManagers can use. To configure a PHD cluster to make a specific amount of memory available to the YARN NodeManagers, edit the parameter "yarn.nodemanager.resource.memory-mb" in the YARN configuration file "/etc/gphd/hadoop/conf/yarn-site.xml". After the desired value has been defined, restart the YARN services so the modifications take effect.

yarn-site.xml
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>16384</value>
</property>

Note: In this example, 16 GB of memory is assigned to the YARN NodeManager on each server.

1. Now define multiple queues, depending on the operational requirements, and give each a share of the cluster resources. For YARN to use the Capacity Scheduler, the parameter "yarn.resourcemanager.scheduler.class" in yarn-site.xml must be set to CapacityScheduler. In PHD this is the default value, so the queues can be defined directly.

yarn-site.xml
<property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

2. Set up the queues:

The CapacityScheduler has a predefined queue called root, and all queues in the system are children of the root queue. In capacity-scheduler.xml, the parameter "yarn.scheduler.capacity.root.queues" is used to define the child queues; for example, to create 3 queues, specify their names in a comma-separated list.

<property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>alpha,beta,default</value>
    <description>
      The queues at this level (root is the root queue).
    </description>
</property>

3. With the above change in place, specify the queue-specific parameters.

Note: Parameters denoting queue-specific properties follow a standard naming convention and include the name of the queue to which they apply.

The general syntax is: yarn.scheduler.capacity.<queue-path>.<parameter>
where:
<queue-path> : identifies the queue, as a dot-separated path starting at root (for example, root.alpha).
<parameter>  : identifies the parameter whose value is being set.

Please refer to Apache Yarn Capacity Scheduler for the complete list of configurable parameters.
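
As an illustration, combining a queue path with the maximum-capacity parameter (one of the configurable parameters; the value 80 below is illustrative, not a recommendation) yields the full property name:

```xml
<!-- <queue-path> = root.alpha, <parameter> = maximum-capacity -->
<property>
    <name>yarn.scheduler.capacity.root.alpha.maximum-capacity</name>
    <value>80</value>
    <description>
      Upper limit, as a percentage of the cluster, that the alpha queue
      can grow to when other queues are idle.
    </description>
</property>
```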

4. Set up parameters

1) Queue resource allocation parameter:

yarn.scheduler.capacity.<queue-path>.capacity
To set the percentage of cluster resources allocated to each queue, edit the value of the parameter yarn.scheduler.capacity.<queue-path>.capacity in capacity-scheduler.xml. The capacities of all queues at any given level must add up to 100.

In the example below, the queues are set to use 50%, 30%, and 20% of the cluster resources, whose per-node amount was set earlier through "yarn.nodemanager.resource.memory-mb" on each NodeManager. With 16384 MB per NodeManager, for instance, the alpha queue's 50% share corresponds to roughly 8192 MB per node.

<property>
    <name>yarn.scheduler.capacity.root.alpha.capacity</name>
    <value>50</value>
    <description>Alpha queue target capacity.</description>
</property>

<property>
    <name>yarn.scheduler.capacity.root.beta.capacity</name>
    <value>30</value>
    <description>Beta queue target capacity.</description>
</property>

<property>
    <name>yarn.scheduler.capacity.root.default.capacity</name>
    <value>20</value>
    <description>Default queue target capacity.</description>
</property>

2) Queue Administration & Permissions related parameters: 

a) yarn.scheduler.capacity.<queue-path>.state
For a queue to accept submission of jobs or applications, its state must be RUNNING; otherwise submission fails with an error stating that the queue is STOPPED. RUNNING and STOPPED are the only permissible values for this parameter.

Example below:

  <property>
    <name>yarn.scheduler.capacity.root.alpha.state</name>
    <value>RUNNING</value>
    <description>
      The state of the alpha queue. State can be one of RUNNING or STOPPED.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.beta.state</name>
    <value>RUNNING</value>
    <description>
      The state of the beta queue. State can be one of RUNNING or STOPPED.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.state</name>
    <value>RUNNING</value>
    <description>
      The state of the default queue. State can be one of RUNNING or STOPPED.
    </description>
  </property>

b) yarn.scheduler.capacity.<queue-path>.acl_submit_applications
Controls which users can submit jobs or applications to a specific queue. Specify a comma-separated list of users, optionally followed by a space and a comma-separated list of groups. A special value of * allows all users to submit jobs and applications to the queue.

Example formats for specifying the list of users and groups:

1) <value>user1,user2</value> : user1 and user2 are allowed.
2) <value>user1,user2 group1,group2</value> : user1, user2, and all users from group1 and group2 are allowed.
3) <value> group1,group2</value> : (note the leading space) all users from group1 and group2 are allowed.

First and foremost, set the value of this parameter for the non-leaf root queue to "hadoop,yarn,mapred,hdfs". This ensures that only these special users can submit to all queues.

Child queues inherit the ACLs of their parent queue, and the root queue's default ACL is "*"; if the list is not restricted at the root queue, all users can still run jobs on any of the queues. By specifying "hadoop,yarn,mapred,hdfs" for the non-leaf root queue, user access can then be controlled per child queue.

c) Non-Leaf Root queue: 

 <property>
    <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
    <value>hadoop,yarn,mapred,hdfs</value>
    <description>
      The ACL of who can submit jobs to the root queue.
    </description>
  </property>

d) Leaf child queues under the root queue:

<property>
    <name>yarn.scheduler.capacity.root.alpha.acl_submit_applications</name>
    <value>sap_user hadoopusers</value>
    <description>
      The ACL of who can submit jobs to the alpha queue.
    </description>
  </property>

<property>
    <name>yarn.scheduler.capacity.root.beta.acl_submit_applications</name>
    <value>bi_user,etl_user failgroup</value>
    <description>
      The ACL of who can submit jobs to the beta queue.
    </description>
  </property>

  <property>
    <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
    <value>adhoc_user hadoopusers</value>
    <description>
      The ACL of who can submit jobs to the default queue.
    </description>
  </property>

e) yarn.scheduler.capacity.<queue-path>.acl_administer_queue
To set the list of administrators who can manage applications on a queue, specify the usernames in a comma-separated list for this parameter. A special value of * allows all users to administer applications running on a queue.

Define these properties in the same way as acl_submit_applications; the same syntax applies.

 <property>
    <name>yarn.scheduler.capacity.root.alpha.acl_administer_queue</name>
    <value>sap_user</value>
    <description>
      The ACL of who can administer jobs on the alpha queue.
    </description>
  </property>

 <property>
    <name>yarn.scheduler.capacity.root.beta.acl_administer_queue</name>
    <value>bi_user,etl_user</value>
    <description>
      The ACL of who can administer jobs on the beta queue.
    </description>
  </property>

 <property>
    <name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
    <value>adhoc_user</value>
    <description>
      The ACL of who can administer jobs on the default queue.
    </description>
  </property>

Other queue parameters, such as the "Running and Pending Application Limits", can also be defined.
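
For example, the cluster-wide application limits can be sketched as below (the values shown are the Apache defaults and are illustrative, not tuning recommendations):

```xml
<property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value>
    <description>
      Maximum number of applications that can be pending and running
      in the cluster at one time.
    </description>
</property>

<property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.1</value>
    <description>
      Maximum fraction of cluster resources that can be used to run
      ApplicationMasters, which indirectly caps the number of
      concurrently running applications.
    </description>
</property>
```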

3) Bringing the queues into effect:

Once the required parameters are defined in the capacity-scheduler.xml file, run the following command to bring the changes into effect.
yarn rmadmin -refreshQueues

Once the command completes successfully, verify that the queues are set up using either of these 2 options:

1) hadoop queue -list

[root@phd11-nn ~]# hadoop queue -list
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

14/01/16 22:10:25 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
14/01/16 22:10:25 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
======================
Queue Name : alpha
Queue State : running
Scheduling Info : Capacity: 50.0, MaximumCapacity: 1.0, CurrentCapacity: 0.0
======================
Queue Name : beta
Queue State : running
Scheduling Info : Capacity: 30.0, MaximumCapacity: 1.0, CurrentCapacity: 0.0
======================
Queue Name : default
Queue State : running
Scheduling Info : Capacity: 20.0, MaximumCapacity: 1.0, CurrentCapacity: 0.0

2) Open the YARN ResourceManager GUI at http://<ResourceManager-hostname>:8088 and click Scheduler.
8088 is the default port; replace <ResourceManager-hostname> with the hostname of your PHD cluster's ResourceManager. The Scheduler page shows the created queues, for example "alpha".

Execute a Hadoop job by submitting to a specific queue:

Before executing a Hadoop job, use the following command to identify the queues to which the user can submit.

[fail_user@phd11-nn ~]$ id
uid=507(fail_user) gid=507(failgroup) groups=507(failgroup)

[fail_user@phd11-nn ~]$ hadoop queue -showacls
Queue acls for user :  fail_user

Queue  Operations
=====================
root  ADMINISTER_QUEUE
alpha  ADMINISTER_QUEUE
beta  ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
default  ADMINISTER_QUEUE

The output above shows that 'fail_user' can submit applications only to the beta queue: it belongs to "failgroup", which was granted access only to the beta queue in capacity-scheduler.xml, as described earlier.

To submit an application to a specific queue, use the parameter -Dmapred.job.queue.name=<queue-name> (older releases) or -Dmapreduce.job.queuename=<queue-name>.

The below examples show how to run a job on a specific queue.

[fail_user@phd11-nn ~]$ yarn jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha-gphd-2.1.1.0.jar wordcount -D mapreduce.job.queuename=beta /tmp/test_input /user/fail_user/test_output
14/01/17 23:15:31 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
14/01/17 23:15:31 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
14/01/17 23:15:31 INFO input.FileInputFormat: Total input paths to process : 1
14/01/17 23:15:31 INFO mapreduce.JobSubmitter: number of splits:1
In DefaultPathResolver.java. Path = hdfs://phda2/user/fail_user/test_output
14/01/17 23:15:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1390019915506_0001
14/01/17 23:15:33 INFO client.YarnClientImpl: Submitted application application_1390019915506_0001 to ResourceManager at phd11-nn.saturn.local/10.110.127.195:8032
14/01/17 23:15:33 INFO mapreduce.Job: The url to track the job: http://phd11-nn.saturn.local:8088/proxy/application_1390019915506_0001/
14/01/17 23:15:33 INFO mapreduce.Job: Running job: job_1390019915506_0001
2014-01-17T23:15:40.702-0800: 11.670: [GC2014-01-17T23:15:40.702-0800: 11.670: [ParNew: 272640K->18064K(306688K), 0.0653230 secs] 272640K->18064K(989952K), 0.0654490 secs] [Times: user=0.06 sys=0.04, real=0.06 secs]
14/01/17 23:15:41 INFO mapreduce.Job: Job job_1390019915506_0001 running in uber mode : false
14/01/17 23:15:41 INFO mapreduce.Job:  map 0% reduce 0%
14/01/17 23:15:51 INFO mapreduce.Job:  map 100% reduce 0%
14/01/17 23:15:58 INFO mapreduce.Job:  map 100% reduce 100%
14/01/17 23:15:58 INFO mapreduce.Job: Job job_1390019915506_0001 completed successfully

While the job is executing, monitor the ResourceManager GUI to see which queue the job was submitted to. In the Scheduler view, the queue being used by the WordCount application above is highlighted in green.



Now see what happens when fail_user submits to a queue on which it is not allowed to submit applications; the job fails.

 [fail_user@phd11-nn ~]$ yarn jar /usr/lib/gphd/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.5-alpha-gphd-2.1.1.0.jar wordcount -D mapreduce.job.queuename=alpha /tmp/test_input /user/fail_user/test_output_alpha
14/01/17 23:20:07 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
14/01/17 23:20:07 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
14/01/17 23:20:07 INFO input.FileInputFormat: Total input paths to process : 1
14/01/17 23:20:07 INFO mapreduce.JobSubmitter: number of splits:1
In DefaultPathResolver.java. Path = hdfs://phda2/user/fail_user/test_output_alpha
14/01/17 23:20:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1390019915506_0002
14/01/17 23:20:08 INFO client.YarnClientImpl: Submitted application application_1390019915506_0002 to ResourceManager at phd11-nn.saturn.local/10.110.127.195:8032
14/01/17 23:20:08 INFO mapreduce.JobSubmitter: Cleaning up the staging area /user/fail_user/.staging/job_1390019915506_0002
14/01/17 23:20:08 ERROR security.UserGroupInformation: PriviledgedActionException as:fail_user (auth:SIMPLE) cause:java.io.IOException: Failed to run job : org.apache.hadoop.security.AccessControlException: User fail_user cannot submit applications to queue root.alpha
java.io.IOException: Failed to run job : org.apache.hadoop.security.AccessControlException: User fail_user cannot submit applications to queue root.alpha
     at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:307)
     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:395)
     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:415)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1478)
     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
     at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1236)
     at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:144)
     at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:68)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
     at java.lang.reflect.Method.invoke(Method.java:606)
     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Additional Information

Please refer to Apache Yarn Capacity Scheduler for the complete list of configurable parameters.
