Browsing posts in: yarn

adjust log4j log level for a class

Among all the adjustments you can make in log4j (and there are a lot), you may want to change the verbosity level of a particular class.

For example, I wanted to decrease the verbosity of CapacityScheduler because my ResourceManager log was full of

2015-12-15 12:11:06,667 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-12-15 12:11:06,667 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-12-15 12:11:06,667 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-12-15 12:11:06,668 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...

I first found the CapacityScheduler “full name” (meaning with its package), then set its level in log4j.properties:

log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler=WARN

Then I restarted the ResourceManager, and voilà!
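As an aside, if you'd rather not restart the daemon at all, Hadoop ships a daemonlog tool that can query and change a logger's level at runtime through the daemon's HTTP port. A sketch (the host and port below are placeholders for your own ResourceManager address; the change is lost at the next restart):

```
# Query the current level (rm.example.com:8088 is a placeholder)
hadoop daemonlog -getlevel rm.example.com:8088 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

# Raise the threshold to WARN without any restart
hadoop daemonlog -setlevel rm.example.com:8088 org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler WARN
```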

 


Queues and capacity scheduler

The Capacity Scheduler allows you to create queues at different levels and to allocate a different share of the cluster to each of them.

At the beginning, you have only one queue, which is root.
All of the following is defined in conf/capacity-scheduler.xml (/etc/hadoop/conf.empty/capacity-scheduler.xml in my HDP 2.1.3) or in YARN Configs/Scheduler in Ambari.

[Screenshot: the Scheduler section in Ambari’s YARN configs]

Let’s start with two queues, “prod” (production) and “dev” (development), both direct children of root.

Queues definition

yarn.scheduler.capacity.root.queues=prod,dev

Now, maybe we have two teams in the dev department: engineers and data scientists.
Let’s split the dev queue into two sub-queues:

yarn.scheduler.capacity.root.dev.queues=eng,ds
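For reference, in capacity-scheduler.xml itself these two settings would look like this (a sketch of the same properties in the file’s XML format):

```
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>prod,dev</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.dev.queues</name>
  <value>eng,ds</value>
</property>
```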

Queues capacity

We now have to define the percentage capacity of each of these queues. Note that at each level the total must be 100 (the capacity of the parent); if not, the scheduler will refuse to start.

yarn.scheduler.capacity.root.capacity=100

yarn.scheduler.capacity.root.prod.capacity=70
yarn.scheduler.capacity.root.dev.capacity=30

yarn.scheduler.capacity.root.dev.eng.capacity=60
yarn.scheduler.capacity.root.dev.ds.capacity=40

So prod will have 70% of the cluster resources and dev will have 30%.
Not exactly, in fact! If a job runs in dev while prod is idle, dev can take up to 100% of the cluster.
This makes sense, because we don’t want the cluster to be under-utilized.

As you can imagine, eng will get 60% of dev’s capacity, and can reach 100% of dev if ds is empty.
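To make the arithmetic explicit, a leaf queue’s guaranteed share of the whole cluster is the product of the capacities along its path. A quick sketch with the percentages above:

```shell
#!/bin/sh
# Relative capacities from the configuration above (percent of the parent queue)
prod=70; dev=30; eng=60; ds=40

# Absolute share of the cluster = parent's share * own share / 100
eng_abs=$(( dev * eng / 100 ))   # 30 * 60 / 100 = 18% of the cluster
ds_abs=$(( dev * ds / 100 ))     # 30 * 40 / 100 = 12% of the cluster

echo "prod=${prod}% eng=${eng_abs}% ds=${ds_abs}%"
```

So prod (70%), eng (18%) and ds (12%) add back up to 100% of the cluster.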

We may want to cap dev at a maximum elastic capacity (the default is 100%) because we never want this queue to use too many resources.
For that purpose, use the max-capacity parameter:

yarn.scheduler.capacity.root.dev.max-capacity=60

Queues status

A queue’s state must be set to RUNNING; if set to STOPPED, you won’t be able to submit new jobs to that queue.

yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.dev.state=RUNNING

Queues ACLs

The most important thing to understand is that ACLs are inherited: you can’t restrict permissions further down the hierarchy, only extend them!
The most common mistake is leaving the ACL set to * (meaning all users) at the root level: consequently, any user can submit jobs to any queue. The default is:

yarn.scheduler.capacity.root.default.acl_submit_applications=*

The ACL format is a bit tricky:

<username>,<username>,[...]
<username>,[...]SPACE<group>,<group>,[...]

Then, on each queue, you can set 3 parameters : acl_submit_applications, acl_administer_queue and acl_administer_jobs.

yarn.scheduler.capacity.root.dev.acl_administer_jobs=nick dev
yarn.scheduler.capacity.root.dev.acl_administer_queue=john dev
yarn.scheduler.capacity.root.dev.acl_submit_applications=* dev
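Combining the two forms, an entry like the following (nick and john being hypothetical users) would allow those two users plus anyone in the dev group to submit:

```
yarn.scheduler.capacity.root.dev.acl_submit_applications=nick,john dev
```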

Any user in the dev group can submit jobs, but only john can administer the queue.

You can see the “real” authorizations in a terminal :

[vagrant@gw ~]$ su ambari-qa
[ambari-qa@gw ~]$ mapred queue -showacls
Queue acls for user : ambari-qa
Queue Operations
=====================
root ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
dev ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
prod ADMINISTER_QUEUE,SUBMIT_APPLICATIONS
[ambari-qa@gw ~]$

Of course, yarn.acl.enable has to be set to true.
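That property lives in yarn-site.xml, i.e.:

```
<property>
  <name>yarn.acl.enable</name>
  <value>true</value>
</property>
```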

Another nice thing: you don’t have to restart YARN for every scheduler modification, except when deleting existing queues. If you’re only adding queues or adjusting some settings, just run:

[root@gw ~]# kinit -kt /etc/security/keytabs/yarn.headless.keytab yarn@EXAMPLE.COM
[root@gw ~]# yarn rmadmin -refreshQueues

You can see the queues in two ways:
– in the CLI:

[root@gw ~]# mapred queue -list
15/03/05 09:13:11 INFO client.RMProxy: Connecting to ResourceManager at nn.example.com/240.0.0.11:8050
======================
Queue Name : dev
Queue State : running
Scheduling Info : Capacity: 60.000004, MaximumCapacity: 100.0, CurrentCapacity: 0.0
======================
Queue Name : ds
Queue State : running
Scheduling Info : Capacity: 30.000002, MaximumCapacity: 100.0, CurrentCapacity: 0.0
======================
Queue Name : eng
Queue State : running
Scheduling Info : Capacity: 70.0, MaximumCapacity: 100.0, CurrentCapacity: 0.0
======================
Queue Name : prod
Queue State : running
Scheduling Info : Capacity: 40.0, MaximumCapacity: 100.0, CurrentCapacity: 0.0

– in the UI: go to the ResourceManager UI (Ambari YARN / Quick Links), then click on Scheduler:

[Screenshot: the Scheduler view in the ResourceManager UI]