Posts tagged with: log4j

Hadoop log compression on the fly with log4j

Hadoop logs are as verbose and useful as they are large. Because of their size, you may want to compress them so that the /var/log partition stays below its warning thresholds.

Thanks to log4j, you can achieve that in two ways:

1. use the log4j-extras package

2. use the log4j2 package, which includes compression (among other things)
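For reference, the second option might look roughly like the fragment below. This is only a sketch in the log4j2 properties format, assuming the component really runs on log4j2; the appender name ROLLING and the /var/log/hive path are illustrative and are not taken from an actual Hive setup.

```properties
# log4j2 properties format; names and paths are hypothetical
appender.rolling.type = RollingFile
appender.rolling.name = ROLLING
appender.rolling.fileName = /var/log/hive/hive.log
# The .gz suffix asks log4j2 to gzip each rolled file
appender.rolling.filePattern = /var/log/hive/hive.log.%d{yyyyMMdd}.gz
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = %d{ISO8601} %-5p [%t]: %c{2} - %m%n
# Roll once per unit of the date pattern (here: once a day)
appender.rolling.policies.type = Policies
appender.rolling.policies.time.type = TimeBasedTriggeringPolicy
appender.rolling.policies.time.interval = 1

rootLogger.level = info
rootLogger.appenderRef.rolling.ref = ROLLING
```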

Here I’ll use the first option, applied to Hive logging:

  • download the log4j-extras package
  • put the jar in a lib directory: either the “global” Hadoop one or, as here, only the Hive one, /usr/hdp/2.2.4.2-2/hive/lib/
  • adjust the log4j properties to use rolling.RollingFileAppender instead of DRFA (Daily Rolling File Appender), either through Ambari (in this example, under Advanced hive-log4j in the Hive service configs) or directly in the Hive log4j.properties:
hive.root.logger=INFO,request

log4j.appender.request=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.request.File=${hive.log.dir}/${hive.log.file}
log4j.appender.request.RollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.request.RollingPolicy.ActiveFileName=${hive.log.dir}/${hive.log.file}
log4j.appender.request.RollingPolicy.FileNamePattern=${hive.log.dir}/${hive.log.file}.%d{yyyyMMdd}.log.gz
log4j.appender.request.layout=org.apache.log4j.PatternLayout
log4j.appender.request.layout.ConversionPattern=%d{ISO8601} %-5p [%t]: %c{2} (%F:%M(%L)) - %m%n

Remember to disable the existing DRFA lines by commenting them out or deleting them.

Restart the affected components, and you get the same daily roll as with DRFA (yyyyMMdd), with each rolled file gzipped.
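For a different roll frequency or archive format, only the FileNamePattern should need to change. The fragment below is a sketch rather than a tested configuration: it assumes the same request appender as above, and relies on the log4j-extras TimeBasedRollingPolicy deriving the roll period from the date pattern inside %d{...} and the compression type from the file suffix (.gz or .zip).

```properties
# Hourly roll, gzip-compressed: the date pattern now goes down to the hour (HH)
log4j.appender.request.RollingPolicy.FileNamePattern=${hive.log.dir}/${hive.log.file}.%d{yyyyMMdd-HH}.log.gz

# Alternative: daily roll, zip archive instead of gzip (keep only one FileNamePattern line active)
#log4j.appender.request.RollingPolicy.FileNamePattern=${hive.log.dir}/${hive.log.file}.%d{yyyyMMdd}.log.zip
```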

 


Adjust log4j log level for a class

Among all the adjustments you can make in log4j (and there are a lot), you may want to change the verbosity level of one particular class.

For example, I wanted to decrease the verbosity of CapacityScheduler because my ResourceManager log was full of lines like these:

2015-12-15 12:11:06,667 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-12-15 12:11:06,667 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-12-15 12:11:06,667 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...
2015-12-15 12:11:06,668 INFO capacity.CapacityScheduler (CapacityScheduler.java:completedContainer(1190)) - Null container completed...

I first looked up the CapacityScheduler “full name” (meaning the class name with its package), then set its level to WARN in the log4j properties:

log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler=WARN

I restarted the ResourceManager, and voilà!
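Because log4j logger names are hierarchical, the same kind of line can also target a whole package instead of a single class. A minimal sketch, assuming you would want every class under the capacity scheduler package at WARN (which may hide more INFO messages than intended):

```properties
# Single class: only CapacityScheduler is limited to WARN and above
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler=WARN

# Whole package (assumption: its other classes should be quieted too)
#log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity=WARN
```

The class-level line is the narrower choice: it silences the repeated "Null container completed..." messages without touching other loggers in the package.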