Hadoop logs are as verbose and useful as they are heavy. With that last point in mind, you may want to compress your logs to keep the /var/log partition from filling up and triggering disk-space warnings.
Thanks to log4j, you can achieve that in two ways:
1. use the log4j-extras package
2. use the log4j2 package, which supports compression out of the box (among other things!)
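For reference, option 2 gets you the same result natively: in log4j2, ending the filePattern of a RollingFile appender with .gz makes it gzip each file on rollover. A minimal sketch (the appender name and file names here are illustrative, not taken from a stock HDP config):

```
<RollingFile name="hive" fileName="${sys:hive.log.dir}/hive.log"
             filePattern="${sys:hive.log.dir}/hive.log.%d{yyyyMMdd}.gz">
  <PatternLayout pattern="%d{ISO8601} %-5p [%t]: %c{2} - %m%n"/>
  <Policies>
    <!-- roll once per day, driven by the %d pattern above -->
    <TimeBasedTriggeringPolicy/>
  </Policies>
</RollingFile>
```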
Here I’ll use the first option, applied to Hive logging:
- Download the log4j-extras package
- put the jar in a lib directory: either globally for Hadoop, or, as here, just for Hive, in /usr/hdp/2.2.4.2-2/hive/lib/
- now adjust the log4j properties to use rolling.RollingFileAppender instead of DRFA (Daily Rolling File Appender), either through Ambari (in this example, under Advanced hive-log4j in the Hive service configs) or directly in Hive’s log4j.properties:
hive.root.logger=INFO,request
log4j.appender.request=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.request.File=${hive.log.dir}/${hive.log.file}
log4j.appender.request.RollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
log4j.appender.request.RollingPolicy.ActiveFileName=${hive.log.dir}/${hive.log.file}
log4j.appender.request.RollingPolicy.FileNamePattern=${hive.log.dir}/${hive.log.file}.%d{yyyyMMdd}.log.gz
log4j.appender.request.layout = org.apache.log4j.PatternLayout
log4j.appender.request.layout.ConversionPattern=%d{ISO8601} %-5p [%t]: %c{2} (%F:%M(%L)) - %m%n
Remember to take the old DRFA lines out of the way by commenting them out or deleting them.
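For reference, the stock DRFA block you are replacing typically looks like the following once commented out (your exact lines may differ by HDP version):

```
# log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
# log4j.appender.DRFA.File=${hive.log.dir}/${hive.log.file}
# log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
# log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
# log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t]: %c{2} (%F:%M(%L)) - %m%n
```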
Restart the affected components, and you’re done: the .gz suffix on FileNamePattern is what tells the TimeBasedRollingPolicy to gzip each rolled file, so you now get daily (yyyyMMdd) compressed log files.
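The rolled archives are plain gzip files, so zcat and zgrep work on them directly, no decompression to disk needed. A quick self-contained demo (the directory and file names are made up to mimic the FileNamePattern above):

```shell
#!/bin/sh
# Simulate a rolled Hive log: write one log line, then gzip it the way
# the TimeBasedRollingPolicy would (name matching *.yyyyMMdd.log.gz).
dir=$(mktemp -d)
echo "2015-06-01 12:00:00,000 INFO [main]: ql.Driver - query OK" \
  > "$dir/hive.log.20150601.log"
gzip "$dir/hive.log.20150601.log"

# Archives stay searchable in place:
zcat "$dir/hive.log.20150601.log.gz"
zgrep -c "INFO" "$dir/hive.log.20150601.log.gz"
```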