Hadoop log compression on the fly with log4j

Hadoop logs are as verbose and useful as they are heavy. Because of that last point, some admins want to compress their logs so they can keep the /var/log partition below its usage warning thresholds.

Thanks to log4j, you can achieve that in two ways:

1. use the log4j-extras package

2. use the log4j2 package, which (at last!) includes compression out of the box; a rough sketch follows below
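As an aside on that second route, here is a rough, illustrative sketch of a compressing appender written in log4j 2's properties configuration format (assuming a log4j 2 version recent enough to accept that format). Nothing here comes from the Hive defaults: the appender name, logger wiring and file paths are placeholders; the point is simply that a filePattern ending in .gz makes log4j 2 compress the file on rollover.

# Illustrative log4j 2 appender: the .gz suffix on filePattern enables compression on rollover
appender.rolling.type = RollingFile
appender.rolling.name = RollingHive
appender.rolling.fileName = /var/log/hive/hive.log
appender.rolling.filePattern = /var/log/hive/hive.log.%d{yyyyMMdd}.gz
appender.rolling.layout.type = PatternLayout
appender.rolling.layout.pattern = %d{ISO8601} %-5p [%t]: %c{2} (%F:%M(%L)) - %m%n
appender.rolling.policies.type = Policies
appender.rolling.policies.time.type = TimeBasedTriggeringPolicy
# Wire the appender to the root logger (names are illustrative)
rootLogger.level = INFO
rootLogger.appenderRef.rolling.ref = RollingHive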

Here I’ll use the first option, applied to Hive logging:

  • Download the log4j-extras package
  • put the jar in the lib directory: either for “global” Hadoop, or, as here, just for Hive; in that case put it in /usr/hdp/
  • now adjust the log4j properties to use rolling.RollingFileAppender instead of DRFA (Daily Rolling File Appender), either through Ambari (for this example, in Advanced hive-log4j under the Hive service configs) or directly in Hive’s log4j.properties, for instance keeping a layout like this one (a fuller appender sketch follows after this snippet):

log4j.appender.request.layout = org.apache.log4j.PatternLayout
log4j.appender.request.layout.ConversionPattern=%d{ISO8601} %-5p [%t]: %c{2} (%F:%M(%L)) - %m%n
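For reference, here is a minimal sketch of what the full compressing appender block could look like. It assumes the stock Hive log4j.properties, which defines a DRFA appender and the hive.log.dir / hive.log.file variables; adjust those names to whatever your configuration actually uses. The .gz suffix on FileNamePattern is what triggers on-the-fly compression when the file rolls:

# Assumes the DRFA appender name and hive.log.dir / hive.log.file variables from the stock Hive log4j.properties
log4j.appender.DRFA=org.apache.log4j.rolling.RollingFileAppender
log4j.appender.DRFA.rollingPolicy=org.apache.log4j.rolling.TimeBasedRollingPolicy
# The active file keeps its plain name...
log4j.appender.DRFA.rollingPolicy.ActiveFileName=${hive.log.dir}/${hive.log.file}
# ...while rolled files are gzipped, thanks to the .gz suffix on the pattern
log4j.appender.DRFA.rollingPolicy.FileNamePattern=${hive.log.dir}/${hive.log.file}.%d{yyyyMMdd}.gz
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p [%t]: %c{2} (%F:%M(%L)) - %m%n

With the yyyyMMdd pattern above, each rolled log should end up as a file like hive.log.20160412.gz (illustrative name) next to the still-uncompressed active hive.log.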

Remember to neutralize the original DRFA lines by commenting them out or deleting them.
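For orientation, the stock DRFA block you are neutralizing typically looks something like this in Hive’s log4j.properties (exact lines may vary between releases):

#log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
#log4j.appender.DRFA.File=${hive.log.dir}/${hive.log.file}
#log4j.appender.DRFA.DatePattern=.yyyy-MM-dd

Leaving the old DailyRollingFileAppender lines active alongside the new definition would only confuse the configuration, so comment them out as shown.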

Restart the components, and your daily rolled logs (yyyyMMdd date pattern) are now gzipped.



So, what do you think?

Comments

  • rose:
    Is it enough to put the extras jar in /usr/hdp/version/hadoop/lib, or should it also be copied to the Hive folder? I have the jar in that location and the necessary appender settings, but the log files are still not compressed.

    • administrator:
      Take a look at the startup message in the Hive logs: it shows the classpath, so you can check whether your extras jar is actually included.
