Pivotal Knowledge Base

Oozie Hive or MapReduce actions fail in PHD 1.0.1

Version: PHD 1.0.1 (Apache Hadoop 2.0.2-alpha)
Version: PHD 1.1.0 (Apache Hadoop 2.0.5-alpha)
Oozie 3.3.2


Related Apache JIRAs

https://issues.apache.org/jira/browse/OOZIE-1089
https://issues.apache.org/jira/browse/OOZIE-1290
https://issues.apache.org/jira/browse/MAPREDUCE-4503
https://issues.apache.org/jira/browse/MAPREDUCE-4549
https://issues.apache.org/jira/browse/MAPREDUCE-4820


Symptom

When executing an Oozie Hive action against PHD 1.0.1, the action fails because of the YARN distributed cache duplicate check. The Hive action attempts to import jar files that already exist in the resources for the container context, and YARN throws an exception. The Oozie action fails as a result of that exception.
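
For reference, below is a minimal sketch of a Hive action workflow of the kind that hits this check. The workflow name, script name, and queue property are placeholders, and it assumes the sharelib is enabled (oozie.use.system.libpath=true in job.properties), which is what places the conflicting jar files into the distributed cache.

<workflow-app xmlns="uri:oozie:workflow:0.1" name="hive-wf">
    <start to="hive1"/>
    <action name="hive1">
        <!-- The launcher for this action imports the Hive sharelib jars into the
             distributed cache, where they collide with the copies already present
             from the Oozie sharelib, triggering the duplicate check. -->
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>script.q</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Hive failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>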

The Oozie workaround described in OOZIE-1089 adds the following property to oozie-site.xml. It changes Oozie's behavior so that jar files are not imported into the launcher application, allowing the Hive job to execute.
<property>
    <name>oozie.hadoop-2.0.2-alpha.workaround.for.distributed.cache</name>
    <value>true</value>
    <description>
      Due to a bug in Hadoop 2.0.2-alpha, MAPREDUCE-4820, launcher jobs fail to set
      the distributed cache for the action job because the local JARs are implicitly
      included triggering a duplicate check.
      This flag removes the distributed cache files for the action as they'll be
      included from the local JARs of the JobClient (MRApps) submitting the action
      job from the launcher.
    </description>
</property>
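
Note that oozie-site.xml is only read when the Oozie server starts, so restart the Oozie server after adding this property for it to take effect.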

While this property allows the Hive action to complete, it prevents MapReduce actions from executing. The MapReduce jar file is no longer imported by the launcher task and is therefore not available as a resource. The launcher task cannot find the mapper/reducer classes, resulting in a ClassNotFoundException.

Workaround for MapReduce in PHD 1.0.1
Now that you have implemented the Hive workaround by setting "oozie.hadoop-2.0.2-alpha.workaround.for.distributed.cache", you will need to modify your MapReduce workflow files to pass a -libjars argument containing HDFS references to all of the jar files required by the MapReduce application, as in the example below.

<workflow-app xmlns="uri:oozie:workflow:0.1" name="java-wf">
    <start to="java1"/>
    <action name="java1">
        <java>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <main-class>org.gopivotal.hadoop.mapreduce.MaxTemperature</main-class>
            <arg>-libjars</arg>
            <arg>hdfs://hdm1.hadoop.local:8020/user/oozie/examples/apps/map-reduce2/lib/PHD_MR.jar</arg>
            <arg>/user/oozie/mr/input</arg>
            <arg>/user/oozie/mr/output</arg>
        </java>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
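
Two details worth noting about this example: -libjars is a generic Hadoop option parsed by GenericOptionsParser, so the main class (org.gopivotal.hadoop.mapreduce.MaxTemperature here) must run its arguments through org.apache.hadoop.util.ToolRunner (i.e. implement the Tool interface) for the option to be honored; and if the application needs more than one jar, they are passed as a single comma-separated list in the -libjars value.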

Fix

The YARN duplicate check bug is fixed in Apache Hadoop 2.0.4-alpha and above (PHD 1.1.0). Instead of throwing an exception when there is a duplicate file in the distributed cache, YARN logs a warning. This allows the Hive action to succeed and eliminates the need for the "oozie.hadoop-2.0.2-alpha.workaround.for.distributed.cache" Oozie workaround that breaks MapReduce actions.

The permanent fix is to upgrade to PHD 1.1.0, where Oozie 3.3.2 is included in the distribution and fully supported.

Example warning messages when a Hive action executes:
2013-11-20 13:06:36,755 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/hive/commons-codec-1.4.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/oozie/commons-codec-1.4.jar This will be an error in Hadoop 2.0
2013-11-20 13:06:36,757 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/hive/hadoop-auth-2.0.5-alpha-gphd-2.1.0.0.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/oozie/hadoop-auth-2.0.5-alpha-gphd-2.1.0.0.jar This will be an error in Hadoop 2.0
2013-11-20 13:06:36,760 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/hive/log4j-1.2.16.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/oozie/log4j-1.2.16.jar This will be an error in Hadoop 2.0
2013-11-20 13:06:36,762 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/hive/slf4j-api-1.5.8.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/oozie/slf4j-api-1.5.8.jar This will be an error in Hadoop 2.0
2013-11-20 13:06:36,765 WARN [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.util.MRApps: cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/hive/slf4j-log4j12-1.5.8.jar conflicts with cache file (mapreduce.job.cache.files) hdfs://hdm1.hadoop.local:8020/user/oozie/share/lib/oozie/slf4j-log4j12-1.5.8.jar This will be an error in Hadoop 2.0
2013-11-20 13:06:36,767 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Adding #2 tokens and #0 secret keys for NM use for launching container




 
