Pivotal Knowledge Base

Hive LOAD from Zeppelin fails with "Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask"

Environment

HDP 2.5; applicable to all supported HDP versions.

Symptom

A Hive LOAD operation executed from Zeppelin fails with the error "Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask". Despite the error, the table is loaded with the correct data.

Error Message:

The Zeppelin notebook reports the following error message:

java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask 
at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:283) 
at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291) 
at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:291) 
at org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:577) 
at org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:660) 
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94) 
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:489) 
at org.apache.zeppelin.scheduler.Job.run(Job.java:175) 
at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162) 
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
at java.util.concurrent.FutureTask.run(FutureTask.java:262) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178) 
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
at java.lang.Thread.run(Thread.java:745)

The input file is owned by hive:hdfs (owner hive, group hdfs). The following error appears in the HiveServer2 log:

2017-10-17 16:21:21,714 ERROR [HiveServer2-Background-Pool: Thread-384264]: exec.Task (SessionState.java:printError(948)) - Failed with exception org.apache.hadoop.security.AccessControlException: Permission denied. user=anonymous is not the owner of inode=notiz.csv
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkOwner(FSPermissionChecker.java:250)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:227)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkOwner(FSDirectory.java:1780)
        at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setPermission(FSDirAttrOp.java:63)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setPermission(FSNamesystem.java:1685)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setPermission(NameNodeRpcServer.java:814)
        at [...]

The "user=anonymous" field in the first line of this log snippet shows that Zeppelin is connecting to Hive as user "anonymous".
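The connecting user can also be extracted from the log line programmatically. The following is a minimal Python sketch; the helper name denied_user is hypothetical, and the regular expression assumes the exact message format shown in the HiveServer2 log above.

```python
import re

# Matches the "user=<name> is not the owner of inode=<file>" part of the
# AccessControlException message (format assumed from the log excerpt above).
_DENIED = re.compile(
    r"Permission denied\. user=(\S+) is not the owner of inode=(\S+)"
)

def denied_user(log_line):
    """Return (user, inode) from a matching log line, or None if no match."""
    match = _DENIED.search(log_line)
    return (match.group(1), match.group(2)) if match else None

line = (
    "Failed with exception org.apache.hadoop.security.AccessControlException: "
    "Permission denied. user=anonymous is not the owner of inode=notiz.csv"
)
print(denied_user(line))
```

If the printed user is not the owner of the input file, the JDBC connection is not being made as the intended user.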

Cause

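Zeppelin connects to Hive through the JDBC driver, and no user is passed, so the session runs as "anonymous". The HiveServer2 stack trace shows the failure in a setPermission call, which HDFS allows only for the file's owner or a superuser; because the input file is owned by hive, the call is rejected for "anonymous". To fix this, set hive.user to hive in the JDBC driver configuration that Zeppelin uses to connect to Hive. A property named hive.username is ignored; the property name must be hive.user.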
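A quick check of the interpreter settings can catch the misnamed property before a LOAD is run. The sketch below is a hypothetical Python helper, not part of Zeppelin; it assumes the interpreter properties have been copied by hand into a dictionary, and the hive.url entry and its host name are placeholders for illustration only.

```python
def check_hive_user(props, prefix="hive"):
    """Return a list of warnings about the user setting for a JDBC prefix."""
    warnings = []
    user_key = f"{prefix}.user"
    wrong_key = f"{prefix}.username"
    if wrong_key in props:
        # The article states that this property name is ignored.
        warnings.append(f"{wrong_key} is ignored; use {user_key} instead")
    if not props.get(user_key):
        warnings.append(
            f"{user_key} is not set; the connection may be made as 'anonymous'"
        )
    return warnings

# Placeholder settings; only hive.user and hive.username come from the article.
broken = {
    "hive.url": "jdbc:hive2://hiveserver.example.com:10000",
    "hive.username": "hive",
}
fixed = {
    "hive.url": "jdbc:hive2://hiveserver.example.com:10000",
    "hive.user": "hive",
}

for name, props in (("broken", broken), ("fixed", fixed)):
    print(name, check_hive_user(props))
```

The broken settings produce two warnings (the ignored property and the missing user), while the fixed settings produce none.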

Additional Information

Hive configuration in Zeppelin
