Tags: linux, hive, cloudera, rapidminer, cloudera-quickstart-vm

Access Denied Issue with Radoop: Connecting RapidMiner to Cloudera Quickstart VM


I have stood up a Cloudera Quickstart VM on a PC, with all services running and 14 GB of RAM allocated. On the desktop that the VM is running on (not inside the VM) I installed RapidMiner in order to test out Radoop before it goes onto a production server. I used "Import from Cluster Manager" in RapidMiner, which retrieved the correct configuration from CDH. When I run the full test, I run into an Access Denied error when RapidMiner tests whether it can create a table on Hive.

Logs:

May 18, 2018 3:45:29 PM FINE: Hive query: SHOW TABLES
May 18, 2018 3:45:29 PM FINE: Hive query: set -v
May 18, 2018 3:45:32 PM INFO: Getting radoop_hive-v4.jar file from plugin jar...
May 18, 2018 3:45:32 PM INFO: Remote radoop_hive-v4.jar is up to date.
May 18, 2018 3:45:32 PM INFO: Getting radoop_hive-v4.jar file from plugin jar...
May 18, 2018 3:45:32 PM INFO: Remote radoop_hive-v4.jar is up to date.
May 18, 2018 3:45:32 PM FINE: Hive query: SHOW FUNCTIONS
May 18, 2018 3:45:33 PM INFO: Remote radoop-mr-8.2.1.jar is up to date.
May 18, 2018 3:45:33 PM FINE: Hive query: CREATE TABLE radoop__tmp_cloudera_1526672733223_qznjpj8 (a1 DOUBLE , a2 DOUBLE , a3 DOUBLE , a4 DOUBLE , id0 STRING  COMMENT 'role:"id" ', label0 STRING  COMMENT 'role:"label" ') ROW FORMAT DELIMITED FIELDS TERMINATED BY ';' STORED AS TEXTFILE
May 18, 2018 3:45:33 PM FINE: Hive query: LOAD DATA INPATH '/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew/' OVERWRITE INTO TABLE radoop__tmp_cloudera_1526672733223_qznjpj8
May 18, 2018 3:45:33 PM FINE: Hive query failed: LOAD DATA INPATH '/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew/' OVERWRITE INTO TABLE radoop__tmp_cloudera_1526672733223_qznjpj8
May 18, 2018 3:45:33 PM FINE: Error: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 20009 from org.apache.hadoop.hive.ql.exec.MoveTask. Access denied: Unable to move source hdfs://quickstart.cloudera:8020/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew to destination hdfs://quickstart.cloudera:8020/user/hive/warehouse/radoop__tmp_cloudera_1526672733223_qznjpj8: Permission denied by sticky bit: user=hive, path="/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew":cloudera:supergroup:drwxrwxrwx, parent="/tmp/radoop/cloudera":cloudera:supergroup:drwxrwxrwt
May 18, 2018 3:45:33 PM FINER: Connecting to Hive. JDBC url: radoop_hive_0.13.0jdbc:hive2://192.168.100.113:10000/default
May 18, 2018 3:45:33 PM FINER: Connecting to Hive took 108 ms.
May 18, 2018 3:45:33 PM FINE: Hive query failed again, error: java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 20009 from org.apache.hadoop.hive.ql.exec.MoveTask. Access denied: Unable to move source hdfs://quickstart.cloudera:8020/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew to destination hdfs://quickstart.cloudera:8020/user/hive/warehouse/radoop__tmp_cloudera_1526672733223_qznjpj8: Permission denied by sticky bit: user=hive, path="/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew":cloudera:supergroup:drwxrwxrwx, parent="/tmp/radoop/cloudera":cloudera:supergroup:drwxrwxrwt
May 18, 2018 3:45:33 PM FINE: Error while processing statement: FAILED: Execution Error, return code 20009 from org.apache.hadoop.hive.ql.exec.MoveTask. Access denied: Unable to move source hdfs://quickstart.cloudera:8020/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew to destination hdfs://quickstart.cloudera:8020/user/hive/warehouse/radoop__tmp_cloudera_1526672733223_qznjpj8: Permission denied by sticky bit: user=hive, path="/tmp/radoop/cloudera/tmp_1526672733088_x0ldwew":cloudera:supergroup:drwxrwxrwx, parent="/tmp/radoop/cloudera":cloudera:supergroup:drwxrwxrwt
May 18, 2018 3:45:33 PM FINER: Connecting to Hive. JDBC url: radoop_hive_0.13.0jdbc:hive2://192.168.100.113:10000/default

Maybe this is just a configuration change I can make in CDH, such as modifying the Hive config, or some other way to allow RapidMiner to read and write.


Solution

  • Long story short: on the Cloudera Quickstart 5.13 VM, use the same username for "Hadoop username" on the Global tab and for "Hive username" on the Hive tab.
  • Why this works: the log shows that the staging directory /tmp/radoop/cloudera is owned by the cloudera user and has the sticky bit set (drwxrwxrwt), so HiveServer2, which runs the MoveTask as the hive user, is not allowed to move files it does not own out of that directory. With matching usernames, the files are written and moved by the same HDFS user, and the sticky-bit check passes. You can verify the mismatch with the sketch below.
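
A minimal verification sketch, assuming a stock Quickstart VM terminal and the default hdfs superuser account; the paths are taken from the error message above:

# Show ownership and permissions of Radoop's HDFS staging directory.
# The trailing "t" in drwxrwxrwt is the sticky bit that blocks the hive
# user from moving files owned by the cloudera user.
hdfs dfs -ls -d /tmp/radoop/cloudera
hdfs dfs -ls /tmp/radoop/cloudera

# Alternative workaround if the usernames cannot be aligned: drop the
# sticky bit (1777 -> 0777) so the MoveTask can relocate the files.
# Only reasonable on a throwaway sandbox like the Quickstart VM.
sudo -u hdfs hdfs dfs -chmod 0777 /tmp/radoop/cloudera

After either change, re-running the Radoop full connection test should let the LOAD DATA INPATH step, and the table creation it belongs to, complete.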