We have a small GPDB cluster. When I try to read an external table using the 'gphdfs' protocol from the GPDB master, the query fails.
Environment
Product: Pivotal Greenplum (GPDB) 4.3.8.2
OS: CentOS 6.5
Error:
prod=# select * from ext_table; ERROR: external table gphdfs protocol command ended with error. 16/10/05 14:42:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable (seg0 slice1 host.domain.com:40000 pid=25491)
DETAIL:
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://path/to/hdfs
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:285)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
at com.
Command: 'gphdfs://path/to/hdfs'
External table tablename, file gphdfs://path/to/hdfs
We tried: following this article on the Greenplum master machine: https://discuss.pivotal.io/hc/en-us/articles/219403388-How-to-eliminate-error-message-WARN-util-NativeCodeLoader-Unable-to-load-native-hadoop-library-for-your-platform-with-gphdfs
Result:
It did not work after changing the contents of "hadoop-env.sh" as suggested in the link; it still throws the same error. Do I need to restart GPDB for the "hadoop-env.sh" changes to take effect?
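For reference, the fix in that article usually amounts to pointing the JVM at Hadoop's native libraries in hadoop-env.sh. A minimal sketch, assuming a typical install layout (the exact paths and HADOOP_HOME location are assumptions, not from the article):

```shell
# Sketch only -- adjust paths to the actual Hadoop install.
# Edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh (conf/hadoop-env.sh on
# older layouts) and point the JVM at the native libraries so the
# NativeCodeLoader warning goes away:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
```

Note that the warning in the error above is reported by a segment (seg0 on host.domain.com), so the gphdfs queries run on the segment hosts; editing hadoop-env.sh only on the master would not affect them, and the change likely needs to be applied on every segment host as well.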
Or is there an alternative way to handle the gphdfs protocol error? Any help would be much appreciated.
Attached is the DDL for the failing external table:
create external table schemaname.exttablename (
    "ID" INTEGER,
    time timestamp without time zone,
    "SalesOrder" char(6),
    "NextDetailLine" decimal(6),
    "OrderStatus" char(1)
)
location ('gphdfs://hadoopmster.com:8020/devgpdb/filename.txt')
FORMAT 'text';
Could you please provide the external table DDL that was failing? Also, please make sure the gpadmin user has permission to read and write the HDFS path. Thanks, Pratheesh Nair
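To check the path and permissions, something like the following can be run as gpadmin from a host with the Hadoop client installed. This is a sketch using the namenode host, port, and file path from the DDL above; substitute the real values if they differ:

```shell
# Run as gpadmin. Verify the file referenced in the LOCATION clause
# actually exists and is readable:
hdfs dfs -ls hdfs://hadoopmster.com:8020/devgpdb/filename.txt

# Check ownership and permissions on the parent directory:
hdfs dfs -ls hdfs://hadoopmster.com:8020/devgpdb/
```

If the first command reports "No such file or directory", that matches the InvalidInputException ("Input path does not exist") in the error above, and the problem is the path rather than the native-library warning, which is generally harmless on its own.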