Tags: hadoop, apache-pig, hadoop-2.7.2

Pig-0.16.0 on Hadoop 2.7.2 - ERROR 1002: Unable to store alias


I have just started learning Pig. I installed a pseudo-distributed Hadoop 2.7.2 on Ubuntu 14.04 LTS with Pig version 0.16.0. The following are my configurations for Pig and Hadoop:

File: .bashrc

#===============================================================
# Hadoop Variable List

export JAVA_HOME=/usr/lib/jvm/java-9-oracle
export HADOOP_INSTALL=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export HADOOP_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"

#===============================================================
# PIG variable
export PIG_HOME="/home/hadoop/pig"
export PIG_INSTALL="$PIG_HOME"
export PIG_CONF_DIR="$PIG_HOME/conf"
export PIG_CLASSPATH="$HADOOP_INSTALL/conf"
export HADOOPDIR="$HADOOP_INSTALL/conf"
export PATH="$PIG_HOME/bin:$PATH"

#===============================================================

and the following is the directory from which I'm executing Pig:

hadoop@rajeev:~$ pwd
/home/hadoop
-rw-rw-r--  1 hadoop hadoop    540117 Jul 15 12:41 myfile.txt

I copied this file to HDFS too:

hadoop@rajeev:~$ hadoop fs -ls -R /user/hadoop
-rw-r--r--   1 hadoop supergroup     540117 2016-07-15 12:48 /user/hadoop/myfile.txt

Now, when I execute the following commands in the Grunt shell, they fail:

grunt> a = load 'myfile.txt' as line;
grunt> store a into 'c.out';

2016-07-15 12:56:38,670 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases a
2016-07-15 12:56:38,670 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: a[1,4],a[-1,-1] C:  R: 
2016-07-15 12:56:38,684 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-07-15 12:56:38,685 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1468556821972_0006]
2016-07-15 12:56:53,959 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2016-07-15 12:56:53,959 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_1468556821972_0006]
2016-07-15 12:57:25,722 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-07-15 12:57:25,722 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1468556821972_0006 has failed! Stop running all dependent jobs
2016-07-15 12:57:25,722 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-07-15 12:57:25,726 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-07-15 12:57:25,786 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-07-15 12:57:25,839 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-07-15 12:57:25,841 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
2.7.2   0.16.0  hadoop  2016-07-15 12:56:36 2016-07-15 12:57:25 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_1468556821972_0006  a   MAP_ONLY    Message: Job failed!       hdfs://localhost:9001/user/hadoop/c.out,

Input(s):
Failed to read data from "hdfs://localhost:9001/user/hadoop/myfile.txt"

Output(s):
Failed to produce result in "hdfs://localhost:9001/user/hadoop/c.out"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1468556821972_0006


2016-07-15 12:57:25,842 [main] INFO    org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!

I have tried to resolve this by running Pig in local mode instead of MapReduce mode, but nothing seems to work. These two simple commands fail every time.

The error log file prints the same messages as the console output above.

I would appreciate any help!


Solution

  • Specify the full HDFS path and a datatype for the field you are loading into:

    a = LOAD 'hdfs://localhost:9001/user/hadoop/myfile.txt' AS (line:chararray);
    STORE a INTO 'hdfs://localhost:9001/user/hadoop/c.out';
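
    Once the STORE succeeds, the result can be inspected from the shell. A minimal sketch (the part-file name here is an assumption; Pig writes a directory of part files, and for a map-only job like this one the name is typically part-m-00000):

    ```
    # list the output directory Pig created
    hadoop fs -ls /user/hadoop/c.out
    # print the first few lines of the map-only output part file
    hadoop fs -cat /user/hadoop/c.out/part-m-00000 | head
    ```

    Also note that the output directory must not already exist when the STORE runs, or the job will fail. A leftover directory from a failed attempt can be removed with `hadoop fs -rm -r /user/hadoop/c.out` before retrying.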