
STORE of a relation fails under pig -x local (failed to read data)


1st approach: Using pig -x mapreduce

The HBase table has been created:
hbase(main):003:0> list
TABLE
clientes
1 row(s)
Took 0.0047 seconds
=> ["clientes"]
grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
            id:chararray,
            nome:chararray,
            sobrenome:chararray,
            idade:int,
            funcao:chararray
);

2021-03-07 19:00:32,390 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_1615152557282_0002
2021-03-07 19:00:32,390 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:00:32,390 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C:  R: 
2021-03-07 19:00:32,395 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:00:37,406 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:00:37,406 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_1615152557282_0002 has failed! Stop running all dependent jobs
2021-03-07 19:00:37,406 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:00:37,410 [main] INFO  org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2021-03-07 19:00:37,492 [main] INFO  org.apache.hadoop.mapred.ClientServiceDelegate - Could not get Job info from RM for job job_1615152557282_0002. Redirecting to job history server.
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
2021-03-07 19:00:37,595 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:00:37,597 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
3.2.2   0.17.0  hadoop  2021-03-07 19:00:31 2021-03-07 19:00:37 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_1615152557282_0002  dados   MAP_ONLY    Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:294)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
    at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1565)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1562)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1562)
    at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.pig.backend.hadoop.PigJobControl.submit(PigJobControl.java:128)
    at org.apache.pig.backend.hadoop.PigJobControl.run(PigJobControl.java:205)
    at java.lang.Thread.run(Thread.java:748)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:301)
Caused by: java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:9000/user/hadoop, expected: file:///
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:737)
    at org.apache.hadoop.fs.RawLocalFileSystem.setWorkingDirectory(RawLocalFileSystem.java:604)
    at org.apache.hadoop.fs.FilterFileSystem.setWorkingDirectory(FilterFileSystem.java:307)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:250)
    ... 18 more
    hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722,

Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"

Output(s):
Failed to produce result in "hdfs://localhost:9000/tmp/temp-1169299097/tmp-2103156722"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_1615152557282_0002


2021-03-07 19:00:37,597 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2021-03-07 19:00:37,601 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias dados. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154395936.log

2nd approach: Using pig -x local (dump dados works)

grunt> dados = LOAD 'file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt' USING PigStorage(',') AS (
>> id:chararray,
>> nome:chararray,
>> sobrenome:chararray,
>> idade:int,
>> funcao:chararray
>> );

2021-03-07 19:02:17,219 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.builtin.PigStorage - Using PigTextInputFormat
2021-03-07 19:02:17,222 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt:0+794
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 2
2021-03-07 19:02:17,226 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2021-03-07 19:02:17,241 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:02:17,243 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:02:17,253 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C:  R: 
2021-03-07 19:02:17,266 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2021-03-07 19:02:17,274 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task:attempt_local116575577_0001_m_000000_0 is done. And is in the process of committing
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - 
2021-03-07 19:02:17,280 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task attempt_local116575577_0001_m_000000_0 is allowed to commit now
2021-03-07 19:02:17,285 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local116575577_0001_m_000000_0' to file:/tmp/temp2133275539/tmp1539690224
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - map
2021-03-07 19:02:17,286 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Task 'attempt_local116575577_0001_m_000000_0' done.
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.Task - Final Counters for attempt_local116575577_0001_m_000000_0: Counters: 16
    File System Counters
        FILE: Number of bytes read=1264
        FILE: Number of bytes written=530456
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
    Map-Reduce Framework
        Map input records=20
        Map output records=20
        Input split bytes=414
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=0
        Total committed heap usage (bytes)=311427072
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=0
    org.apache.pig.PigWarning
        FIELD_DISCARDED_TYPE_CONVERSION_FAILED=1
2021-03-07 19:02:17,291 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local116575577_0001_m_000000_0
2021-03-07 19:02:17,291 [Thread-7] INFO  org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:02:17,485 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,492 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
2021-03-07 19:02:17,492 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2021-03-07 19:02:17,493 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,536 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:02:17,540 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
3.2.2   0.17.0  hadoop  2021-03-07 19:02:16 2021-03-07 19:02:17 UNKNOWN

Success!

Job Stats (time in seconds):
JobId   Maps    Reduces MaxMapTime  MinMapTime  AvgMapTime  MedianMapTime   MaxReduceTime   MinReduceTime   AvgReduceTime   MedianReducetime    Alias   Feature Outputs
job_local116575577_0001 1   0   n/a n/a n/a n/a 0   0   0   0   dados   MAP_ONLY    file:/tmp/temp2133275539/tmp1539690224,

Input(s):
Successfully read 20 records from: "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"

Output(s):
Successfully stored 20 records in: "file:/tmp/temp2133275539/tmp1539690224"

Counters:
Total records written : 20
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local116575577_0001


2021-03-07 19:02:17,542 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,544 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,551 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:02:17,558 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
2021-03-07 19:02:17,558 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2021-03-07 19:02:17,563 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2021-03-07 19:02:17,563 [main] WARN  org.apache.pig.data.SchemaTupleBackend - SchemaTupleBackend has already been initialized
2021-03-07 19:02:17,570 [main] INFO  org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input files to process : 1
2021-03-07 19:02:17,570 [main] INFO  org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
(id,nome,sobrenome,,funcao)
(c001,Josias,Silva,55,Analista de Mercado)
(1100002,Pedro,Malan,74,Professor)
(1100003,Maria,Maciel,34,Bombeiro)
(1100004,Suzana,Bustamante,66,Analista de TI)
(1100005,Karen,Moreira,74,Advogado)
(1100006,Patricio,Teixeira,42,Veterinario)
(1100007,Elisa,Haniero,43,Piloto)
(1100008,Mauro,Bender,63,Marceneiro)
(1100009,Mauricio,Wagner,39,Artista)
(1100010,Douglas,Macedo,60,Escritor)
(1100011,Francisco,McNamara,47,Cientista de Dados)
(1100012,Sidney,Raynor,26,Escritor)
(1100013,Maria,Moon,41,Gerente de Projetos)
(1100014,Bete,Balanaira,65,Musico)
(1100015,Julia,Peixoto,49,Especialista em TI)
(1100016,Jeronimo,Wallace,52,Engenheiro de Dados)
(1100017,Noeli,Laura,72,Cientista de Dados)
(1100018,Jean,Junior,45,Desenvolvedor RPA)
(1100019,Cristina,Garbim,63,Engenheiro Blockchain)

But both STORE dados INTO 'hbase://clientes' and STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' fail:

grunt> STORE dados INTO 'hbase://clientes' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
2021-03-07 19:03:51,347 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1289080477_0002
2021-03-07 19:03:51,347 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:03:51,347 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C:  R: 
2021-03-07 19:03:51,349 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:03:51,349 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Running jobs are [job_local1289080477_0002]
2021-03-07 19:03:51,835 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.hbase.mapreduce.TableOutputFormat - Created table instance for clientes
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.impl.util.SpillableMemoryManager - Selected heap (PS Old Gen) of size 699400192 to monitor. collectionUsageThreshold = 489580128, usageThreshold = 489580128
2021-03-07 19:03:51,839 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2021-03-07 19:03:51,843 [LocalJobRunner Map Task Executor #0] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: dados[1,8],dados[-1,-1] C:  R: 
2021-03-07 19:03:51,860 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation - Closing zookeeper sessionid=0x1780e985b4d000f
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0] INFO  org.apache.zookeeper.ZooKeeper - Session: 0x1780e985b4d000f closed
2021-03-07 19:03:51,866 [LocalJobRunner Map Task Executor #0-EventThread] INFO  org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x1780e985b4d000f
2021-03-07 19:03:51,867 [Thread-10] INFO  org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2021-03-07 19:03:51,870 [Thread-10] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local1289080477_0002
java.lang.Exception: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:83)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:144)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:670)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:282)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:275)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:65)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at java.util.ArrayList.rangeCheck(ArrayList.java:657)
    at java.util.ArrayList.get(ArrayList.java:433)
    at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:992)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.StoreFuncDecorator.putNext(StoreFuncDecorator.java:75)
    ... 18 more
2021-03-07 19:03:52,055 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:03:52,055 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1289080477_0002 has failed! Stop running all dependent jobs
2021-03-07 19:03:52,055 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:03:52,056 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:03:52,057 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:03:52,058 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
3.2.2   0.17.0  hadoop  2021-03-07 19:03:50 2021-03-07 19:03:52 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_local1289080477_0002    dados   MAP_ONLY    Message: Job failed!    hbase://clientes,

Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"

Output(s):
Failed to produce result in "hbase://clientes"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local1289080477_0002


2021-03-07 19:03:52,058 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
grunt> STORE dados INTO 'file:///home/hadoop/hadloop/pig_output' USING
>> org.apache.pig.backend.hadoop.hbase.HBaseStorage(
>> 'dados_clientes:nome
>> dados_clientes:sobrenome
>> dados_clientes:idade
>> dados_clientes:funcao'
>> );
java.lang.Exception: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:552)
Caused by: java.lang.IllegalArgumentException: Illegal character code:47, </> at 0. User-space table qualifiers can only contain 'alphanumeric characters': i.e. [a-zA-Z_0-9-.]: ///home/hadoop/hadloop/pig_output
    at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:196)
    at org.apache.hadoop.hbase.TableName.isLegalTableQualifierName(TableName.java:149)
    at org.apache.hadoop.hbase.TableName.<init>(TableName.java:322)
    at org.apache.hadoop.hbase.TableName.createTableNameIfNecessary(TableName.java:358)
    at org.apache.hadoop.hbase.TableName.valueOf(TableName.java:449)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.<init>(TableOutputFormat.java:107)
    at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.getRecordWriter(TableOutputFormat.java:153)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.getRecordWriter(PigOutputFormat.java:83)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:659)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2021-03-07 19:05:10,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local1458581109_0003
2021-03-07 19:05:10,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases dados
2021-03-07 19:05:10,476 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: dados[1,8],dados[-1,-1] C:  R: 
2021-03-07 19:05:10,477 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2021-03-07 19:05:10,477 [main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2021-03-07 19:05:10,477 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local1458581109_0003 has failed! Stop running all dependent jobs
2021-03-07 19:05:10,478 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2021-03-07 19:05:10,478 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,479 [main] WARN  org.apache.hadoop.metrics2.impl.MetricsSystemImpl - JobTracker metrics system already initialized!
2021-03-07 19:05:10,480 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2021-03-07 19:05:10,480 [main] INFO  org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics: 

HadoopVersion   PigVersion  UserId  StartedAt   FinishedAt  Features
3.2.2   0.17.0  hadoop  2021-03-07 19:05:10 2021-03-07 19:05:10 UNKNOWN

Failed!

Failed Jobs:
JobId   Alias   Feature Message Outputs
job_local1458581109_0003    dados   MAP_ONLY    Message: Job failed!    file:///home/hadoop/hadloop/pig_output,

Input(s):
Failed to read data from "file:///mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/clientes.txt"

Output(s):
Failed to produce result in "file:///home/hadoop/hadloop/pig_output"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local1458581109_0003


2021-03-07 19:05:10,480 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!

Services running:

(base) [hadoop@dataserver 1-HBase]$ jps
4160 SecondaryNameNode
11666 Main
5413 HQuorumPeer
5766 HRegionServer
6966 JobHistoryServer
4631 NodeManager
4457 ResourceManager
5578 HMaster
3835 DataNode
12382 Jps
3615 NameNode

Hadoop version:

(base) [hadoop@dataserver 1-HBase]$ hadoop version
Hadoop 3.2.2
Source code repository Unknown -r 7a3bc90b05f257c8ace2f76d74264906f0f7a932
Compiled by hexiaoqiao on 2021-01-03T09:26Z
Compiled with protoc 2.5.0
From source with checksum 5a8f564f46624254b27f6a33126ff4
This command was run using /opt/hadoop/share/hadoop/common/hadoop-common-3.2.2.jar

HBase version:

(base) [hadoop@dataserver 1-HBase]$ hbase version
/opt/hadoop/libexec/hadoop-functions.sh: line 2366: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_USER: bad substitution
/opt/hadoop/libexec/hadoop-functions.sh: line 2461: HADOOP_ORG.APACHE.HADOOP.HBASE.UTIL.GETJAVAPROPERTY_OPTS: bad substitution
Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hbase/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase 2.2.0
Source code repository file:///opt/hbase-rm/output/hbase-2.2.0-bin revision=Unknown
Compiled by hbase-rm on Tue Jun 11 04:30:30 UTC 2019
From source with checksum 63a465554927aeea3f1f0bcae63decff

Pig Version:

(base) [hadoop@dataserver 1-HBase]$ pig version
2021-03-07 19:08:50,197 INFO pig.ExecTypeProvider: Trying ExecType : LOCAL
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Trying ExecType : MAPREDUCE
2021-03-07 19:08:50,199 INFO pig.ExecTypeProvider: Picked MAPREDUCE as the ExecType
2021-03-07 19:08:50,263 [main] INFO  org.apache.pig.Main - Apache Pig version 0.17.0 (r1797386) compiled Jun 02 2017, 15:41:58
2021-03-07 19:08:50,263 [main] INFO  org.apache.pig.Main - Logging error messages to: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,536 [main] ERROR org.apache.pig.Main - ERROR 2997: Encountered IOException. File version does not exist
Details at logfile: /mnt/win/GD/DS/1Formacao/3EngenhariaDeDadosHadoop/07/Arquivos/1-HBase/pig_1615154930258.log
2021-03-07 19:08:50,557 [main] INFO  org.apache.pig.Main - Pig script completed in 400 milliseconds (400 ms)


Solution

  • To solve this issue, start the MapReduce Job History Server. The mapreduce-mode log shows Pig being redirected to it when it could not get the job info from the ResourceManager.

    Run the following command:

    mr-jobhistory-daemon.sh start historyserver
    

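    On Hadoop 3.x this script is deprecated; the equivalent command is mapred --daemon start historyserver.

    Then use the jps command to check that JobHistoryServer appears among the running services: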

    13153 HQuorumPeer
    13314 HMaster
    20242 JobHistoryServer
    5043 NameNode
    6003 NodeManager
    30163 Jps
    5845 ResourceManager
    5514 SecondaryNameNode
    5227 DataNode
    28510 RunJar
    13519 HRegionServer
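
The jps check above can also be scripted. Below is a minimal sketch, assuming jps prints one "<pid> <MainClass>" pair per line as in the listing above; the function name has_history_server is hypothetical and not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads jps-style output on stdin and succeeds (exit 0)
# only if a JobHistoryServer process is listed.
has_history_server() {
  # Column 1 is the PID, column 2 the main class name.
  awk '$2 == "JobHistoryServer" { found = 1 } END { exit !found }'
}

# Usage on a live node (assumes jps is on PATH):
#   if jps | has_history_server; then
#     echo "JobHistoryServer is running"
#   else
#     echo "JobHistoryServer is NOT running"
#   fi
```

Matching on the second column rather than grepping the whole line avoids false positives from PIDs or other class names that happen to contain the same text.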