Tags: hadoop, hadoop-yarn

Hadoop 2.6 Mapreduce permissions incorrectly set on Windows


I have installed Hadoop 2.6 on Windows as a test bed for some software that depends on Hadoop. As far as I can tell, the install worked correctly. Hadoop is installed in C:\Hadoop and my temporary folder is C:\hadooptemp. I followed this tutorial to set it up: https://drive.google.com/file/d/0BweVwq32koypYm1QWHNvRTZWTm8/view

When I run the pi example from hadoop-mapreduce-examples-2.6.0.jar, as provided in the tutorial, the job fails with the output below.
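The invocation is presumably along these lines (an assumption: the exact jar path depends on the install layout; pi takes the number of maps and the samples per map as arguments, matching the 2 and 5 reported below):

    hadoop jar %HADOOP_HOME%\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.6.0.jar pi 2 5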

Number of Maps  = 2
Samples per Map = 5
Wrote input for Map #0
Wrote input for Map #1
Starting Job
15/08/27 15:55:10 INFO client.RMProxy: Connecting to ResourceManager at /155.41.90.116:8032
15/08/27 15:55:12 INFO input.FileInputFormat: Total input paths to process : 2
15/08/27 15:55:12 INFO mapreduce.JobSubmitter: number of splits:2
15/08/27 15:55:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1440705227041_0001
15/08/27 15:55:14 INFO impl.YarnClientImpl: Submitted application application_1440705227041_0001
15/08/27 15:55:14 INFO mapreduce.Job: The url to track the job: http://mycomp:8088/proxy/application_1440705227041_0001/
15/08/27 15:55:14 INFO mapreduce.Job: Running job: job_1440705227041_0001
15/08/27 15:55:35 INFO mapreduce.Job: Job job_1440705227041_0001 running in uber mode : false
15/08/27 15:55:35 INFO mapreduce.Job:  map 0% reduce 0%
15/08/27 15:55:35 INFO mapreduce.Job: Job job_1440705227041_0001 failed with state FAILED due to: Application application_1440705227041_0001 failed 2 times due to AM Container for appattempt_1440705227041_0001_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://mycomp:8088/proxy/application_1440705227041_0001/Then, click on links to logs of each attempt.

Diagnostics: Failed to setup local dir /hadooptemp/nm-local-dir, which was marked as good.

Failing this attempt. Failing the application.
15/08/27 15:55:35 INFO mapreduce.Job: Counters: 0
Job Finished in 25.444 seconds
java.io.FileNotFoundException: File does not exist: hdfs://155.41.90.116:8020/user/me/QuasiMonteCarlo_1440705304456_1878814183/out/reduce-out
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
        at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1130)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1751)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

From what I have been able to trace, exitCode -1000 comes down to Hadoop not being able to set up the local directory with appropriate permissions. I believe that is why the tutorial above disables User Account Control. Whether or not I do that, I get the same error.
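To dig further into the exitCode -1000, the container logs for the failed attempt can in principle be pulled with the standard YARN CLI once the application ID is known (hedged: this assumes log aggregation is enabled, which it may not be on a single-node Windows setup):

    yarn logs -applicationId application_1440705227041_0001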

I also found a similar issue described in this link: "Mapreduce error: Failed to setup local dir".

I tried to follow their advice and made both C:\Hadoop and C:\hadooptemp owned by my user account through the folder Properties > Security > Advanced settings. I was already listed as the owner, and according to this I have Full Control over the folder. So either that isn't the issue, or I have assigned ownership to my account incorrectly.
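For reference, taking ownership and granting full control from an elevated command prompt would look roughly like this (the commands recurse over the whole tree; "%USERNAME%" stands in for the account actually running the NodeManager):

    takeown /f C:\hadooptemp /r /d y
    icacls C:\hadooptemp /grant "%USERNAME%":(OI)(CI)F /t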

Finally, my YARN NodeManager logs the following warning, which pops up in a few places and seems like it could be related:

15/08/27 15:55:34 WARN localizer.ResourceLocalizationService: Permissions incorrectly set for dir /hadooptemp/nm-local-dir/usercache, should be rwxr-xr-x, actual value = rwxrwxr-x

It seems I have too many permissions rather than too few: the group write bit is set where Hadoop expects it cleared. I can't imagine that group write access alone is the cause, but I also couldn't figure out how to change it in Windows.
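One way to clear that group write bit is Hadoop's own winutils tool, which ships with Windows builds and understands POSIX-style octal modes (an assumption: the path below presumes HADOOP_HOME is set and uses the nm-local-dir layout from the warning above):

    %HADOOP_HOME%\bin\winutils.exe chmod -R 755 C:\hadooptemp\nm-local-dir\usercache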

Any help figuring out the details of this permissions problem and fixing the error would be appreciated.


Solution

  • In my case it was due to the Windows domain not being reachable. Connect your PC to the Windows domain. Here is my YARN config (yarn-site.xml):

    <configuration>

    <!-- Site specific YARN configuration properties -->
        <property>
            <name>yarn.resourcemanager.hostname</name>
            <value>0.0.0.0</value>
        </property>
        <property>
            <name>yarn.nodemanager.local-dirs</name>
            <value>c:\my\hadoop-2.7.1\tmp-nm</value>
        </property>
        <property>
            <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
            <value>98.5</value>
        </property>
    </configuration>


    Also see https://hadoop.apache.org/docs/r2.7.1/hadoop-yarn/hadoop-yarn-site/SecureContainer.html
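
    After editing yarn-site.xml, YARN needs a restart for the new yarn.nodemanager.local-dirs value to take effect; on Windows this is typically done with the stock scripts (assuming HADOOP_HOME points at the install):

        %HADOOP_HOME%\sbin\stop-yarn.cmd
        %HADOOP_HOME%\sbin\start-yarn.cmd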