hadoop, hive, hdfs, hadoop3

Is it necessary to configure hadoop.tmp.dir in core-site.xml in Hadoop 3.3.1?


I am a beginner in big data. I have installed Hadoop 3.3.1 and Hive and loaded data into Hive. I have a few questions:

  1. I have not configured hadoop.tmp.dir in core-site.xml, but I have configured the DataNode and NameNode directory paths in hdfs-site.xml. Will leaving the tmp directory unset in core-site.xml affect MapReduce?

  2. I have a cluster of master and slave nodes, with Hive installed and data uploaded. If I change the configuration on the master or on any slave (for example, the tmp directory path in core-site.xml), do I need to run hdfs namenode -format after every change? Will I lose my Hive tables and uploaded data by formatting the NameNode?

  3. Here are my hdfs-site.xml and core-site.xml configurations. Please tell me whether they are correct.

core-site.xml:

<configuration>
<property>
   <name>fs.default.name</name>
   <value>hdfs://hadoop-master:9000</value>
</property>
</configuration>

hdfs-site.xml in master:

<configuration>
<property>
    <name>dfs.data.dir</name>
    <value>/home/hdoop/dfsdata/namenode</value>
</property>
<property>
    <name>dfs.data.dir</name>
    <value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
</configuration>

hdfs-site.xml in slaves:

<configuration>
<property>
    <name>dfs.data.dir</name>
    <value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>
</configuration>

Solution

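  • hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}, so it does not have to be set in core-site.xml. There's no particular reason to change it as long as the NameNode and DataNode directories are set explicitly in hdfs-site.xml, since those otherwise default to paths under hadoop.tmp.dir.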

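    Yes, reformatting the NameNode removes all HDFS data, but it won't truncate your Hive metastore. The metastore lives in a separate database, so the table definitions survive even though the HDFS files they point to are gone. A configuration change by itself does not call for reformatting; restarting the affected daemons is normally enough.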
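
    On the third question: the posted master hdfs-site.xml sets dfs.data.dir twice, so the first entry (the namenode path) is presumably not doing what was intended. Below is a minimal sketch of how the files might look with the current Hadoop 3 property names, assuming the same host name and directories as in the question. The hadoop.tmp.dir value is a hypothetical path, and that property can be left out entirely.

```xml
<!-- core-site.xml (same on every node); a sketch, not a verified setup -->
<configuration>
  <property>
    <!-- fs.defaultFS is the current name for the deprecated fs.default.name -->
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-master:9000</value>
  </property>
  <property>
    <!-- Optional; hypothetical path. Omit to keep /tmp/hadoop-${user.name} -->
    <name>hadoop.tmp.dir</name>
    <value>/home/hdoop/tmpdata</value>
  </property>
</configuration>
```

```xml
<!-- hdfs-site.xml on the master; paths taken from the question -->
<configuration>
  <property>
    <!-- Where the NameNode keeps its metadata (old name: dfs.name.dir) -->
    <name>dfs.namenode.name.dir</name>
    <value>/home/hdoop/dfsdata/namenode</value>
  </property>
  <property>
    <!-- Where a DataNode stores blocks (old name: dfs.data.dir) -->
    <name>dfs.datanode.data.dir</name>
    <value>/home/hdoop/dfsdata/datanode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

    On the slaves, only the dfs.datanode.data.dir and dfs.replication properties should be needed. The old names (fs.default.name, dfs.name.dir, dfs.data.dir) are deprecated aliases that Hadoop 3 still accepts, so keeping them is a matter of log warnings rather than breakage.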