I am a beginner in big data. I have installed Hadoop 3.3.1 and Hive and loaded data into Hive. I have a few questions:
I have not set hadoop.tmp.dir in core-site.xml, but I have configured the datanode and namenode directory paths in hdfs-site.xml. Will leaving the tmp directory unconfigured in core-site.xml affect my MapReduce jobs?
I have a cluster of masters and slaves with Hive installed and data loaded. If I now change the configuration on the master or on any slave (for example, the tmp directory path in core-site.xml), do I need to run hdfs namenode -format after every reconfiguration? Will I lose my Hive and uploaded data by formatting the namenode?
Here are my hdfs-site.xml and core-site.xml configurations. Please tell me whether they are correct.
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://hadoop-master:9000</value>
</property>
</configuration>
hdfs-site.xml in master:
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/home/hdoop/dfsdata/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
hdfs-site.xml in slaves:
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/home/hdoop/dfsdata/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}. There's no particular reason to change it.
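If you do decide to override it, a minimal sketch of the extra core-site.xml property might look like this. The path /home/hdoop/tmpdata is only a hypothetical example, not something taken from your setup; any directory writable by the Hadoop user should work.

```xml
<!-- core-site.xml: optional override of the default /tmp/hadoop-${user.name} -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- hypothetical path, chosen for illustration only -->
    <value>/home/hdoop/tmpdata</value>
  </property>
</configuration>
```

This property would sit alongside the existing filesystem URI property in the same file.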
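Yes, reformatting the namenode removes all HDFS data, but it won't truncate your Hive metastore, which is stored in a separate database. Formatting is only needed when a cluster is first set up; after a configuration change, restarting the affected daemons is enough.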
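As for whether the posted files are correct: the master's hdfs-site.xml sets dfs.data.dir twice, so the namenode path is most likely overridden by the datanode one. In Hadoop 3.x, dfs.name.dir and dfs.data.dir are also deprecated names for dfs.namenode.name.dir and dfs.datanode.data.dir. Assuming the first entry was meant to be the namenode directory, a sketch of the master file using the current property names and your own paths could be:

```xml
<!-- hdfs-site.xml on the master; assumes the first dfs.data.dir entry
     in the question was intended to be the namenode directory -->
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hdoop/dfsdata/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hdoop/dfsdata/datanode</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

Similarly, fs.default.name in core-site.xml still works but is the deprecated name for fs.defaultFS.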