I have an HDFS cluster with two NameNodes. Normally, if I use an HDFS client to save data, it takes care of which NameNode to use when one of them is down.
But in Spark, for checkpointing, the API is: StreamingContext.checkpoint("hdfs://100.90.100.11:9000/sparkData").
Here I can only specify one of the NameNodes, and if that one goes down, Spark has no intelligence to switch to the second one.
Can anyone help me here?
Is there a way for Spark to understand the hdfs-site.xml (which has the information about both NameNodes) if I place this XML on the classpath?
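For reference, the client-side HA settings that such an hdfs-site.xml typically carries look roughly like the following, shown here set programmatically in Java for illustration; the nameservice name "mycluster" and the second NameNode address are placeholders:

import org.apache.hadoop.conf.Configuration;

// Client-side HDFS HA settings, normally supplied by hdfs-site.xml.
// "mycluster" and the second NameNode address are placeholders.
Configuration conf = new Configuration();
conf.set("dfs.nameservices", "mycluster");
conf.set("dfs.ha.namenodes.mycluster", "nn1,nn2");
conf.set("dfs.namenode.rpc-address.mycluster.nn1", "100.90.100.11:9000");
conf.set("dfs.namenode.rpc-address.mycluster.nn2", "100.90.100.12:9000");
conf.set("dfs.client.failover.proxy.provider.mycluster",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

With these settings the HDFS client resolves hdfs://mycluster to whichever NameNode is currently active.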
OK, I found the answer. You can use the syntax below to add resources such as core-site.xml and hdfs-site.xml (here sparkContext is your JavaSparkContext instance and ABC is your application class):
sparkContext.hadoopConfiguration().addResource(ABC.class.getClassLoader().getResource("core-site.xml"));
sparkContext.hadoopConfiguration().addResource(ABC.class.getClassLoader().getResource("hdfs-site.xml"));
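Putting it together, here is a minimal sketch of how this could look in a Java Spark Streaming app; the class name ABC, the app name, the batch interval, and the nameservice "mycluster" are placeholders:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class ABC {
    public static void main(String[] args) {
        JavaSparkContext sparkContext =
                new JavaSparkContext(new SparkConf().setAppName("checkpoint-ha"));

        // Load the Hadoop client configs from the classpath so Spark
        // knows about both NameNodes.
        sparkContext.hadoopConfiguration().addResource(ABC.class.getClassLoader().getResource("core-site.xml"));
        sparkContext.hadoopConfiguration().addResource(ABC.class.getClassLoader().getResource("hdfs-site.xml"));

        JavaStreamingContext ssc = new JavaStreamingContext(sparkContext, Durations.seconds(10));
        // Checkpoint against the logical nameservice rather than a single
        // NameNode host, so failover is handled by the HDFS client.
        ssc.checkpoint("hdfs://mycluster/sparkData");

        // ... define the streams, then:
        // ssc.start();
        // ssc.awaitTermination();
    }
}

The key point is that the checkpoint URI names the logical nameservice from hdfs-site.xml instead of a single NameNode's address, so the HDFS client can fail over transparently.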