When reading or writing a DataFrame in PySpark with the HDFS NameNode hostname and port specified explicitly:
df.write.parquet("hdfs://namenode:8020/test/go", mode="overwrite")
Is there a way to see which specific DataNode hosts/ports the NameNode returns to Spark for that read or write?
It turned out I only needed to set the Spark log level to DEBUG:
spark.sparkContext.setLogLevel("DEBUG")
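Setting the whole driver to DEBUG can be very noisy. As an alternative, you can raise only the HDFS client's logger, since the block locations and the write pipeline of DataNodes are logged by the Hadoop HDFS client classes. This is a sketch assuming a Spark distribution that uses a log4j 1.x-style `conf/log4j.properties` (newer Spark 3.3+ builds use `log4j2.properties` with a different syntax), and assuming the client logger name `org.apache.hadoop.hdfs.DFSClient`:

```properties
# Assumption: log4j 1.x-style conf/log4j.properties shipped with Spark.
# Enable DEBUG only for the HDFS client instead of the whole driver.
log4j.logger.org.apache.hadoop.hdfs.DFSClient=DEBUG
```

With this in place, the driver/executor logs should include the DataNode addresses the NameNode handed back for each block, without the rest of Spark's DEBUG output.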