I am trying to load data from Hive using Spark. It is able to read the data recursively under the directory dt=2022-10-11, but it is not able to read from the -ext-10000 subdirectory. It also does not show any error.
hadoop fs -ls /user/warehouse/dbA/tableA/dt=2022-10-11/
hadoop fs -ls /user/warehouse/dbA/tableA/dt=2022-10-12/-ext-10000
I have used all of the below Spark settings to read the data from HDFS, on Spark 2.3:
--conf hive.exec.dynamic.partition=true
--conf hive.exec.dynamic.partition.mode=nonstrict
--conf mapreduce.input.fileinputformat.input.dir.recursive=true
--conf spark.hive.mapred.supports.subdirectories=true
--conf spark.hadoop.hive.supports.subdirectories=true
--conf spark.hadoop.hive.mapred.supports.subdirectories=true
--conf spark.hadoop.hive.input.dir.recursive=true
--conf spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive=true
--conf hive.exec.compress.output=true
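For context, a minimal way to exercise these settings is to query the table directly from the spark-sql CLI. This is a sketch, not the original job: the table name dbA.tableA is inferred from the HDFS paths above, and the count query is just a placeholder check.

```shell
# Hypothetical check that the recursive-input settings take effect.
# Table name dbA.tableA is an assumption based on the HDFS path
# /user/warehouse/dbA/tableA/; adjust to the real database/table.
spark-sql \
  --conf spark.hadoop.hive.mapred.supports.subdirectories=true \
  --conf spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive=true \
  -e "SELECT COUNT(*) FROM dbA.tableA WHERE dt='2022-10-11'"
```

If the count matches the row count of the files directly under the partition directory but not those under -ext-10000, the recursive settings are not being applied on the actual read path.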
I have added the below config and now Spark is able to read multiple files under subdirectories:

--conf spark.sql.hive.convertMetastoreOrc=false
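My understanding of why this works (an assumption, not verified against every Spark version): when spark.sql.hive.convertMetastoreOrc is true (the default), Spark replaces the Hive SerDe with its native ORC data source, which lists only the partition directory itself and ignores the Hive/MapReduce recursive-input settings. Setting it to false routes the read through the Hive SerDe path, which honors mapreduce.input.fileinputformat.input.dir.recursive and therefore picks up the files under -ext-10000. A sketch of the working invocation:

```shell
# Sketch of the working read: disable the native ORC conversion so the
# Hive SerDe path (which respects the recursive-input settings) is used.
# Table name dbA.tableA is an assumption inferred from the HDFS paths.
spark-sql \
  --conf spark.sql.hive.convertMetastoreOrc=false \
  --conf spark.hadoop.hive.mapred.supports.subdirectories=true \
  --conf spark.hadoop.mapreduce.input.fileinputformat.input.dir.recursive=true \
  -e "SELECT COUNT(*) FROM dbA.tableA WHERE dt='2022-10-11'"
```

Note that disabling the conversion also disables Spark's native ORC optimizations (such as vectorized reads), so it trades some performance for correct subdirectory handling.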