This question has a different answer from those given in the post linked above.
I am getting an error that reads:
pyspark.sql.utils.AnalysisException: u'Unable to infer schema for Parquet. It must be specified manually.;'
when I try to read in a Parquet file like this, using Spark 2.1.0:
data = spark.read.parquet('/myhdfs/location/')
I have checked that the file/table is not empty by looking at the Impala table through the Hue web portal. Also, other files that I have stored in similar directories read absolutely fine. For the record, the file names contain hyphens but no underscores or full stops/periods.
Hence, none of the answers in the following post apply: "Unable to infer schema when loading Parquet file".
Any ideas?
It turns out I was getting this error because there was another level in the directory structure, so the path I passed did not point at the directory that actually held the Parquet files. The following was what I needed:
data = spark.read.parquet('/myhdfs/location/anotherlevel/')
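If you hit the same error, it may help to check which directories actually hold the data files before calling spark.read.parquet. Below is a minimal sketch of that check, not part of my original fix. It assumes the data is reachable as an ordinary filesystem path (for HDFS you would list the directory with the HDFS tooling instead), and the helper name find_parquet_dirs and the file suffixes are my own assumptions for illustration.

```python
import os


def find_parquet_dirs(root, suffixes=(".parquet", ".parq")):
    """Return the directories under `root` that directly contain data files.

    Hypothetical helper: `suffixes` is an assumption about how the files
    are named, so adjust it to match what your writer produces.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # str.endswith accepts a tuple, so any listed suffix matches.
        if any(name.endswith(suffixes) for name in filenames):
            hits.append(dirpath)
    return sorted(hits)
```

If the directory you passed to spark.read.parquet is not in the returned list but one of its subdirectories is, you are probably in the situation described above and need to point the reader one level deeper.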