I'm pretty new to coding so I apologize for this being stupid question. I'm writing a spark function that takes in a file path and file type and creates a dataframe. If the input is invalid, I want to just print some sort of error message and return an empty dataframe. Would I use try except?
def rdf(name, type):
try:
df=spark.read.format(type).load(name)
return df
except ____ as error:
print(error)
return "" #I want to return an empty RDD here, but I can't figure out how to make one
How do I know what goes in the ____? I tried org.apache.spark.SparkException because that's the error I get when I pass in a .csv file as a parquet and it breaks but that isn't working
You can catch multiple exceptions in the try-except block; for instance:
def rdf(name, type):
try:
df=spark.read.format(type).load(name)
return df
except (SparkException, TypeError) as error:
print(error)
return ""
You could replace or add errors to that tuple.
Using a Exception
will potentially silence errors that are unrelated to your code (like a networking issue if name is an S3 path). That is probably something you want your program to not handle.