Spark's reader has the format function, which is used to specify a data source type, for example JSON, CSV, or a third-party source such as com.databricks.spark.redshift.
How can I check whether a third-party format exists or not? Let me give a case with two possible formats:
1. com.databricks.spark.redshift
2. io.github.spark_redshift_community.spark.redshift
How can I determine which of these libraries the user has put on the classpath? The options I have found so far are:
System.getProperty("java.class.path")
spark.read.format("..").load() in try/catch
I am looking for a proper and reliable solution.
I hope this answer helps.
To only check whether a Spark format exists, calling spark.read.format("..").load() in a try/catch is enough.
Data sources usually register themselves through the DataSourceRegister interface (and use shortName to provide their alias), so you can also use Java's ServiceLoader.load method to find all registered implementations of that interface.
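import java.util.ServiceLoader
import scala.collection.JavaConverters._
import org.apache.spark.sql.sources.DataSourceRegister

// Discover every DataSourceRegister implementation declared on the classpath
val formats = ServiceLoader.load(classOf[DataSourceRegister])
// Print the alias (short name) of each registered data source
formats.asScala.map(_.shortName).foreach(println)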
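For the case in the question, the format is given as a package name rather than a short alias, so a ServiceLoader scan alone may not find it. The following is a minimal sketch, not a definitive implementation, that combines the scan with a plain class lookup. It assumes that Spark resolves an unknown format name by also trying the class `<format>.DefaultSource`; the object FormatLookup and its methods are hypothetical names introduced only for this example.

```scala
import java.util.ServiceLoader
import scala.collection.JavaConverters._
import org.apache.spark.sql.sources.DataSourceRegister

// Hypothetical helper, not part of Spark.
object FormatLookup {
  // Prefer the context class loader, fall back to the loader of this object.
  private def loader: ClassLoader =
    Option(Thread.currentThread().getContextClassLoader)
      .getOrElse(getClass.getClassLoader)

  // True if the named class can be loaded; the class is not initialized.
  private def classExists(name: String): Boolean =
    try { Class.forName(name, false, loader); true }
    catch { case _: ClassNotFoundException | _: NoClassDefFoundError => false }

  // A format counts as available if it is a registered short name, or if
  // the name itself or "<name>.DefaultSource" is a loadable class.
  // Assumption: this only checks that a class exists, not that it is a
  // working data source.
  def isFormatAvailable(format: String): Boolean = {
    val registered = ServiceLoader
      .load(classOf[DataSourceRegister], loader)
      .asScala
      .exists(_.shortName().equalsIgnoreCase(format))
    registered || classExists(format) || classExists(s"$format.DefaultSource")
  }

  // First candidate that is available, in the order given.
  def firstAvailableFormat(candidates: Seq[String]): Option[String] =
    candidates.find(isFormatAvailable)
}
```

With the two Redshift connectors from the question, FormatLookup.firstAvailableFormat(Seq("io.github.spark_redshift_community.spark.redshift", "com.databricks.spark.redshift")) would return whichever one is on the classpath, or None if neither is. Unlike the try/catch around load(), this does not need connection options or a running read.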