Tags: scala, apache-spark, scala-2.13, spark3, apache-spark-3.0

Spark can't connect to DB with built-in connection providers


I'm trying to connect to Postgres by following this document.

The document says "There is a built-in connection providers for the following databases:", but I can't connect with the code below. Can anyone help me resolve this, please?

import org.apache.spark.sql.SparkSession

// url, table, username and password are defined elsewhere in the app
val spark = SparkSession.builder()
  .appName("get-from-postgres")
  .master("local[*]")
  .getOrCreate()

val jdbcDF = spark.read
  .format("jdbc")
  .option("url", url)
  .option("dbtable", table)
  .option("user", username)
  .option("password", password)
  .load()

jdbcDF.show(10)

I always get this error when running my app.

[error] java.sql.SQLException: No suitable driver
[error]         at java.sql/java.sql.DriverManager.getDriver(DriverManager.java:298)
[error]         at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.$anonfun$driverClass$2(JDBCOptions.scala:107)
[error]         at scala.Option.getOrElse(Option.scala:201)
[error]         at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:107)
[error]         at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:39)
[error]         at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
[error]         at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:350)
[error]         at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
[error]         at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
[error]         at scala.Option.getOrElse(Option.scala:201)
[error]         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
[error]         at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:171)
[error]         at udw.uni.vn.loader.TESTLoader$.delayedEndpoint$udw$uni$vn$loader$TESTLoader$1(TESTLoader.scala:51)
[error]         at udw.uni.vn.loader.TESTLoader$delayedInit$body.apply(TESTLoader.scala:12)
[error]         at scala.Function0.apply$mcV$sp(Function0.scala:39)
[error]         at scala.Function0.apply$mcV$sp$(Function0.scala:39)
[error]         at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
[error]         at scala.App.$anonfun$main$1(App.scala:76)
[error]         at scala.App.$anonfun$main$1$adapted(App.scala:76)
[error]         at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:563)
[error]         at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:561)
[error]         at scala.collection.AbstractIterable.foreach(Iterable.scala:926)
[error]         at scala.App.main(App.scala:76)
[error]         at scala.App.main$(App.scala:74)
[error]         at udw.uni.vn.loader.TESTLoader$.main(TESTLoader.scala:12)
[error]         at udw.uni.vn.loader.TESTLoader.main(TESTLoader.scala)
[error]         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[error]         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[error]         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[error]         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
[error] stack trace is suppressed; run last Compile / run for the full output
[error] (Compile / run) java.sql.SQLException: No suitable driver
[error] Total time: 15 s, completed Jun 27, 2022, 4:49:01 PM

Solution

  • I finally found the solution: the JDBC drivers for the built-in connection providers are not bundled with Spark, so they have to be added as dependencies in build.sbt:

    libraryDependencies ++= Seq(
        "org.apache.spark" %% "spark-core" % "3.3.0",
        "org.apache.spark" %% "spark-sql" % "3.3.0",
        "joda-time" % "joda-time" % "2.10.14",
        "com.typesafe" % "config" % "1.4.1",
        // "org.mongodb.spark" % "mongo-spark-connector" % "10.0.2",
        "com.microsoft.sqlserver" % "mssql-jdbc" % "10.2.1.jre11",
        "mysql" % "mysql-connector-java" % "8.0.29",
        "org.postgresql" % "postgresql" % "42.4.0",
        "com.oracle.database.jdbc" % "ojdbc8" % "21.6.0.0.1",
        "org.mariadb.jdbc" % "mariadb-java-client" % "3.0.5",
    )
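
As a rough sketch of how to confirm the fix, the snippet below checks whether a driver class is visible on the classpath and builds the JDBC options with the driver named explicitly through Spark's "driver" option. The object name JdbcDriverCheck and its two helpers are hypothetical names used only for illustration; org.postgresql.Driver is the driver class shipped in the org.postgresql artifact listed above.

```scala
import scala.util.Try

// Hypothetical helper object, not part of Spark or the original post.
object JdbcDriverCheck {
  // Driver class provided by the "org.postgresql" % "postgresql" dependency.
  val PostgresDriver = "org.postgresql.Driver"

  // True if the named class can be loaded from the current classpath.
  def driverOnClasspath(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  // Options for spark.read.format("jdbc"), with the driver class set explicitly.
  def jdbcOptions(url: String, table: String, user: String, password: String): Map[String, String] =
    Map(
      "url"      -> url,
      "dbtable"  -> table,
      "user"     -> user,
      "password" -> password,
      "driver"   -> PostgresDriver
    )
}
```

With the variables from the question, the reader call would then be spark.read.format("jdbc").options(JdbcDriverCheck.jdbcOptions(url, table, username, password)).load(). If driverOnClasspath(JdbcDriverCheck.PostgresDriver) returns false, the dependency is most likely still missing from the build. Naming the driver explicitly is optional once the jar is present, but it should make a missing jar show up as a class-loading error rather than "No suitable driver".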