Tags: pyspark, ibm-cloud, dashdb, pixiedust

DSX PySpark writing data to dashDB with Custom JDBC dialect


In IBM Bluemix I have created a DSX PySpark notebook, with Python 2.6 and Spark 2.0, using IBM dashDB as my data storage. I can authenticate and read tables successfully, but when I try to write back to a new table I hit the exact same issue described in this link.
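For context, the read that works and the write that fails look roughly like this, with placeholder connection details standing in for my actual dashDB service credentials, and using the spark session handle that the DSX Spark 2.0 kernel predefines:

# Placeholder dashDB connection details from the Bluemix service credentials.
url = "jdbc:db2://<dashdb-host>:50000/BLUDB"
props = {
    "user": "<username>",
    "password": "<password>",
    "driver": "com.ibm.db2.jcc.DB2Driver"
}

# Reading an existing table works fine.
df = spark.read.jdbc(url=url, table="MYSCHEMA.EXISTING_TABLE", properties=props)

# Writing to a new table fails: Spark's default JDBC mapping turns StringType
# into TEXT, which dashDB does not support.
df.write.jdbc(url=url, table="MYSCHEMA.NEW_TABLE", mode="overwrite", properties=props)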

To fix this, it was suggested that I register a custom dashDB JDBC dialect through a Scala bridge with the PixieDust library, but when I reach that stage in my notebook I keep getting the following error:

pixiedustRunner.scala:13: error: type BeanProperty is not a member of package reflect
    @scala.reflect.BeanProperty

The Scala bridge code in PySpark from the second link:

%%scala cl=dialect global=true
import org.apache.spark.sql.jdbc._
import org.apache.spark.sql.types.{StringType, BooleanType, DataType}

object dashDBCustomDialect extends JdbcDialect {
    // The snippet from the link references maxStringColumnLength without
    // defining it; an assumed cap is used here (DB2's VARCHAR limit is 32672 bytes).
    val maxStringColumnLength = 1024

    // Apply this dialect to any DB2/dashDB JDBC URL.
    override def canHandle(url: String): Boolean = url.startsWith("jdbc:db2")

    // Map StringType to a bounded VARCHAR instead of Spark's default TEXT
    // (unsupported in dashDB), and BooleanType to CHAR(1).
    override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
            case StringType => Option(JdbcType("VARCHAR(" + maxStringColumnLength + ")", java.sql.Types.VARCHAR))
            case BooleanType => Option(JdbcType("CHAR(1)", java.sql.Types.CHAR))
            case _ => None
    }
}
JdbcDialects.registerDialect(dashDBCustomDialect)
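The intent, as I understand it, is that once the dialect is registered it is picked up automatically for any jdbc:db2 URL (canHandle matches on the URL prefix), so the same write as above should then succeed without further changes:

# Retry the write from above (placeholder url and props as before).
df.write.jdbc(url=url, table="MYSCHEMA.NEW_TABLE", mode="overwrite", properties=props)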

What is the issue here?


Solution

  • This is a known issue in PixieDust caused by an API change: BeanProperty moved from the scala.reflect package in Scala 2.10 to the scala.beans package in Scala 2.11. A fix will be provided shortly, but in the meantime you can work around this error by using Spark 1.6, which uses Scala 2.10. You can verify which versions your kernel is actually running, as sketched below.
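    A small sketch for checking the versions from PySpark, assuming the sc handle that DSX notebooks predefine (the Scala version is read from the JVM via the py4j gateway):

    print(sc.version)                                     # Spark version, e.g. 1.6.x
    print(sc._jvm.scala.util.Properties.versionString())  # e.g. "version 2.10.5"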