python elasticsearch data-science-experience

Jupyter Python notebook using Elasticsearch


I am using Elasticsearch from a Jupyter Python notebook in DSX. When I write a DataFrame to Object Storage, I get an error:

ratings_df.write.format("org.elasticsearch.spark.sql").save("swift://DSConnections.spark/ratings.es")

Py4JJavaError: An error occurred while calling o96.save. : java.lang.ClassNotFoundException: Failed to find data source: org.elasticsearch.spark.sql. Please find packages at http://spark-packages.org


Solution

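  • You need to install the Elasticsearch connector. The ClassNotFoundException means Spark cannot find the org.elasticsearch.spark.sql data source on its classpath.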

    import pixiedust
    pixiedust.installPackage("org.elasticsearch:elasticsearch-spark_2.10:2.4.4")

    Reference for the PixieDust package manager:

    http://datascience.ibm.com/docs/content/analyze-data/Package-Manager.html#installfrommavensearch

    Thanks, Charles.
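
As a follow-up, here is a minimal sketch of how the write might look once the connector is installed (PixieDust typically asks for a kernel restart after installing a package). With this data source, save() generally expects an Elasticsearch resource such as "index/type" rather than a file path. The host, port, and the "ratings/rating" resource below are assumptions for illustration, as are the helper names es_options and write_to_es; es.nodes and es.port are connector settings from elasticsearch-hadoop.

```python
# Sketch only: host, port, and the "ratings/rating" resource are assumed values.
ES_FORMAT = "org.elasticsearch.spark.sql"  # data source name from the question


def es_options(host="localhost", port=9200):
    """Build connector settings (hypothetical helper).

    es.nodes and es.port are elasticsearch-hadoop configuration keys.
    """
    return {"es.nodes": host, "es.port": str(port)}


def write_to_es(df, resource, options):
    """Write a Spark DataFrame to an Elasticsearch resource ("index/type")."""
    (df.write
       .format(ES_FORMAT)
       .options(**options)
       .mode("append")
       .save(resource))


# Usage in the notebook, after installing the package and restarting the kernel:
# write_to_es(ratings_df, "ratings/rating", es_options("my-es-host", 9200))
```

Keeping the settings in a small dictionary makes it easy to point the same notebook at a different cluster without touching the write call.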