I'm following this installation guide but run into the following problem when using graphframes:
from pyspark import SparkContext
sc = SparkContext()
!pyspark --packages graphframes:graphframes:0.5.0-spark2.1-s_2.11
from graphframes import *
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
in ()
----> 1 from graphframes import *
ImportError: No module named graphframes
I'm not sure whether it is possible to install the package this way, but I'd appreciate your advice and help.
Good question!
Open up your .bashrc file and add this line:
export SPARK_OPTS="--packages graphframes:graphframes:0.5.0-spark2.1-s_2.11"
Once you have saved your .bashrc file, close it and run:
source .bashrc
Finally, open up your notebook and type:
from pyspark import SparkContext
sc = SparkContext()
sc.addPyFile('/home/username/spark-2.3.0-bin-hadoop2.7/jars/graphframes-0.5.0-spark2.1-s_2.11.jar')
After that, you should be able to run it.
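If you would rather not edit .bashrc, a commonly used alternative is to set the PYSPARK_SUBMIT_ARGS environment variable from inside the notebook before the SparkContext is created. The sketch below is only an illustration of that idea: the helper names set_submit_args and make_context are made up for this example, and the package coordinate is the one from the question, so it may need adjusting to your Spark and Scala versions.

```python
import os

# Maven coordinate copied from the question; adjust it to your Spark/Scala build.
GRAPHFRAMES_PACKAGE = "graphframes:graphframes:0.5.0-spark2.1-s_2.11"


def set_submit_args(package=GRAPHFRAMES_PACKAGE):
    """Ask PySpark to fetch `package` when it launches the JVM (hypothetical helper)."""
    # When PYSPARK_SUBMIT_ARGS is set by hand, it has to end with "pyspark-shell".
    args = "--packages {} pyspark-shell".format(package)
    os.environ["PYSPARK_SUBMIT_ARGS"] = args
    return args


def make_context():
    """Create the SparkContext once the submit args are in place (hypothetical helper)."""
    # Imported inside the function; the variable only has to be set
    # before the context (and its JVM) is started.
    from pyspark import SparkContext
    return SparkContext()


set_submit_args()
# In the notebook, before any SparkContext exists:
#   sc = make_context()
#   from graphframes import *
```

This only has an effect if no SparkContext is already running in the kernel, so restart the kernel first. If the import still fails afterwards, the sc.addPyFile call shown above remains an option.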