apache-spark pyspark apache-spark-sql apache-zeppelin

How do I pass parameters to spark.sql(""" """)?


I'd like to pass a string to spark.sql

Here is my query:

mydf = spark.sql("SELECT * FROM MYTABLE WHERE TIMESTAMP BETWEEN '2020-04-01' AND '2020-04-08'")

I'd like to pass a string for the date.

I tried this code, but it doesn't work:

val = '2020-04-08'

s"spark.sql("SELECT * FROM MYTABLE WHERE TIMESTAMP  BETWEEN $val  AND '2020-04-08'

Solution

  • Use Python string formatting with {} and .format(val); $val interpolation is Scala syntax, not Python.

    val = '2020-04-08'
    
    spark.sql("SELECT * FROM MYTABLE WHERE TIMESTAMP BETWEEN '{}' AND '2020-04-08'".format(val)).show()

    Note the single quotes around {}: without them the substituted date is parsed as the arithmetic expression 2020-04-08 rather than a string literal.
    

    Example:

    In Pyspark:

    spark.sql("select * from tmp").show()
    #+----+---+
    #|name| id|
    #+----+---+
    #|   a|  1|
    #|   b|  2|
    #+----+---+
    
    id='1'
    
    spark.sql("select * from tmp where id={}".format(id)).show()
    #+----+---+
    #|name| id|
    #+----+---+
    #|   a|  1|
    #+----+---+
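
    The same substitution can be written with an f-string (Python 3.6+). A minimal sketch, assuming the same MYTABLE query from the question; building the SQL text in a variable first also lets you print it to check the quoting before running it:

    ```python
    # Hypothetical date bounds, as in the question.
    start = '2020-04-01'
    end = '2020-04-08'

    # The single quotes around {start} and {end} are required so the dates
    # land in the SQL text as string literals, not arithmetic expressions.
    query = f"SELECT * FROM MYTABLE WHERE TIMESTAMP BETWEEN '{start}' AND '{end}'"

    print(query)
    # mydf = spark.sql(query)  # run it once the printed text looks right
    ```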
    

    In Scala:

    Use string interpolation to substitute the value of the variable:

    val id=1
    spark.sql(s"select * from tmp where id=$id").show()
    //+----+---+
    //|name| id|
    //+----+---+
    //|   a|  1|
    //+----+---+
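
    As an aside: if you are on Spark 3.4 or later, PySpark's spark.sql accepts an args parameter with named markers like :start, which avoids hand-quoting values entirely. A sketch (requires an active SparkSession named spark, so the call itself is commented out here):

    ```python
    # Named parameter markers replace manual quoting/formatting (Spark 3.4+).
    query = "SELECT * FROM MYTABLE WHERE TIMESTAMP BETWEEN :start AND :end"

    # mydf = spark.sql(query, args={"start": "2020-04-01", "end": "2020-04-08"})
    print(query)
    ```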