Similar to: SparklyR removing a Table from Spark Context, but different because:
The above question asks how to remove a "table" from Spark, in that case one created by the copy_to function. If the spark_read_csv() function is used instead, it appears that the resulting object has a different class.
my_csv <- spark_read_csv(sc, "name")
db_drop_table(my_csv)
returns:
Error in UseMethod("db_drop_table") :
no applicable method for 'db_drop_table' applied to an object of class "c('tbl_spark', 'tbl_sql', 'tbl_lazy', 'tbl')"
This further indicates that the object created here is not a table but a tbl, Hadley's data type of choice.
Therefore, how can I remove a specific tbl, and only that tbl, from the memory/session without exiting the full session?
Bonus: is there a button in the RStudio Server interface that I've missed that will perform this process for me? I can't see an obvious way to do this in the Spark connection tab.
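In general, sparklyr registers the data as a temporary view and, where applicable, caches it (by default the memory parameter for readers is set to TRUE). You can remove tables from the metastore using the dropTempView method: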
sc %>% spark_session() %>% invoke("catalog") %>%
invoke("dropTempView", "my_table")
or clear the cache with the clearCache method:
sc %>% spark_session() %>% invoke("catalog") %>%
invoke("clearCache")
Unless you're worried about name clashes, you should probably focus on the second one, although I'd recommend avoiding eager caching unless it is strictly necessary. Note that clearCache releases every cached table in the session, not just one.
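Since the question asks about removing one specific tbl and nothing else, the two catalog calls can be combined into a small helper. This is only a sketch: drop_spark_table is a hypothetical name, not part of sparklyr, and it assumes a Spark 2.x or later session whose Catalog exposes isCached, uncacheTable and dropTempView. It also assumes the table is still registered under the given name (isCached may raise an error otherwise).

```r
library(sparklyr)

# Hypothetical helper (not a sparklyr function): remove a single named
# table from the Spark session while leaving the connection and all
# other tables untouched.
drop_spark_table <- function(sc, name) {
  catalog <- invoke(spark_session(sc), "catalog")

  # Release the cached data for this table only, instead of calling
  # clearCache, which would evict every cached table.
  if (invoke(catalog, "isCached", name)) {
    invoke(catalog, "uncacheTable", name)
  }

  # Unregister the temporary view so the name can be reused.
  invoke(catalog, "dropTempView", name)
}
```

With the table from the question, this would be called as drop_spark_table(sc, "name"), using the name the table was registered under rather than the R variable. The R object my_csv then points at a view that no longer exists and can be discarded with rm(my_csv). To avoid eager caching in the first place, the readers accept memory = FALSE, and sparklyr's own tbl_uncache(sc, "name") covers the uncaching step if only the memory needs to be freed.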