curl, apache-spark, databricks

Using curl within a Databricks+Spark notebook


I'm running a Spark cluster using Databricks. I'd like to transfer data from a server using curl. For example,

curl -H "Content-Type: application/json" -H "auth:xxxx" -X GET "https://websites.net/Automation/Offline?startTimeInclusive=201609240100&endTimeExclusive=201609240200&dataFormat=json" -k > automation.json

How does one do this within a Databricks notebook (preferably in python, but Scala is also okay)?


Solution

  • In Scala, you can shell out to curl with sys.process. The command is passed through
    /bin/bash -c so that the > redirection is handled by the shell:

    import sys.process._

    // Full curl command, including the shell redirect to a local file on the driver node
    val command = """curl -H "Content-Type: application/json" -H "auth:xxxx" -X GET "http://google.com" -k > /home/user/automation.json"""

    // Run via bash -c; .!! returns stdout and throws an exception on a non-zero exit code
    Seq("/bin/bash", "-c", command).!!