azure-databricks, azure-data-lake-gen2, shareplum

Using shareplum to copy a file to ADLS


I am trying to copy files (csv, image, ppt, docx, etc.) from SharePoint into ADLS using shareplum, but I am having trouble figuring out how to save the file.

import requests
from shareplum import Office365
from shareplum import Site
from shareplum.site import Version

fileName = "sharepoint file name"
path = "adls path"

basePath = "sharepoint base path"
folderPath = "sharepoint folder path"
sitePath = "sharepoint site path"

authcookie = Office365(basePath, username=username, password=password).GetCookies() 
site = Site(basePath + "/sites/" + sitePath, version=Version.v365, authcookie=authcookie)

folder = site.Folder(folderPath) 
file_content = folder.get_file(fileName)

This code works to get the file as a bytes object, but I am not sure how to save it into ADLS. I tried using some of the dbutils functions to save it, but I could not get them to work with a bytes object.


Solution

  • After you get the bytes object, you cannot save it directly to the storage account. Instead, write the data to a temporary file and then copy that file to the storage account.

    Use the code below:

    # Write the raw bytes returned by get_file() to a temporary file on DBFS
    with open("/dbfs/temp_file.csv", "wb") as output_file:
        output_file.write(file_content)
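
    This works with the standard Python open() call because Databricks mounts the DBFS root at /dbfs on the driver node, so local file APIs can write to it directly on most cluster configurations.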
    

    Then copy the temporary file to the storage account:

    dbutils.fs.cp("dbfs:/temp_file.csv", storage_account_path)
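
    Once the copy succeeds, you can optionally remove the temporary file to keep DBFS tidy:

    dbutils.fs.rm("dbfs:/temp_file.csv")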

    Make sure you have configured credentials for accessing the storage account.
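
    For example, if you authenticate with a storage account access key stored in a Databricks secret scope, a minimal sketch looks like this (the storage account, container, secret scope, and key names are placeholders, not values from the original post):

    # Placeholder names: replace <storage-account>, <secret-scope>,
    # <account-key-name>, and <container> with your own values
    spark.conf.set(
        "fs.azure.account.key.<storage-account>.dfs.core.windows.net",
        dbutils.secrets.get(scope="<secret-scope>", key="<account-key-name>")
    )
    storage_account_path = "abfss://<container>@<storage-account>.dfs.core.windows.net/<path>"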
