Tags: python, azure-active-directory, azure-blob-storage, chunks, azure-files

Upload files and resume uploads in Azure Blob Storage with Python


Below is the code for uploading files in chunks:

from azure.storage.blob import BlobServiceClient

azure_container = "dummy-container"
file_path = "test.txt"
chunk_size = 4 * 1024 * 1024  # 4 MiB

blob_service_client = BlobServiceClient.from_connection_string(azure_connection_string)
blob_client = blob_service_client.get_blob_client(container=azure_container, blob="testingfile.txt")

with open(file_path, 'rb') as datax:
    chunk_data = datax.read(chunk_size)
    print(len(chunk_data))
    blob_client.upload_blob(chunk_data, overwrite=True)

I want to resume the upload if something goes wrong during uploading. For that I'm reading the file in chunks and recording which chunks have been uploaded so I can continue from where it left off. But how do I upload the remaining chunks without overwriting what is already there? In other words, how do I continue uploading the same file after an interruption?


Solution

  • For this, `async` and `await` are working fine. But if there is a better solution, please post it.

    import asyncio

    async def uploadin_files(file_path):
        for files_to_upload in file_path:
            await asyncio.sleep(2)  # non-blocking pause; time.sleep would block the event loop
            blob_client = blob_service_client.get_blob_client(container=azure_container, blob=files_to_upload)
            with open(files_to_upload, 'rb') as datax:
                chunk_data_original = datax.read()
                blob_client.upload_blob(chunk_data_original, overwrite=True)
                print(files_to_upload, "Done uploading")

    await uploadin_files(file_path_main)
    

    It works even after disconnecting the network for some time and reconnecting!