Tags: amazon-web-services, azure, amazon-s3, cloud, blob

How to copy blobs from Azure to AWS using Python?


For example, I'm copying the specific file "container/dir/test1.txt" in VS Code, but I can't find a solution that doesn't require downloading the file locally or in chunks. I tried an Azure SAS URL, but that also streams the data in chunks, so it occupies memory.

sas_url = generate_sas_url(blob_client, account_name, account_key)

or

s3_object = s3client.get_object(Bucket=aws_bucket_name, Key=blob_name)

file_content = s3_object['Body'].read()


Solution

  • How to copy blobs from Azure to AWS using Python?

    To copy blobs from Azure to Amazon Web Services (AWS), I would suggest looking at Azure Data Factory, which may be a more efficient way to move data from Azure Blob Storage to AWS S3.

    Alternatively, you can use the Python code below, which copies a file from Azure Blob Storage to AWS S3.

    Code:

    from azure.storage.blob import BlobServiceClient
    import boto3
    
    # Azure Blob Storage credentials
    azure_account_name = 'venkat32612'
    azure_account_key = 'xxxxx'
    azure_container_name = 'test'
    azure_blob_name = 'sample.txt'
    
    # AWS S3 credentials
    aws_access_key_id = '<your_aws_access_key_id>'
    aws_secret_access_key = '<your_aws_secret_access_key>'
    aws_bucket_name = '<your_aws_bucket_name>'
    aws_object_key = '<your_aws_object_key>'
    
    # Connect to the storage account and get a client for the source blob
    blob_service_client = BlobServiceClient(account_url=f"https://{azure_account_name}.blob.core.windows.net", credential=azure_account_key)
    blob_client = blob_service_client.get_blob_client(container=azure_container_name, blob=azure_blob_name)
    
    # Read the whole blob into memory as bytes (nothing is written to disk)
    stream = blob_client.download_blob().readall()
    
    # Upload the bytes to S3 as a single object
    s3_client = boto3.client('s3', aws_access_key_id=aws_access_key_id, aws_secret_access_key=aws_secret_access_key)
    s3_client.put_object(Bucket=aws_bucket_name, Key=aws_object_key, Body=stream)
    

    The above code uses the Azure Blob Storage client library for Python to download the blob into memory, and then uses the AWS SDK for Python (Boto3) to upload those bytes to AWS S3. This avoids saving the file to local disk, but the entire blob is held in memory at once, so the approach is best suited to smaller files.
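
    For larger blobs, one possible variation is to pass the download through in pieces rather than reading it all at once. The sketch below is an illustration, not part of the original answer: `ChunkStream` and `copy_blob_to_s3` are hypothetical names, and it assumes that `download_blob().chunks()` yields `bytes` and that `upload_fileobj` accepts any object with a binary `read()` method. The data is still transferred in chunks, but only a bounded amount should sit in memory at a time.

```python
class ChunkStream:
    """Hypothetical adapter: exposes an iterator of bytes chunks through read()."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._buffer = bytearray()

    def read(self, size=-1):
        read_all = size is None or size < 0
        # Pull chunks until enough bytes are buffered or the source is exhausted
        while read_all or len(self._buffer) < size:
            try:
                self._buffer.extend(next(self._chunks))
            except StopIteration:
                break
        if read_all:
            size = len(self._buffer)
        data = bytes(self._buffer[:size])
        del self._buffer[:size]
        return data


def copy_blob_to_s3(blob_client, s3_client, bucket, key):
    """Hypothetical helper: stream one Azure blob into an S3 object."""
    downloader = blob_client.download_blob()
    # upload_fileobj pulls data by calling read() on the adapter
    s3_client.upload_fileobj(ChunkStream(downloader.chunks()), bucket, key)
```

    With the clients from the code above, this would be called as `copy_blob_to_s3(blob_client, s3_client, aws_bucket_name, aws_object_key)`.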

    Reference:

    File Transfer from Azure BLOB to AWS S3: Step-by-Step Guide, by Sarath Chandran (Litmus7 Systems Consulting, Medium).