python google-cloud-platform google-cloud-storage

Limit download rate from google cloud storage (python library)


I would like to be able to limit the rate of a blob download from google cloud storage in Python.

I could not find any indication that this is possible in the official Python library or in the alternative GCSFS library.

My best guess so far is to implement it myself by downloading slices of the blob with the start and end arguments of download_as_bytes() and controlling the timing between slice requests, but 1) I would prefer a built-in solution if one exists, and 2) I am not sure this is the best approach.

Does anybody have a built-in solution or a better approach?


Solution

  • There is no built-in way to limit the download rate of a blob in Google Cloud Storage with the Python client. Instead, you can download the file in chunks yourself and control the timing between chunk requests.

    Here's a simple example:

    import time
    from google.cloud import storage
    
    def download_blob_rate_limited(bucket_name, blob_name, dest_file,
                                   chunk_size=1024*1024, rate_limit=512*1024):
        """Download a blob in chunks, sleeping between requests to cap the rate."""
        client = storage.Client()
        blob = client.bucket(bucket_name).blob(blob_name)
        blob.reload()  # fetch metadata; blob.size is None until the blob is reloaded
        
        with open(dest_file, 'wb') as file_obj:
            start = 0
            blob_size = blob.size
            while start < blob_size:
                end = min(start + chunk_size, blob_size)
                # The end argument of download_as_bytes() is inclusive.
                chunk = blob.download_as_bytes(start=start, end=end - 1)
                file_obj.write(chunk)
                # Sleep so each chunk takes at least its byte count / rate_limit seconds.
                time.sleep((end - start) / rate_limit)
                start = end
    
    # Usage example
    download_blob_rate_limited('my-bucket', 'my-blob', 'local_file.txt', rate_limit=512*1024)
    

    This downloads the file in chunks and limits the download rate using time.sleep().
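
    One caveat: sleeping a fixed amount per chunk ignores how long the request itself took, so the effective rate ends up below the target. Below is a sketch that subtracts the elapsed download time from the sleep budget; download_blob_paced and pacing_sleep are hypothetical names, not part of the library.

    ```python
    import time

    def pacing_sleep(n_bytes, rate_limit, elapsed):
        """Seconds left to sleep so n_bytes takes at least n_bytes / rate_limit seconds."""
        return max(0.0, n_bytes / rate_limit - elapsed)

    def download_blob_paced(bucket_name, blob_name, dest_file,
                            chunk_size=1024 * 1024, rate_limit=512 * 1024):
        # Local import so the pacing helper above works without the package installed.
        from google.cloud import storage

        client = storage.Client()
        blob = client.bucket(bucket_name).blob(blob_name)
        blob.reload()  # fetch metadata so blob.size is populated

        with open(dest_file, 'wb') as f:
            start = 0
            while start < blob.size:
                t0 = time.monotonic()
                end = min(start + chunk_size, blob.size)
                # The end argument of download_as_bytes() is inclusive.
                f.write(blob.download_as_bytes(start=start, end=end - 1))
                # Sleep only for the time budget not already spent downloading.
                time.sleep(pacing_sleep(end - start, rate_limit, time.monotonic() - t0))
                start = end
    ```

    If a chunk takes longer than its budget, pacing_sleep() returns 0 and the loop continues immediately, so the limit is an average rate cap rather than a hard throttle.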