azure azure-blob-storage azure-storage cloud-storage opendata

How to Minimize Egress Costs in Azure for Public Open Data Sharing


I am currently exploring a solution to host large-scale sensor data (ranging from GBs to TBs weekly) in Microsoft Azure as part of an Open Data initiative. The key objective is to store and publicly share this unstructured data efficiently, both technically and economically.

Azure Blob Storage looks like a natural fit for this use case.

However, I have concerns regarding outbound data transfer costs. While inbound data (uploading to Azure) is free, Azure charges for egress (outbound) traffic when data is downloaded by users. Since these datasets could be accessed frequently by the public, and the size ranges from TBs to potentially PBs, my assumption is that the egress costs could make this approach economically unviable.
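To make the concern concrete, here is a rough back-of-the-envelope sketch of the arithmetic. The per-GB price used in the example call is a placeholder assumption, not an actual Azure rate; the real figure depends on region, routing, and volume tier and should be taken from the current Azure bandwidth pricing page. The function name is hypothetical.

```python
def monthly_egress_cost(tb_downloaded_per_month: float, price_per_gb_usd: float) -> float:
    """Estimate the monthly egress bill for a given download volume.

    price_per_gb_usd is an ASSUMED flat rate; real pricing is tiered
    and varies by region.
    """
    gb_downloaded = tb_downloaded_per_month * 1024  # 1 TB = 1024 GB
    return gb_downloaded * price_per_gb_usd


if __name__ == "__main__":
    # Placeholder rate of 0.08 USD/GB, chosen only for illustration.
    for tb in (1, 10, 100, 1000):
        cost = monthly_egress_cost(tb, price_per_gb_usd=0.08)
        print(f"{tb:>5} TB downloaded per month: about {cost:,.0f} USD")
```

Because the cost scales linearly with the volume downloaded rather than the volume stored, a popular dataset can cost far more to serve than to keep.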

My Questions Are:

  1. Is there a way to reduce or avoid Azure egress costs for publicly shared Open Data using Blob Storage or other Azure services? (For example: special configurations, settings, or leveraging certain Azure pricing plans.)

  2. Is it possible for clients/users downloading the data to cover the egress costs directly rather than the data owner?

  3. Are there alternative Azure services or architectures better suited to hosting public Open Data with minimal or no egress costs?

My main goal is to minimize expenses related to serving the public with large-scale datasets while adhering to the principles of Open Data.


Solution

  • Is there a way to reduce or avoid Azure egress costs for publicly shared Open Data using Blob Storage or other Azure services? (For example: special configurations, settings, or leveraging certain Azure pricing plans.)

    Is it possible for clients/users downloading the data to cover the egress costs directly rather than the data owner?

    Azure does not natively support transferring egress costs to end-users.

    You can, however, set up users with SAS tokens that grant access once they have covered the data costs. Here is the documentation.

    Are there alternative Azure services or architectures better suited to hosting public Open Data with minimal or no egress costs?

    Other Azure services, such as Azure Data Share or Azure Data Lake, still incur egress charges, although they do provide operational benefits.

    If you truly need minimal costs, consider external platforms such as the Internet Archive or Zenodo for hosting public datasets. Here is the documentation.
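The SAS-token approach mentioned above could look roughly like the following. This is a minimal sketch, assuming the `azure-storage-blob` Python SDK is installed and that you authenticate with the storage account key; the helper names `issue_read_sas` and `build_download_url` and the 24-hour default lifetime are hypothetical choices for illustration. Note that a SAS token only gates access: the egress is still billed to the storage account owner, so any cost recovery from users has to happen outside Azure.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote


def issue_read_sas(account_name, account_key, container, blob_name, valid_hours=24):
    """Return a read-only SAS token for one blob, valid for a limited time."""
    # Imported lazily so build_download_url works without the SDK installed.
    # Requires: pip install azure-storage-blob
    from azure.storage.blob import BlobSasPermissions, generate_blob_sas

    return generate_blob_sas(
        account_name=account_name,
        container_name=container,
        blob_name=blob_name,
        account_key=account_key,
        permission=BlobSasPermissions(read=True),  # download only
        expiry=datetime.now(timezone.utc) + timedelta(hours=valid_hours),
    )


def build_download_url(account_name, container, blob_name, sas_token):
    """Combine the blob endpoint and a SAS token into a shareable URL."""
    path = quote(f"{container}/{blob_name}")  # keeps "/" and escapes spaces
    return f"https://{account_name}.blob.core.windows.net/{path}?{sas_token}"
```

A user who has been approved would then receive the result of `build_download_url(...)` and could download the blob until the token expires.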