azurite

Is there a way to automatically create a container when starting Azurite?


For testing, I build and run an Azurite Docker image in a test pipeline. I would like the blob container to be created automatically once Azurite has started, as that would simplify things.

Is there any good way to achieve this?

For the Postgres image we use, we can specify an init.sql which is run on startup. If something similar is available for Azurite, that would be awesome.


Solution

  • You can use the following Dockerfile to install the azure-storage-blob Python package on the Alpine-based azurite image. The resulting image is ~400MB, compared to ~1.2GB for the azure-cli image.

    ARG AZURITE_VERSION="3.17.0"
    FROM mcr.microsoft.com/azure-storage/azurite:${AZURITE_VERSION}
    
    # Install azure-storage-blob python package
    RUN apk update && \
        apk --no-cache add py3-pip && \
        apk add --virtual=build gcc libffi-dev musl-dev python3-dev && \
        pip3 install --upgrade pip && \
        pip3 install azure-storage-blob==12.12.0
    
    # Copy init_azurite.py script
    COPY ./init_azurite.py init_azurite.py
    
    # Copy local blobs to azurite
    COPY ./init_containers init_containers
    
    # Run the blob emulator and initialize the blob containers
    CMD python3 init_azurite.py --directory=init_containers & \
        azurite-blob --blobHost 0.0.0.0 --blobPort 10000
    

    The init_azurite.py script is a local Python script that uses the azure-storage-blob package to batch upload files and directories to the azurite blob storage emulator.

    import argparse
    import os
    from time import sleep
    
    from azure.core.exceptions import ResourceExistsError
    from azure.storage.blob import BlobServiceClient, ContainerClient
    
    
    def upload_file(container_client: ContainerClient, source: str, dest: str) -> None:
        """
        Upload a single file to a path inside the container.
        """
        print(f"Uploading {source} to {dest}")
        with open(source, "rb") as data:
            try:
                container_client.upload_blob(name=dest, data=data)
            except ResourceExistsError:
                pass
    
    
    def upload_dir(container_client: ContainerClient, source: str, dest: str) -> None:
        """
        Upload a directory to a path inside the container.
        """
        prefix = "" if dest == "" else dest + "/"
        prefix += os.path.basename(source) + "/"
        for root, dirs, files in os.walk(source):
            for name in files:
                dir_part = os.path.relpath(root, source)
                dir_part = "" if dir_part == "." else dir_part + "/"
                file_path = os.path.join(root, name)
                blob_path = prefix + dir_part + name
                upload_file(container_client, file_path, blob_path)
    
    
    def init_containers(
        service_client: BlobServiceClient, containers_directory: str
    ) -> None:
        """
        Iterate over the containers directory and, for each subdirectory:
        1. Create the container named after it.
        2. Upload all of its folders and files to that container.
        """
        for container_name in os.listdir(containers_directory):
            container_path = os.path.join(containers_directory, container_name)
            if os.path.isdir(container_path):
                container_client = service_client.get_container_client(container_name)
                try:
                    container_client.create_container()
                except ResourceExistsError:
                    pass
                for blob in os.listdir(container_path):
                    blob_path = os.path.join(container_path, blob)
                    if os.path.isdir(blob_path):
                        upload_dir(container_client, blob_path, "")
                    else:
                        upload_file(container_client, blob_path, blob)
    
    
    if __name__ == "__main__":
        parser = argparse.ArgumentParser(
            description="Initialize azurite emulator containers."
        )
        parser.add_argument(
            "--directory",
            required=True,
            help="""
            Directory that contains subdirectories named after the
            containers that we should create. Each subdirectory will contain
            the files and directories of its container.
            """
        )
    
        args = parser.parse_args()
    
        # Connect to the localhost emulator after a fixed 5-second delay that
        # gives it time to start (a simple wait, not a real readiness check).
        sleep(5)
        blob_service_client = BlobServiceClient(
            account_url="http://localhost:10000/devstoreaccount1",
            credential={
                "account_name": "devstoreaccount1",
                "account_key": (
                    "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq"
                    "/K1SZFPTOtr/KBHBeksoGMGw=="
                )
            }
        )
    
        # Only initialize if not already initialized.
        if next(blob_service_client.list_containers(), None):
            print("Emulator already has containers, will skip initialization.")
        else:
            init_containers(blob_service_client, args.directory)
    

    This script is copied into the image and runs every time the azurite container starts. It populates the initial blob containers unless the emulator already holds containers persisted through Docker volumes, in which case it does nothing.
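
The fixed sleep(5) in the script is a guess at how long the emulator needs to start. A minimal sketch of an alternative, assuming polling is acceptable: retry a cheap probe until it succeeds or a deadline passes. The names wait_until_ready and probe are hypothetical and not part of azure-storage-blob; in the script above, the probe could be something like lambda: next(blob_service_client.list_containers(), None).

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], object],
    timeout: float = 30.0,
    interval: float = 0.5,
) -> None:
    """Call probe() repeatedly until it stops raising or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            probe()
            return  # the probe succeeded, so the service is reachable
        except Exception as exc:  # broad on purpose: any failure means "not ready yet"
            if time.monotonic() >= deadline:
                raise TimeoutError(f"service not ready after {timeout}s") from exc
            time.sleep(interval)
```

The broad except is a deliberate simplification; a stricter version would catch only the connection errors the client actually raises.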

    The following is an example docker-compose.yml file:

    services:
      azurite:
        build:
          context: ./
          dockerfile: Dockerfile
          args:
            AZURITE_VERSION: 3.17.0
        restart: on-failure
        ports:
          - 10000:10000
        volumes:
          - azurite-data:/opt/azurite
    
    volumes:
      azurite-data:
    

    Using such volumes will persist the emulator data until you destroy them (e.g. by using docker-compose down -v).
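
With port 10000 published as in the compose file above, test code running on the host can reach the emulator through Azurite's well-known development account. The sketch below builds a matching connection string. The helper names azurite_connection_string and list_container_names are hypothetical, and the default host and port are assumptions that match the compose file.

```python
from typing import List

# Well-known Azurite development credentials (the same account name and key
# that init_azurite.py uses).
ACCOUNT_NAME = "devstoreaccount1"
ACCOUNT_KEY = (
    "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq"
    "/K1SZFPTOtr/KBHBeksoGMGw=="
)


def azurite_connection_string(host: str = "127.0.0.1", port: int = 10000) -> str:
    """Build a blob-only connection string for a local Azurite instance."""
    return (
        "DefaultEndpointsProtocol=http;"
        f"AccountName={ACCOUNT_NAME};"
        f"AccountKey={ACCOUNT_KEY};"
        f"BlobEndpoint=http://{host}:{port}/{ACCOUNT_NAME};"
    )


def list_container_names(connection_string: str) -> List[str]:
    """List container names; needs azure-storage-blob and a running emulator."""
    # Imported lazily so the helper above works without the package installed.
    from azure.storage.blob import BlobServiceClient

    client = BlobServiceClient.from_connection_string(connection_string)
    return [container.name for container in client.list_containers()]
```

A test could call list_container_names(azurite_connection_string()) after the container is up to check that the expected containers were created.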

    Finally, init_containers is a local directory that contains the containers and their folders/files. It will be copied to the azurite container when the image is built.

    For example:

    init_containers:
       container-name-1:
         dir-1:
           file.txt
           img.png
         dir-2:
           file.txt
       container-name-2:
         dir-1:
           file.txt
         img.png
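
To make the mapping concrete, here is a small sketch that walks such a directory and reports which blob names each container would receive, without contacting the emulator. plan_uploads is a hypothetical helper that mirrors the naming logic of upload_dir and upload_file above; it is not part of the original script, and it always joins path segments with forward slashes.

```python
import os
from typing import Dict, List


def plan_uploads(containers_directory: str) -> Dict[str, List[str]]:
    """Map each container name to the sorted blob names its files would get."""
    plan: Dict[str, List[str]] = {}
    for container_name in sorted(os.listdir(containers_directory)):
        container_path = os.path.join(containers_directory, container_name)
        if not os.path.isdir(container_path):
            continue  # top-level files are ignored, as in init_containers
        blobs: List[str] = []
        for root, _dirs, files in os.walk(container_path):
            rel = os.path.relpath(root, container_path)
            parts = [] if rel == "." else rel.split(os.sep)
            for name in files:
                # The blob name is the file path relative to the container directory.
                blobs.append("/".join(parts + [name]))
        plan[container_name] = sorted(blobs)
    return plan
```

For the example tree above, container-name-1 would get dir-1/file.txt, dir-1/img.png and dir-2/file.txt, while container-name-2 would get dir-1/file.txt and img.png.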