python minio dvc

Installing DVC with MinIO storage


Has anybody set up DVC with MinIO storage?

I have read the docs, but not everything is clear to me.

Which commands should I use to set up MinIO storage with these parameters:

storage url: https://minio.mysite.com/minio/bucket-name/
login: my_login
password: my_password


Solution

  • Install

    I usually use DVC as a Python package. In that case, install it with the S3 extra:

    pip install "dvc[s3]"
    

    Set up the remote

    By default, DVC supports AWS S3 storage, and it works fine.
    It also supports "S3-compatible storage", MinIO in particular. In this case you have a bucket: a directory on the MinIO server where the actual data is stored (similar to an AWS bucket). For AWS, DVC can reuse the credentials configured for the AWS CLI; for MinIO, you need to pass the credentials to DVC itself (not to the minio package).

    The commands to set up MinIO as your DVC remote:

    # set up the default remote (change "bucket-name" to your MinIO bucket name)
    dvc remote add -d minio s3://bucket-name -f
    
    # add the storage URL (where "https://minio.mysite.com" is your MinIO URL)
    dvc remote modify minio endpointurl https://minio.mysite.com
    
    # add the MinIO credentials (e.g. taken from environment variables)
    dvc remote modify minio access_key_id my_login
    dvc remote modify minio secret_access_key my_password
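
The storage URL from the question is not passed to DVC as is: the host becomes the `endpointurl` and the last path segment becomes the bucket in `s3://bucket-name`. A minimal Python sketch of that mapping follows. It assumes the URL has the form `https://<host>/minio/<bucket>/` (or `https://<host>/<bucket>/`); the function names `split_minio_url` and `dvc_remote_commands` are hypothetical helpers, not part of DVC. The sketch also adds `--local` to the credential commands, which is my choice rather than something from the commands above: it writes the keys to `.dvc/config.local` instead of the `.dvc/config` file tracked by Git.

```python
import shlex
from urllib.parse import urlsplit


def split_minio_url(storage_url):
    """Split a MinIO storage URL into (endpoint URL, bucket name).

    Assumption: the path looks like /minio/<bucket>/ or /<bucket>/.
    """
    parts = urlsplit(storage_url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumed layout: a leading "minio" path prefix is not the bucket; drop it.
    if segments and segments[0] == "minio":
        segments = segments[1:]
    if not segments:
        raise ValueError(f"no bucket name found in {storage_url!r}")
    endpoint = f"{parts.scheme}://{parts.netloc}"
    return endpoint, segments[0]


def dvc_remote_commands(storage_url, access_key, secret_key, name="minio"):
    """Build the dvc commands for a MinIO remote as argument lists."""
    endpoint, bucket = split_minio_url(storage_url)
    return [
        ["dvc", "remote", "add", "-d", "-f", name, f"s3://{bucket}"],
        ["dvc", "remote", "modify", name, "endpointurl", endpoint],
        # --local keeps the credentials out of the Git-tracked .dvc/config
        ["dvc", "remote", "modify", "--local", name,
         "access_key_id", access_key],
        ["dvc", "remote", "modify", "--local", name,
         "secret_access_key", secret_key],
    ]


if __name__ == "__main__":
    commands = dvc_remote_commands(
        "https://minio.mysite.com/minio/bucket-name/",
        "my_login",
        "my_password",
    )
    for cmd in commands:
        # Print only; pass each list to subprocess.run to execute it.
        print(shlex.join(cmd))
```

The sketch only prints the commands, so you can review them before running anything. Note that a bucket literally named `minio` at the top level would be misread by this path rule.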
    

    If you are moving from an old remote, use the following commands to migrate your data:

    Before the setup (download the old remote's cache to the local machine; note that this may take a long time):

    dvc pull -r <old_remote_name> --all-commits --all-tags --all-branches
    

    After the setup (upload all local cache data to the new remote):

    dvc push -r <new_remote_name> --all-commits --all-tags --all-branches
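
If both remotes are configured under different names at the same time, the two migration steps can be scripted together. The following is a rough Python sketch under that assumption; `migrate` and `migration_commands` are hypothetical helper names, and the remote names are placeholders for your own. It defaults to a dry run that only prints the commands.

```python
import subprocess

# The same scope flags as in the commands above: every commit, tag and branch.
SCOPE_FLAGS = ["--all-commits", "--all-tags", "--all-branches"]


def migration_commands(old_remote, new_remote):
    """Return the pull-then-push commands as argument lists."""
    return [
        ["dvc", "pull", "-r", old_remote, *SCOPE_FLAGS],
        ["dvc", "push", "-r", new_remote, *SCOPE_FLAGS],
    ]


def migrate(old_remote, new_remote, dry_run=True):
    """Print the migration commands, or run them when dry_run is False."""
    commands = migration_commands(old_remote, new_remote)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises on failure, so the push never runs
            # after an incomplete pull.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # "old_remote" and "minio" are placeholder remote names.
    migrate("old_remote", "minio", dry_run=True)
```

Call `migrate("old_remote", "minio", dry_run=False)` from the repository root once the printed commands look right.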