I am using mlflow server to set up an MLflow tracking server. mlflow server has two command options that accept an artifact URI: --default-artifact-root <URI> and --artifacts-destination <URI>.
From my understanding, --artifacts-destination is used when the tracking server is serving the artifacts.
Here are scenarios 4 and 5 from the MLflow Tracking documentation:
mlflow server --backend-store-uri postgresql://user:password@postgres:5432/mlflowdb --default-artifact-root s3://bucket_name --host remote_host --no-serve-artifacts
# Artifact access is enabled through the proxy URI 'mlflow-artifacts:/',
# giving users access to this location without having to manage credentials
# or permissions.
mlflow server \
  --backend-store-uri postgresql://user:password@postgres:5432/mlflowdb \
  --artifacts-destination s3://bucket_name \
  --host remote_host
In the two scenarios, both --default-artifact-root and --artifacts-destination accept an S3 bucket URI, s3://bucket_name, as the argument. I fail to see why we need two separate command options for setting the artifact URI.
Their descriptions are:
--default-artifact-root <URI>
Directory in which to store artifacts for any new experiments created. For tracking server backends that rely on SQL, this option is required in order to store artifacts. Note that this flag does not impact already-created experiments with any previous configuration of an MLflow server instance. By default, data will be logged to the mlflow-artifacts:/ uri proxy if the --serve-artifacts option is enabled. Otherwise, the default location will be ./mlruns.
--artifacts-destination <URI>
The base artifact location from which to resolve artifact upload/download/list requests (e.g. ‘s3://my-bucket’). Defaults to a local ‘./mlartifacts’ directory. This option only applies when the tracking server is configured to stream artifacts and the experiment’s artifact root location is http or mlflow-artifacts URI.
What is the reason for having two command options? What happens if both are specified: will one URI take precedence over the other?
At first it looks confusing because you have a lot of flexibility: you can use both options or only one of them. Let's explain it a bit more :-)
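--default-artifact-root is the directory in which artifacts are stored for every new experiment. If it is not set, the default depends on whether --serve-artifacts is enabled (mlflow-artifacts:/) or not (./mlruns).
--artifacts-destination is used to specify where artifacts are located for HTTP requests served by the tracking server. It only applies when the tracking server is configured to stream artifacts (--serve-artifacts is enabled) AND the experiment's artifact root location is an http or mlflow-artifacts URI.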
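One way to read the two option descriptions together is as a small decision procedure. The sketch below is a toy model written only from the documentation text quoted in the question; it is not MLflow's actual implementation, and the function name resolve_storage, its return keys, and the inclusion of https among the proxied schemes are assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlparse

# Schemes treated as "go through the tracking server" (https is an assumption).
PROXY_SCHEMES = {"http", "https", "mlflow-artifacts"}


def resolve_storage(
    serve_artifacts: bool,
    default_artifact_root: Optional[str] = None,
    artifacts_destination: Optional[str] = None,
) -> dict:
    """Toy model of the documented behaviour; hypothetical, not MLflow code."""
    # Step 1: the artifact root recorded for each NEW experiment.
    if default_artifact_root is not None:
        experiment_root = default_artifact_root
    elif serve_artifacts:
        experiment_root = "mlflow-artifacts:/"
    else:
        experiment_root = "./mlruns"

    # Step 2: --artifacts-destination only matters when the server streams
    # artifacts AND the experiment root is an http or mlflow-artifacts URI.
    scheme = urlparse(experiment_root).scheme
    proxied = serve_artifacts and scheme in PROXY_SCHEMES

    if proxied:
        stored_in = artifacts_destination or "./mlartifacts"
    else:
        stored_in = experiment_root

    return {
        "experiment_root": experiment_root,
        "proxied": proxied,
        "stored_in": stored_in,
    }


# Both flags given: the root is the proxy URI, the bytes land in the bucket.
print(resolve_storage(True, "mlflow-artifacts:/", "s3://my-root-bucket"))
```

Under this reading the two flags do not compete when both are given. --default-artifact-root decides which URI is recorded for new experiments, and --artifacts-destination decides where the server puts the data when that URI points back at the server. If the root is a direct store URI such as s3://bucket_name, the destination would simply not come into play for those experiments.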
Case 1: Use both --default-artifact-root & --artifacts-destination:
mlflow server \
  --default-artifact-root mlflow-artifacts:/ \
  --artifacts-destination s3://my-root-bucket \
  --host remote_host \
  --serve-artifacts
Case 2: Use only --artifacts-destination
mlflow server \
  --artifacts-destination s3://my-root-bucket \
  --host remote_host \
  --serve-artifacts
Case 3: Use only --default-artifact-root
mlflow server \
  --default-artifact-root s3://my-root-bucket/mlartifacts \
  --serve-artifacts
In this case, the server can resolve all of the following patterns to the configured proxied object store location of s3://my-root-bucket/mlartifacts:
https://<host>:<port>/mlartifacts
http://<host>/mlartifacts
mlflow-artifacts://<host>/mlartifacts
mlflow-artifacts://<host>:<port>/mlartifacts
mlflow-artifacts:/mlartifacts
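To see what these five patterns have in common, they can be split into their URI components with Python's standard library. This is only an illustrative sketch: the host tracking.example.com and port 5000 are hypothetical stand-ins for <host> and <port>, and the code says nothing about how MLflow itself parses these URIs.

```python
from urllib.parse import urlparse

# Hypothetical values standing in for <host> and <port>.
HOST, PORT = "tracking.example.com", 5000

patterns = [
    f"https://{HOST}:{PORT}/mlartifacts",
    f"http://{HOST}/mlartifacts",
    f"mlflow-artifacts://{HOST}/mlartifacts",
    f"mlflow-artifacts://{HOST}:{PORT}/mlartifacts",
    "mlflow-artifacts:/mlartifacts",
]

parsed = [urlparse(p) for p in patterns]

# Print scheme, host part (or "-" when absent), and path for each pattern.
for p in parsed:
    print(f"{p.scheme:<17} {p.netloc or '-':<26} {p.path}")

# Collect the distinct paths across all patterns.
paths = {p.path for p in parsed}
```

The scheme and host part vary, and the last form has no host at all, but the path component is the same in every case.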