docker mlflow mlops

Unable to connect to MLFLOW_TRACKING_URI when running MLflow run in Docker container


I have set up an MLflow server locally at http://localhost:5000

I followed the instructions at https://github.com/mlflow/mlflow/tree/master/examples/docker and tried to run the Docker example with

/mlflow/examples/docker$ mlflow run . -P alpha=0.5

but I encountered the following error.

2021/05/09 17:11:20 INFO mlflow.projects.docker: === Building docker image docker-example:7530274 ===
2021/05/09 17:11:20 INFO mlflow.projects.utils: === Created directory /tmp/tmp9wpxyzd_ for downloading remote URIs passed to arguments of type 'path' ===
2021/05/09 17:11:20 INFO mlflow.projects.backend.local: === Running command 'docker run --rm -v /home/mlf/mlf/0/ae69145133bf49efac22b1d390c354f1/artifacts:/home/mlf/mlf/0/ae69145133bf49efac22b1d390c354f1/artifacts -e MLFLOW_RUN_ID=ae69145133bf49efac22b1d390c354f1 -e MLFLOW_TRACKING_URI=http://localhost:5000 -e MLFLOW_EXPERIMENT_ID=0 docker-example:7530274 python train.py --alpha 0.5 --l1-ratio 0.1' in run with ID 'ae69145133bf49efac22b1d390c354f1' === 
/opt/conda/lib/python2.7/site-packages/mlflow/__init__.py:55: DeprecationWarning: MLflow support for Python 2 is deprecated and will be dropped in a future release. At that point, existing Python 2 workflows that use MLflow will continue to work without modification, but Python 2 users will no longer get access to the latest MLflow features and bugfixes. We recommend that you upgrade to Python 3 - see https://docs.python.org/3/howto/pyporting.html for a migration guide.
  "for a migration guide.", DeprecationWarning)
Traceback (most recent call last):
  File "train.py", line 56, in <module>
    with mlflow.start_run():
  File "/opt/conda/lib/python2.7/site-packages/mlflow/tracking/fluent.py", line 122, in start_run
    active_run_obj = MlflowClient().get_run(existing_run_id)
  File "/opt/conda/lib/python2.7/site-packages/mlflow/tracking/client.py", line 96, in get_run
    return self._tracking_client.get_run(run_id)
  File "/opt/conda/lib/python2.7/site-packages/mlflow/tracking/_tracking_service/client.py", line 49, in get_run
    return self.store.get_run(run_id)
  File "/opt/conda/lib/python2.7/site-packages/mlflow/store/tracking/rest_store.py", line 92, in get_run
    response_proto = self._call_endpoint(GetRun, req_body)
  File "/opt/conda/lib/python2.7/site-packages/mlflow/store/tracking/rest_store.py", line 32, in _call_endpoint
    return call_endpoint(self.get_host_creds(), endpoint, method, json_body, response_proto)
  File "/opt/conda/lib/python2.7/site-packages/mlflow/utils/rest_utils.py", line 133, in call_endpoint
    host_creds=host_creds, endpoint=endpoint, method=method, params=json_body)
  File "/opt/conda/lib/python2.7/site-packages/mlflow/utils/rest_utils.py", line 70, in http_request
    url=url, headers=headers, verify=verify, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/mlflow/utils/rest_utils.py", line 51, in request_with_ratelimit_retries
    response = requests.request(**kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/requests/adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=5000): Max retries exceeded with url: /api/2.0/mlflow/runs/get?run_uuid=ae69145133bf49efac22b1d390c354f1&run_id=ae69145133bf49efac22b1d390c354f1 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f5cbd80d690>: Failed to establish a new connection: [Errno 111] Connection refused',))
2021/05/09 17:11:22 ERROR mlflow.cli: === Run (ID 'ae69145133bf49efac22b1d390c354f1') failed ===

Any ideas on how to fix this? I tried adding the following to the MLproject file, but it doesn't help:

environment: [["network", "host"], ["add-host", "host.docker.internal:host-gateway"]]

Thanks for your help! =)


Solution

  • Run the MLflow server so that it listens on your machine's IP address instead of localhost, then point mlflow run at that IP instead of http://localhost:5000. The reason is that inside a Docker container, localhost refers to the container itself, not to your machine.
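
As a rough illustration of the point above, here is a minimal sketch of a helper that swaps a loopback host in a tracking URI for an address the container can reach. The function name container_reachable_uri and the address 192.168.1.10 are hypothetical, chosen only for this example; substitute your machine's actual IP.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names that resolve to the container itself when used inside Docker.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def container_reachable_uri(tracking_uri, host_ip):
    """Return tracking_uri with a loopback host replaced by host_ip.

    Hypothetical helper for illustration. URIs that already point at a
    non-loopback host are returned unchanged. Credentials embedded in the
    URI (user:password@) are not preserved by this sketch.
    """
    parts = urlsplit(tracking_uri)
    if parts.hostname not in LOOPBACK_HOSTS:
        return tracking_uri
    # Keep the original port, if one was given.
    netloc = host_ip if parts.port is None else "{}:{}".format(host_ip, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # 192.168.1.10 is an assumed example address, not taken from the question.
    print(container_reachable_uri("http://localhost:5000", "192.168.1.10"))
```

The resulting value is what you would export as MLFLOW_TRACKING_URI before calling mlflow run. The server also has to accept connections on that address: mlflow server binds to 127.0.0.1 by default, so it likely needs to be started with something like mlflow server --host 0.0.0.0 --port 5000 (or --host set to the machine IP).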