python dask auto-sklearn

I get "UserWarning: Port 8787 is already in use" when using AutoSklearn. Why would AutoSklearn use ports?


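Here is my code:

import autosklearn.classification
import autosklearn.metrics
from sklearn.multiclass import OneVsRestClassifier

# X_train and y_train are assumed to be defined earlier.
automl = autosklearn.classification.AutoSklearnClassifier(
    include={'feature_preprocessor': ["no_preprocessing"]},
    exclude={'classifier': ['random_forest']},
    time_left_for_this_task=60 * 10,  # seconds
    per_run_time_limit=60 * 1,  # seconds
    memory_limit=1024 * 10,  # MB
    n_jobs=-1,
    metric=autosklearn.metrics.f1_macro,
)

clf = OneVsRestClassifier(automl, n_jobs=-1)

clf.fit(X_train, y_train)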

When I try to fit, I get this output:

/home/user/.local/lib/python3.8/site-packages/distributed/node.py:180: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42433 instead
  warnings.warn(
Killed

Why is AutoSklearn asking for Dask, and how do I fix this error?


Solution

  • Auto-sklearn uses Dask for parallel optimization, which is controlled by the n_jobs argument, as explained in more detail in the auto-sklearn documentation. The warning appears when a new Dask cluster is started while another one is already running: port 8787, the default port for the Dask dashboard, is then taken, so Dask picks a different one. One way to remove the warning is to guard the code, as the example in that documentation does, by placing it inside if __name__ == '__main__':.
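
To see why the guard helps, here is a small sketch that uses only the standard library (it is not auto-sklearn code). It assumes the Dask workers are started as fresh processes that re-import the launching script, which is how Python's multiprocessing "spawn" start method behaves: the script is re-run under the name __mp_main__ instead of __main__. The script text, the events list, and the run_as helper are made up for illustration.

```python
import os
import runpy
import tempfile
import textwrap

# Hypothetical stand-in for a training script. `events` is injected by
# run_as() below so we can record which parts of the script executed.
SCRIPT = textwrap.dedent(
    '''
    events.append("module-level code ran")

    def main():
        # In the real script, building AutoSklearnClassifier and calling
        # fit() would go here, since that is what starts the Dask cluster.
        events.append("main() ran")

    if __name__ == "__main__":
        main()
    '''
)


def run_as(run_name):
    """Execute SCRIPT with __name__ set to run_name; return what ran."""
    events = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # multiprocessing's "spawn" method re-runs the main script in each
        # child under the name "__mp_main__"; we imitate that here without
        # starting any processes.
        runpy.run_path(path, init_globals={"events": events}, run_name=run_name)
    return events


if __name__ == "__main__":
    print("launched directly:", run_as("__main__"))
    print("re-imported by a worker:", run_as("__mp_main__"))
```

When the script is launched directly, both the module-level code and main() run. When it is re-imported under another name, only the module-level code runs. In the real script, the same layout would mean moving the AutoSklearnClassifier construction and the fit call into main(), so that a re-import in a worker does not try to start a second cluster.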