I am trying to fit a decision-tree-based model. I tried the following:
X = df[['Quantity']]
y = df[['label']]
params = {'max_depth':[2,3,4], 'min_samples_split':[2,3,5,10]}
clf_dt = DecisionTreeClassifier()
clf = GridSearchCV(clf_dt, param_grid=params, scoring='f1')
clf.fit(X, y)
clf_dt = DecisionTreeClassifier(clf.best_params_)
and got the following warning:
FutureWarning: Pass criterion={'max_depth': 2, 'min_samples_split': 2} as keyword args. From version 1.0 (renaming of 0.25) passing these as positional arguments will result in an error
warnings.warn(f"Pass {args_msg} as keyword args. From version "
Later, I tried running the code below and got an error, even though I had already fit the model using .fit():
from sklearn import tree
tree.plot_tree(clf_dt, filled=True, feature_names = list(X.columns), class_names=['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'])
NotFittedError: This DecisionTreeClassifier instance is not fitted yet. Call
'fit' with appropriate arguments before using this estimator.
How can I fix this?
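The warning appears because DecisionTreeClassifier(clf.best_params_) passes the whole dictionary as the first positional argument, criterion, instead of setting max_depth and min_samples_split. The NotFittedError appears because that line creates a new, unfitted estimator; only the GridSearchCV object clf was fit.
If you go with best_params_, you'll have to refit the model with those parameters. Note that these should be unpacked when passed to the model: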
clf_dt = DecisionTreeClassifier(**clf.best_params_)
clf_dt.fit(X, y)
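However, you can also use the best_estimator_ attribute to access the best model directly. It is already fitted, because GridSearchCV refits the best configuration on the whole dataset by default (refit=True):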
clf_dt = clf.best_estimator_
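For reference, here is a minimal end-to-end sketch of both options. It assumes scikit-learn's bundled iris data as a stand-in for df, and it swaps scoring='f1' for 'f1_macro' because the stand-in target has three classes. The names search, clf_refit and clf_best are only illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils.validation import check_is_fitted

# Stand-in data (assumption): the iris dataset replaces the question's df.
X, y = load_iris(return_X_y=True)

params = {'max_depth': [2, 3, 4], 'min_samples_split': [2, 3, 5, 10]}

# 'f1_macro' is used here because the stand-in target is multiclass.
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid=params, scoring='f1_macro')
search.fit(X, y)

# Option 1: build a new estimator from the unpacked best parameters and fit it.
clf_refit = DecisionTreeClassifier(random_state=0, **search.best_params_)
clf_refit.fit(X, y)

# Option 2: reuse the estimator that GridSearchCV already refit (refit=True).
clf_best = search.best_estimator_

# Both raise NotFittedError if the estimator is not fitted; here they pass.
check_is_fitted(clf_refit)
check_is_fitted(clf_best)
```

Either clf_refit or clf_best can then be passed to tree.plot_tree without triggering the NotFittedError.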