python, scikit-learn, decision-tree, cross-validation

How to plot a decision tree from GridSearchCV?


I was trying to plot the decision tree fitted with GridSearchCV, but it gives me an AttributeError:

AttributeError: 'GridSearchCV' object has no attribute 'n_features_'

However, if I plot a plain decision tree fitted without GridSearchCV, it renders successfully.

Code [decision tree without GridSearchCV]

# dtc_entropy: decision tree classifier based on entropy/information gain
# plotting: decision tree based on information gain/entropy

from sklearn.tree import export_graphviz
import graphviz

feature_names = x.columns

dot_data = export_graphviz(dtc_entropy, out_file=None, filled=True, rounded=True,
                                feature_names=feature_names,  
                                class_names=['0','1','2'])
graph = graphviz.Source(dot_data)  
graph                           ### --------------> WORKS 

Code [decision tree with GridSearchCV]

# plotting: decision tree fitted with GridSearchCV (dtc_gscv), based on information gain/entropy
from sklearn.tree import export_graphviz
import graphviz

feature_names = x.columns

dot_data = export_graphviz(dtc_gscv, out_file=None, filled=True, rounded=True,
                                feature_names=feature_names,  
                                class_names=['0','1','2'])
graph = graphviz.Source(dot_data)  
graph                            ##### ------------> ERROR

Error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-201-603524707f02> in <module>()
      6 dot_data = export_graphviz(dtc_gscv, out_file=None, filled=True, rounded=True,
      7                                 feature_names=feature_names,
----> 8                                 class_names=['0','1','2'])
      9 graph = graphviz.Source(dot_data)
     10 graph

1 frames
/usr/local/lib/python3.6/dist-packages/sklearn/tree/_export.py in export(self, decision_tree)
    393         # n_features_ in the decision_tree
    394         if self.feature_names is not None:
--> 395             if len(self.feature_names) != decision_tree.n_features_:
    396                 raise ValueError("Length of feature_names, %d "
    397                                  "does not match number of features, %d"

AttributeError: 'GridSearchCV' object has no attribute 'n_features_'

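Code for the decision tree based on GridSearchCV

from sklearn.tree import DecisionTreeClassifier
# gsc is assumed to be an alias for GridSearchCV, as the error message suggests
from sklearn.model_selection import GridSearchCV as gsc

dtc = DecisionTreeClassifier()

# use grid search to test every hyperparameter combination in parameter_grid
dtc_gscv = gsc(dtc, parameter_grid, cv=5, scoring='accuracy', n_jobs=-1)

# fit model to data
dtc_gscv.fit(x_train, y_train)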

One solution is to take the best parameters from GridSearchCV, build a new decision tree with those parameters, and plot that tree.

However, is there any way to plot the decision tree directly from the fitted GridSearchCV object?


Solution

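  • GridSearchCV is a wrapper around the estimator, not a tree itself, so export_graphviz cannot find the tree's attributes on it. After fitting with the default refit=True, the tree refit with the best parameters is available as best_estimator_, so pass that instead:

    dot_data = export_graphviz(dtc_gscv.best_estimator_, out_file=None,
                               filled=True, rounded=True,
                               feature_names=feature_names,
                               class_names=['0','1','2'])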
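
For reference, here is a minimal self-contained sketch of the same idea. It does not use the asker's data: the iris dataset, the `param_grid` values, and the names `search` and `best_tree` are assumptions chosen for illustration only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Illustrative data: iris stands in for the asker's x_train / y_train.
iris = load_iris()
X, y = iris.data, iris.target

# Hypothetical grid; the original parameter_grid is not shown in the question.
param_grid = {"criterion": ["gini", "entropy"], "max_depth": [2, 3, 4]}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",
)
search.fit(X, y)

# With refit=True (the default), best_estimator_ is a fitted
# DecisionTreeClassifier trained with the best parameter combination.
best_tree = search.best_estimator_

# Export the fitted tree, not the GridSearchCV wrapper.
dot_data = export_graphviz(
    best_tree,
    out_file=None,  # return the DOT source as a string
    filled=True,
    rounded=True,
    feature_names=iris.feature_names,
    class_names=iris.target_names.tolist(),
)
```

In a notebook, `graphviz.Source(dot_data)` should then render the tree as in the question. If the graphviz package is not installed, `sklearn.tree.plot_tree(best_tree)` is a matplotlib-based alternative that takes the same fitted estimator. Note that `best_estimator_` is only set when the search was created with `refit=True`.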