import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

cols_X = ['f1', 'f2']
df_train = pd.DataFrame([[1, 3, 4], [2, 5, 1], [7, 8, 7]], columns=['f1', 'f2', 'label'])
df_test = pd.DataFrame([[2, 5], [3, 1]], columns=cols_X)

tree = DecisionTreeRegressor()
tree.fit(df_train[cols_X], df_train['label'])

# Write the tree's rules to a text file (path is assumed to be defined earlier)
with open(path + "myTree.txt", "w") as file:
    file.write(export_text(tree, feature_names=cols_X))
    file.write("\n")

input_tree = pd.read_csv(path + "myTree.txt")  # not sure if this should be read as CSV
A sklearn regression tree has been trained and exported as a txt file. How do I import it and apply it to the test data to make predictions, the way .predict() would? I am totally unfamiliar with the data structure in the txt file, and I am not even sure I should read it as 'txt'.
You are missing something important here.
The text that export_text wrote to your .txt file is meant for interpretability, not for reloading the model for prediction.
It prints the decision tree's rules as a human-readable report. That report is a readable view of what the model does internally; it is neither a serialized model nor a format scikit-learn can load, so you cannot rebuild a model from this txt file with the library.
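To see this concretely, here is a minimal sketch using the training data from the question. It only inspects what export_text returns; random_state=0 is my own addition so the tree is reproducible.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

cols_X = ['f1', 'f2']
df_train = pd.DataFrame([[1, 3, 4], [2, 5, 1], [7, 8, 7]], columns=['f1', 'f2', 'label'])

# random_state=0 is an assumption added for reproducibility
tree = DecisionTreeRegressor(random_state=0)
tree.fit(df_train[cols_X], df_train['label'])

rules = export_text(tree, feature_names=cols_X)

print(type(rules))                 # export_text returns a plain Python str
print(hasattr(rules, "predict"))   # a str has no predict method
print(rules)                       # the human-readable split rules
```

What ends up in the txt file is just this string, which is why pd.read_csv has nothing meaningful to parse.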
Instead, rebuild your model and save it with pickle.
Example:
import pickle
from sklearn.neighbors import KNeighborsClassifier

# Train the model on the training set (X_train, y_train, X_test, y_test are your own data)
model_clf = KNeighborsClassifier(n_neighbors=3)
model_clf.fit(X_train, y_train)

# Save the classifier using pickle
with open("model_clf_pickle", "wb") as f:
    pickle.dump(model_clf, f)

# Load the classifier using pickle
with open("model_clf_pickle", "rb") as f:
    my_model_clf = pickle.load(f)

result_score = my_model_clf.score(X_test, y_test)
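The same pattern should carry over to the regression tree from the question. Below is a rough sketch under a few assumptions of mine: the file name myTree.pkl is hypothetical, a temporary directory stands in for your path, and random_state=0 is added only for reproducibility.

```python
import os
import pickle
import tempfile

import pandas as pd
from sklearn.tree import DecisionTreeRegressor

cols_X = ['f1', 'f2']
df_train = pd.DataFrame([[1, 3, 4], [2, 5, 1], [7, 8, 7]], columns=['f1', 'f2', 'label'])
df_test = pd.DataFrame([[2, 5], [3, 1]], columns=cols_X)

# random_state=0 is an assumption added for reproducibility
tree = DecisionTreeRegressor(random_state=0)
tree.fit(df_train[cols_X], df_train['label'])

# A temporary directory stands in for the question's path variable
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "myTree.pkl")  # hypothetical file name

    # Save the fitted estimator itself, not its text report
    with open(model_path, "wb") as f:
        pickle.dump(tree, f)

    # Load it back as a DecisionTreeRegressor
    with open(model_path, "rb") as f:
        loaded_tree = pickle.load(f)

# The loaded object supports .predict() like the original
predictions = loaded_tree.predict(df_test)
print(predictions)
```

Only unpickle files you trust, and load them with the same scikit-learn version that saved them, since pickled estimators are not guaranteed to work across versions.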