I have trained an XGBoost regressor on data that has a different number of features from the test data I intend to predict on. Is there a way to work around this, or is there a model that can tolerate feature mismatches?
The training and test data ended up with different columns after one-hot encoding the categorical features.
import xgboost as xgb

best_xgb = xgb.XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
                            colsample_bynode=1, colsample_bytree=1, enable_categorical=False,
                            gamma=0, gpu_id=-1, importance_type=None,
                            interaction_constraints='', learning_rate=0.05, max_delta_step=0,
                            max_depth=6, min_child_weight=10, monotone_constraints='()',
                            n_estimators=400, n_jobs=4,
                            num_parallel_tree=1, predictor='auto', random_state=0, reg_alpha=0,
                            reg_lambda=1, scale_pos_weight=1, subsample=1, tree_method='exact',
                            validate_parameters=1, verbosity=None)
# X is the one-hot encoded training data; test_data was encoded separately
best_xgb.fit(X, y)
best_xgb.predict(test_data)
I get the following error: Shape Mismatch Error
Check which 249 - 235 = 14 features are missing from the test data.
Alternatively, fit the model on only the columns that the test data has:
best_xgb.fit(X[test_data.columns], y)
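Another common option is to keep the model as trained and align the test columns to the training columns instead. The sketch below is only an illustration on toy data: the column names and the train_raw/test_raw frames are made up, and it assumes both sets were encoded with pandas get_dummies. It lists the columns that differ, then uses DataFrame.reindex to add the missing dummy columns as zeros, drop categories the model never saw, and match the training column order.

```python
import pandas as pd

# Toy data standing in for the real frames (hypothetical column names).
train_raw = pd.DataFrame({"size": [1.0, 2.0, 3.0], "color": ["red", "green", "blue"]})
test_raw = pd.DataFrame({"size": [4.0, 5.0], "color": ["red", "yellow"]})

# Encoding each frame separately is what produces the column mismatch.
X = pd.get_dummies(train_raw, columns=["color"])
test_data = pd.get_dummies(test_raw, columns=["color"])

# Columns the model was trained on that the test data lacks, and the reverse.
missing_in_test = X.columns.difference(test_data.columns)
extra_in_test = test_data.columns.difference(X.columns)
print("missing in test:", list(missing_in_test))
print("only in test:", list(extra_in_test))

# Align test data to the training columns: missing dummies are filled with 0,
# categories unseen in training are dropped, and column order matches X.
test_aligned = test_data.reindex(columns=X.columns, fill_value=0)
```

With the real data, best_xgb.predict(test_aligned) should then receive the same features the model was fitted on. Whether dropping categories that appear only in the test set is acceptable depends on the data, so treat this as a starting point rather than a definitive fix.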