multisampling

Convert numpy.ndarray to DataFrame, maintaining indices after upsampling


X_train_1, X_test_1, y_train_1, y_test = train_test_split(x, y,
                                              test_size = .3)

X_train_sam, y_train_sam = ADASYN(random_state=42).fit_sample(X_train_1, y_train_1)

type(X_train_1)
pandas.core.frame.DataFrame

X_train_1.shape
(1668, 353)

type(X_train_sam)
numpy.ndarray

X_train_sam.shape
(2698, 353)

How can I convert X_train_sam back to a DataFrame with the same columns as X_train_1, keeping the original indices and assigning new indices to the added rows?


Solution

  • Build a DataFrame from the array and reuse the column names from X_train_1 (the index will be a fresh 0..n-1 range):

    result = pd.DataFrame(X_train_sam)
    result.columns = X_train_1.columns
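To also carry over the original index, one option is to reuse the labels of X_train_1 for the first rows and generate new labels for the rest. The sketch below rests on two assumptions that should be checked against your own data: that the oversampler returns the original rows first and in their original order, with the synthetic rows appended after them, and that the index holds integers. The helper name `restore_frame` is made up for illustration, and the small arrays stand in for the real X_train_1 and X_train_sam.

```python
import numpy as np
import pandas as pd


def restore_frame(X_resampled, X_original):
    """Rebuild a DataFrame from an oversampled array, reusing the original index.

    Assumption: the first len(X_original) rows of X_resampled are the original
    rows in their original order; synthetic rows are appended after them.
    """
    n_orig = len(X_original)
    n_new = len(X_resampled) - n_orig
    if n_new < 0:
        raise ValueError("resampled array has fewer rows than the original")

    # New labels start just past the largest existing label (integer index assumed).
    start = int(X_original.index.max()) + 1
    new_labels = pd.Index(np.arange(start, start + n_new))

    index = X_original.index.append(new_labels)
    return pd.DataFrame(np.asarray(X_resampled), index=index,
                        columns=X_original.columns)


# Small stand-ins for X_train_1 (shuffled index, as after train_test_split)
# and X_train_sam (original rows followed by two synthetic rows).
X_train_1 = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]},
                         index=[10, 3, 7])
synthetic = np.array([[1.5, 4.5], [2.5, 5.5]])
X_train_sam = np.vstack([X_train_1.to_numpy(), synthetic])

result = restore_frame(X_train_sam, X_train_1)
print(result)
```

A quick sanity check for the ordering assumption is to compare `X_train_sam[:len(X_train_1)]` with `X_train_1.to_numpy()` using `np.array_equal` before trusting the restored index.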