python machine-learning scikit-learn standardization

How to find the StandardScaler parameters .mean_ and .scale_ when using ColumnTransformer from scikit-learn?


I want to apply StandardScaler only to the numerical columns of my dataset using sklearn.compose.ColumnTransformer (the rest is already one-hot encoded). I would like to see the .scale_ and .mean_ parameters fitted to the training data, but scaler.mean_ and scaler.scale_ do not work when the scaler is used inside a column transformer. Is there a way to do so?

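from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# X, y and numerical_variables (the list of numerical column names) are defined elsewhere
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)

scaler = StandardScaler()
data_pipeline = ColumnTransformer([
    ('numerical', scaler, numerical_variables)], remainder='passthrough')

X_train = data_pipeline.fit_transform(X_train)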

Solution

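  • ColumnTransformer fits clones of the transformers you pass in, so the original scaler object is never fitted itself. The fitted transformers are available in the attributes transformers_ (a list of (name, transformer, columns) tuples) and named_transformers_ (a dict-like object keyed by the names you provided). So, for example,

    data_pipeline.named_transformers_['numerical'].mean_
    data_pipeline.named_transformers_['numerical'].scale_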
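To make the lookup concrete, here is a minimal, self-contained sketch. The toy DataFrame and its column names (age, income, is_member) are made up for illustration and are not from the question; only the ColumnTransformer setup mirrors the original code.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: two numerical columns and one already-encoded column.
X_train = pd.DataFrame({
    "age": [20.0, 30.0, 40.0],
    "income": [1000.0, 2000.0, 3000.0],
    "is_member": [0, 1, 0],
})
numerical_variables = ["age", "income"]

scaler = StandardScaler()
data_pipeline = ColumnTransformer(
    [("numerical", scaler, numerical_variables)],
    remainder="passthrough",
)
X_train_scaled = data_pipeline.fit_transform(X_train)

# The column transformer fitted a clone, so fetch the fitted scaler by name.
fitted_scaler = data_pipeline.named_transformers_["numerical"]
means = fitted_scaler.mean_    # one mean per numerical column
scales = fitted_scaler.scale_  # one standard deviation per numerical column

# The same fitted object also sits in the transformers_ list of
# (name, transformer, columns) tuples.
name, same_scaler, columns = data_pipeline.transformers_[0]

print(dict(zip(numerical_variables, means)))
print(dict(zip(numerical_variables, scales)))
```

Pairing the values with numerical_variables, as in the print calls, is one way to keep track of which mean and scale belong to which column, since both arrays follow the order of the columns passed to the transformer.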