I have a dataframe in pandas where each column has different value range. For example:
df:
A B C
1000 10 0.5
765 5 0.35
800 7 0.09
Any idea how I can normalize the columns of this dataframe where each value is between 0 and 1?
My desired output is:
A B C
1 1 1
0.765 0.5 0.7
0.8 0.7 0.18(which is 0.09/0.5)
You can use the package sklearn and its associated preprocessing utilities to normalize the data.
import pandas as pd
from sklearn import preprocessing
x = df.values #returns a numpy array
min_max_scaler = preprocessing.MinMaxScaler()
x_scaled = min_max_scaler.fit_transform(x)
df = pd.DataFrame(x_scaled)
For more information look at the scikit-learn documentation on preprocessing data: scaling features to a range.