python pandas dataframe

FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas


Here is some sample code:

import pandas as pd

data = {
    'Id': ['id1', 'id2', 'id3', 'id4'],
    'col1': [41, 41, 41, 41],
    'col2': [6, 6, 6, 6]
}

df = pd.DataFrame(data)

df.iloc[:,1:] = df.iloc[:,1:].astype(float)
df.iloc[:,1:] = df.iloc[:,1:].div(150)

which produces the warning:

Name: col1, dtype: float64' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
  df.iloc[:,1:] = df.iloc[:,1:].astype(float)

However, this does work:

for col in df.columns[1:]:
    df[col] = df[col].astype(float).div(150)

Is this the only way to solve this issue? It seems inefficient.

Using pandas 2.2.1


Solution

  • The issue comes from assigning through a slice (:) in the indexer.

    df[col] = ... replaces the column with a new Series, which brings its own dtype, whereas an assignment through df.loc[:, ...] (or df.iloc[:, 1:]) writes the values into the existing columns and tries to keep their original dtypes. This is the reason for your warning (float64' has dtype incompatible with int64): you are assigning floats to an integer column.

    This is currently just a FutureWarning but will become an error in a future version.

    Instead of iloc, select the columns by name:

    cols = df.columns[1:]  # ['col1', 'col2']
    
    df[cols] = df[cols].astype(float).div(150)
    

    Note that your approach would work if you cast the dtypes first:

    df = df.astype({'col1': float, 'col2': float})
    df.iloc[:,1:] = df.iloc[:,1:].astype(float)
    df.iloc[:,1:] = df.iloc[:,1:].div(150)
    

    This might help if you must use iloc.

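    Output (these values reflect the division by 150 applied twice, i.e. both snippets above run in sequence; a single pass gives 41/150 ≈ 0.273333 and 6/150 = 0.04):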

        Id      col1      col2
    0  id1  0.001822  0.000267
    1  id2  0.001822  0.000267
    2  id3  0.001822  0.000267
    3  id4  0.001822  0.000267
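
To check the dtype behaviour described above, here is a minimal sketch that runs both variants on fresh copies of the question's data and compares them. The names by_name and by_position are only illustrative, and the dict comprehension is just one assumed way to build the cast mapping without hardcoding column names.

```python
import pandas as pd

df = pd.DataFrame({
    'Id': ['id1', 'id2', 'id3', 'id4'],
    'col1': [41, 41, 41, 41],
    'col2': [6, 6, 6, 6],
})

cols = df.columns[1:]  # every column except 'Id'

# Variant 1: assigning by column name replaces each column,
# so the dtype is free to change from int64 to float64.
by_name = df.copy()
by_name[cols] = by_name[cols].astype(float).div(150)

# Variant 2: cast the columns to float first, so the later iloc
# assignment stores floats into columns that are already float64.
by_position = df.astype({col: float for col in cols})
by_position.iloc[:, 1:] = by_position.iloc[:, 1:].div(150)

print(by_name.dtypes)
print(by_name.equals(by_position))
```

Both variants divide only once, so the numeric columns should hold 41/150 and 6/150. The column-name form is shorter; the cast-first form is mainly useful when the rest of the code has to keep using positional indexing.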