python pandas string numeric

Using pandas to_numeric with errors='coerce' to force values to ints, but it is still not working


I am confused by what happens when I coerce a DataFrame to numeric. The conversion appears to work when I inspect the structure, but I still get an error:

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Code:

df = df_leads.apply(pd.to_numeric, errors='coerce')

df.info()

Returns: Columns: 133 entries, org_size_1_99 to engagement_Type_webpage visits dtypes: float64(107), int64(26) memory usage: 3.1 MB

next line of code:

sum(df['target']).astype(int)

returns: TypeError: unsupported operand type(s) for +: 'int' and 'str'


Solution

  • Some values in your data are not numbers, so they cannot be treated as numeric. The error message itself shows that a string is still reaching the addition. A blanket conversion like that is therefore not enough on its own: you also need to exclude or otherwise handle the values that fail to convert.

    A more robust approach is to coerce, check how many values became NaN, fill them, and then sum with the pandas method:

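    import pandas as pd

    # Coerce every column; values that cannot be parsed become NaN
    df = df_leads.apply(lambda x: pd.to_numeric(x, errors='coerce'))
    df.info()

    # Count how many 'target' values are missing or failed to convert
    nan_count = df['target'].isnull().sum()
    print(f"Number of NaN values in 'target': {nan_count}")

    # Replace NaN with 0 so the total can be reported as an integer
    df['target'] = df['target'].fillna(0)

    # Use the pandas .sum() method rather than the built-in sum()
    sum_result = int(df['target'].sum())
    print(f"Sum of 'target': {sum_result}")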
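To see which raw values were lost in the conversion, it can help to compare the original column with the coerced one. The sketch below is only illustrative: the small `df_leads` frame and its contents are made up, since the real data is not shown in the question. It also shows one possible way the exact TypeError could still arise after coercion, namely a duplicated `target` column, which makes `df['target']` a DataFrame so that the built-in `sum()` iterates over column labels (strings). That cause is an assumption, not something the question confirms.

```python
import pandas as pd

# Hypothetical stand-in for df_leads; the real data is not shown in the question.
df_leads = pd.DataFrame({
    "target": ["1", "0", "yes", None, "1"],
    "org_size_1_99": [1, 0, 1, 0, 1],
})

# Coerce every column; anything that cannot be parsed becomes NaN.
df = df_leads.apply(pd.to_numeric, errors="coerce")

# Raw values that were present but turned into NaN by the coercion.
failed = df_leads["target"].notna() & df["target"].isna()
bad_values = df_leads.loc[failed, "target"].unique().tolist()
print(bad_values)

# The pandas .sum() method skips NaN by default.
target_sum = int(df["target"].sum())
print(target_sum)

# Assumed scenario: a duplicated column name. Selecting it returns a DataFrame,
# and the built-in sum() then adds 0 to the column label, which is a str.
dup = pd.concat([df, df[["target"]]], axis=1)
print(type(dup["target"]).__name__)
try:
    sum(dup["target"])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If `type(df['target'])` turns out to be a DataFrame rather than a Series in your own data, checking `df.columns.duplicated().any()` is a quick way to confirm whether duplicate column names are involved.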