python pandas dataframe casting

Python - pandas column type casting with "astype" is not working


Here are the top 5 rows of the DataFrame (poorly formatted, but you can see that most of these values are convertible to numbers):

df.head()
ID  Overall Acceleration    Aggression  Agility Balance Ball control    Composure   Crossing    Curve   Dribbling   Finishing   Free kick accuracy  GK diving   GK handling GK kicking  GK positioning  GK reflexes Heading accuracy    Interceptions   Jumping Long passing    Long shots  Marking Penalties   Positioning Reactions   Short passing   Shot power  Sliding tackle  Sprint speed    Stamina Standing tackle Strength    Vision  Volleys
0   20801   94  89  63  89  63  93  95  85  81  91  94  76  7   11  15  14  11  88  29  95  77  92  22  85  95  96  83  94  23  91  92  31  80  85  88
1   158023  93  92  48  90  95  95  96  77  89  97  95  90  6   11  15  14  8   71  22  68  87  88  13  74  93  95  88  85  26  87  73  28  59  90  85
2   190871  92  94  56  96  82  95  92  75  81  96  89  84  9   9   15  15  11  62  36  61  75  77  21  81  90  88  81  80  33  90  78  24  53  80  83
3   176580  92  88  78  86  60  91  83  77  86  86  94  84  27  25  31  33  37  77  41  69  64  86  30  85  92  93  83  87  38  77  89  45  80  84  88
4   167495  92  58  29  52  35  48  70  15  14  30  13  11  91  90  95  91  89  25  30  78  59  16  10  47  12  85  55  25  11  61  44  10  83  70  11

Here is a description of all of the types:

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 18085 entries, 0 to 18084
Data columns (total 36 columns):
ID                    18085 non-null int64
Overall               18085 non-null int64
Acceleration          18085 non-null object
Aggression            18085 non-null object
Agility               18085 non-null object
Balance               18085 non-null object
Ball control          18085 non-null object
Composure             18085 non-null object
Crossing              18085 non-null object
Curve                 18085 non-null object
Dribbling             18085 non-null object
Finishing             18085 non-null object
Free kick accuracy    18085 non-null object
...
dtypes: int64(2), object(34)
memory usage: 5.1+ MB

Here is my attempt to convert the object types to floats.

for column in full:
    tmp = pd.Series(column)
    column = tmp.astype("float64", errors="ignore")

Afterwards, all of the relevant columns are still of type "object".

df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 18085 entries, 0 to 18084
Data columns (total 36 columns):
ID                    18085 non-null int64
Overall               18085 non-null int64
Acceleration          18085 non-null object
Aggression            18085 non-null object
Agility               18085 non-null object
Balance               18085 non-null object
Ball control          18085 non-null object
Composure             18085 non-null object
Crossing              18085 non-null object
Curve                 18085 non-null object
Dribbling             18085 non-null object
Finishing             18085 non-null object
Free kick accuracy    18085 non-null object
...
dtypes: int64(2), object(34)
memory usage: 5.1+ MB

Can anybody see what I'm doing wrong? I've tried many different approaches from this site and others but I can't understand why the types aren't being changed. Any help is appreciated. Thank you.

Edit: I am doing this in a Kaggle.com IPython notebook, in case that has something to do with it.


Solution

  • Migrating solution from comments to answers. Thanks to @Wen.

    df = df.apply(pd.to_numeric, errors='coerce')
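For anyone wondering why the loop in the question has no effect: iterating over a DataFrame yields the column labels, not the column data, so pd.Series(column) only wraps the label string, and rebinding the loop variable never writes anything back to the frame. Here is a minimal sketch of that, next to the fix above. The three-row frame and the "n/a" entry are made up for illustration; they are not the asker's data.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question: numbers stored as
# strings in object-dtype columns. "n/a" stands in for an unparseable entry.
df = pd.DataFrame({
    "ID": [20801, 158023, 190871],
    "Acceleration": pd.Series(["89", "92", "94"], dtype=object),
    "Aggression": pd.Series(["63", "48", "n/a"], dtype=object),
})

# Iterating over a DataFrame gives the column labels, so the loop in the
# question was building a Series out of each label, not out of the data.
labels = [column for column in df]

# The fix: convert every column and assign the result back to a name.
# errors="coerce" turns anything unparseable into NaN instead of raising.
converted = df.apply(pd.to_numeric, errors="coerce")

print(labels)
print(converted.dtypes)
```

Keep in mind that errors='coerce' silently replaces bad values with NaN, so it is worth checking converted.isna().sum() afterwards to see how much was lost in each column.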