python pandas dataframe nan

DataFrame column copying results in NaN


I'm trying to copy data from one pandas DataFrame into another and I'm getting some strange results. For example, if I have:

[In]:
A = {'Types':['Falcon', 'Eagle', 'sparrow'], 
     'Speed':[100, 75, 50]}
df_A = pd.DataFrame(A)

df_B = pd.DataFrame()
df_B['Type'] = df_A['Types']
df_B['tags'] = ['FLCN', 'EGLE', 'SPRW']
df_B['ID'] = [543.76, 534.32, 645.25]

df_A['Tags'] = df_B['tags']
df_A['ID'] = df_B['ID']
df_A

What I'm expecting to get is:

[Out]:
    Types   Speed   Tags    ID
0   Falcon  100     FLCN    543.76
1   Eagle   75      EGLE    534.32
2   sparrow 50      SPRW    645.25

But what I'm getting instead is:

[Out]:
    Types   Speed   Tags    ID
0   Falcon  100     FLCN    NaN
1   Eagle   75      EGLE    NaN
2   sparrow 50      SPRW    NaN

I tried running this in a Jupyter Notebook to troubleshoot and received a TypeError: 'method' object is not subscriptable. Here is an example of the error I received:

ex. 2:

[In]:
df_A['ID'] = df_B['ID']

[Out]:
TypeError: 'method' object is not subscriptable

Once I decided to write a question I wrote the code for these examples in Jupyter and got the expected results without any issue, so I'm stumped.

Edit to add: I've tried using the following as a workaround:

[In]:
df_A['Tags'] = df_B['tags']
df_A = pd.concat([df_A, df_B['ID']], axis=1)

but I'm still getting funky results. With this sample code I end up with:

[Out]:
    Types   Speed   Tags    ID
0   Falcon  100     FLCN    543.76
1   Eagle   75      EGLE    534.32
2   sparrow 50      SPRW    645.25

but when I use my larger data set, the results look like this:

[Out]:
    Types   Speed   Tags    ID
0   NaN     NaN     NaN     543.76
1   NaN     NaN     NaN     534.32
2   NaN     NaN     NaN     645.25
3   Falcon  100     FLCN    NaN
4   Eagle   75      EGLE    NaN
5   sparrow 50      SPRW    NaN

despite using 'axis=1' as a parameter in pd.concat.


Solution

  • With the provided code, I don't get NaN at any step either.

    import pandas as pd
    
    
    def print_df_A_df_B(stage, df_A, df_B):
        # Print both frames after each assignment so a NaN shows up at the step that introduces it
        print(stage, "assignment","\ndf_A\n",df_A,"\ndf_B\n",df_B,"\n")
    
    A = {'Types':['Falcon', 'Eagle', 'sparrow'], 
         'Speed':[100, 75, 50]}
    df_A = pd.DataFrame(A)
    
    df_B = pd.DataFrame()
    df_B['Type'] = df_A['Types']
    print_df_A_df_B("Type", df_A, df_B)
    df_B['tags'] = ['FLCN', 'EGLE', 'SPRW']
    df_B['ID'] = [543.76, 534.32, 645.25]
    
    df_A['Tags'] = df_B['tags']
    print_df_A_df_B("Tags", df_A, df_B)
    df_A['ID'] = df_B['ID']
    print_df_A_df_B("ID", df_A, df_B)
    

    Out

    Type assignment 
    df_A
          Types  Speed
    0   Falcon    100
    1    Eagle     75
    2  sparrow     50 
    df_B
           Type
    0   Falcon
    1    Eagle
    2  sparrow 
    
    Tags assignment 
    df_A
          Types  Speed  Tags
    0   Falcon    100  FLCN
    1    Eagle     75  EGLE
    2  sparrow     50  SPRW 
    df_B
           Type  tags      ID
    0   Falcon  FLCN  543.76
    1    Eagle  EGLE  534.32
    2  sparrow  SPRW  645.25 
    
    ID assignment 
    df_A
          Types  Speed  Tags      ID
    0   Falcon    100  FLCN  543.76
    1    Eagle     75  EGLE  534.32
    2  sparrow     50  SPRW  645.25 
    df_B
           Type  tags      ID
    0   Falcon  FLCN  543.76
    1    Eagle  EGLE  534.32
    2  sparrow  SPRW  645.25
    

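    In your full code, try print(type(df_A['ID']), type(df_B['ID'])) to check that both are pandas.core.series.Series. The TypeError you quote is raised when a method is indexed with [], which suggests that df_B (the object subscripted on the right-hand side) may be bound to a method rather than a DataFrame at that point, for example because of a call missing its parentheses. Printing type(df_A) and type(df_B) first would confirm or rule that out.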

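    For your reported workaround, perhaps add ignore_index=True, as in pd.concat([df_A, df_B['tags']], axis=1, ignore_index=True), and fix the column names afterwards, since with axis=1 this option relabels the columns as 0, 1, 2, and so on. Note that rows are still matched by index label, so if the two frames have different row indexes in your larger data set, calling reset_index(drop=True) on each of them before concatenating may also be needed.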
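
The sample code in the question does not reproduce the NaN values, so the cause can only be guessed at. One common source of this symptom is that the two frames carry different row index labels, because pandas aligns on labels both during column assignment and in pd.concat(..., axis=1). The following is a minimal sketch under that assumption; the index=[3, 4, 5] given to df_B is hypothetical and only stands in for whatever the larger data set does (filtering, sorting, reading from separate files, and so on).

```python
import pandas as pd

df_A = pd.DataFrame({'Types': ['Falcon', 'Eagle', 'sparrow'],
                     'Speed': [100, 75, 50]})

# ASSUMPTION: df_B has row labels that do not overlap with df_A's.
# This is a guess at what happens in the larger data set.
df_B = pd.DataFrame({'tags': ['FLCN', 'EGLE', 'SPRW'],
                     'ID': [543.76, 534.32, 645.25]},
                    index=[3, 4, 5])

# Quick diagnostics: are both objects DataFrames, and do the indexes match?
print(type(df_A), type(df_B))
print("indexes equal:", df_A.index.equals(df_B.index))

# Column assignment aligns on row labels; no label is shared, so ID is all NaN.
misaligned = df_A.copy()
misaligned['ID'] = df_B['ID']
print(misaligned)

# concat with axis=1 keeps the union of row labels, giving 6 half-empty rows.
stacked = pd.concat([df_A, df_B['ID']], axis=1)
print(stacked)

# Option 1: drop the old labels on both sides, then concatenate.
fixed = pd.concat([df_A.reset_index(drop=True),
                   df_B['ID'].reset_index(drop=True)], axis=1)
print(fixed)

# Option 2: assign the raw values so the match is purely positional.
by_position = df_A.copy()
by_position['ID'] = df_B['ID'].to_numpy()
print(by_position)
```

Both options assume the rows of the two frames are already in the same order and only the labels differ. If the indexes do match in your real data, this is not the cause and the type check above is the better lead.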