I would like to share a strange behavior of pandas and find out the reason for it. I assign a NumPy array as an object to a single element (cell, entry) of a pandas DataFrame in two different ways.
First, create a sample dataframe:
import numpy as np
import pandas as pd

rn = np.random.randint(1, 100, size=(4, 2))  # random numbers
df = pd.DataFrame(data=rn, columns=['a', 'b'])
df['b'] = df['b'].astype(object)  # set one column's dtype to 'object'
c = np.array([1, 4, 4])  # I want to put this in one entry of the dataframe
Method 1:
df['b'].loc[0] = c
This succeeds, but with a warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
Method 2:
df.loc[0 , 'b'] = c
This fails with the following error:
ValueError: Must have equal len keys and value when setting with an iterable
Why is that?
This is a quirk (not to say an inconsistency) in pandas indexing. It sees a single slot on the left-hand side of the assignment and an iterable of 3 values on the right-hand side, and chokes on that.
The only way I have found is to force pandas to use an index, by assigning a one-element Series to a one-row slice:
df.loc[0:0, 'b' ] = pd.Series([c], [0])
It gives neither an error nor a warning, and you can check the result:
print(type(df.loc[0, 'b']), df, sep='\n')
which gives, as expected:
<class 'numpy.ndarray'>
    a          b
0  45  [1, 4, 4]
1  65         68
2  68         10
3  84         22
Rather hacky, but at least it gives the expected behaviour...
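As a possible alternative, and assuming column 'b' already has object dtype as in the question, the scalar accessor DataFrame.at may sidestep the length check, since it addresses exactly one cell. This is only a minimal sketch: the fixed sample values are an assumption made for reproducibility (the question uses random ones), and the behaviour may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Fixed sample values: an assumption for reproducibility, not the question's random data
df = pd.DataFrame({'a': [45, 65, 68, 84], 'b': [12, 68, 10, 22]})
df['b'] = df['b'].astype(object)  # the column must be able to hold arbitrary objects
c = np.array([1, 4, 4])

# .at addresses a single cell, so the whole array is stored as one object
df.at[0, 'b'] = c

print(type(df.at[0, 'b']))
print(df)
```

If the column were still an integer column, the same call would likely fail or behave differently, so the astype(object) step matters here.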