I have a dataframe df_a
that has a column called dr7objid
and has 194478 rows. There's a 2nd dataframe df_indices
with 245609 rows that also has an objid
column as well as an asset_id
column. Both are pictured below.
I want to add a new column to df_a
that finds the matching asset_id
in df_indices
for the specific dr7objid
in df_a
.
For example, the first value in the column in df_a
should be:
df_indices[df_indices['objid'] == 587732484359913515].index[0]+1
which is55934
(note that asset_id is equal to the index+1 in general, though one could also use the at
function with 'asset_id' as the label).
How do I generalize this to all examples without using for loops? I've tried thinking of boolean indexing and stuff but I can't figure out how to search for the corresponding objid in df_indices
efficiently.
The situation is also more complicated because confusingly, not all dr7objid's are in df_indices
, even though df_indices
is bigger, so if that's the case, the ID should be nan or 0.
One of the other way to get the desired operation is through merge as shown in example below:
df_a = df_a.merge(df_indices[['objid','asset_id']],left_on='dr7objid', right_on='objid', how='left', copy=False).drop(columns={'objid'})