I have two dataframe:
dfr1
qq
date
1956-01-01 685.519348
1956-01-02 731.868500
1956-01-03 510.579375
1956-01-04 412.347250
1956-01-05 358.297625
2010-12-27 3.992000
2010-12-28 1.099583
2010-12-29 104.428958
2010-12-30 4.932750
2010-12-31 101.737292
2013-12-27 7.992000
2013-12-28 105.099583
2013-12-29 104.428958
2013-12-30 102.932750
2013-12-31 101.737292
and
dfr2
q_t
01-01 61.629342
01-02 61.409750
01-03 61.309208
01-04 61.161462
01-05 61.020508
12-27 69.065375
12-28 68.935908
12-29 68.603104
12-30 68.474458
12-31 68.209075
As you can notice, the days and month are the same in both but the first one has year while the second one not.
I would like to compare the first with the second. In particular, I would like to know when the qq value in the first one is less or equal to the value in the second according to the day and month of the second. This is thus what I expects:
1956-01-01 685.519348 61.629342 False
1956-01-02 731.8685 61.40975 False
1956-01-03 510.579375 61.309208 False
1956-01-04 412.34725 61.161462 False
1956-01-05 358.297625 61.020508 False
2010-12-27 3.992 69.065375 True
2010-12-28 1.099583 68.935908 True
2010-12-29 104.428958 68.603104 False
2010-12-30 4.93275 68.474458 True
2010-12-31 101.737292 68.209075 False
2013-12-27 7.992 69.065375 True
2013-12-28 105.099583 68.935908 False
2013-12-29 104.428958 68.603104 False
2013-12-30 102.93275 68.474458 False
2013-12-31 101.737292 68.209075 False
I tried compared and, as expected, I got an error:
dfr1.compare(dfr1)
*** ValueError: Can only compare identically-labeled (both index and columns) DataFrame objects
I tried also:
dfr_1['new'] = dfr_1 < dfr_2
*** ValueError: Can only compare identically-labeled (both index and columns) DataFrame objects
I see thus two problem:
Specifically, I am not able to select properly the indexes properties.
What do you think?
# create some joining columns called mmdd, e.g. '12-31'
dfr1['mmdd'] = dfr1['date'].dt.strftime('%m-%d')
dfr2.index.name = 'mmdd'
pd.merge(dfr1, dfr2, on='mmdd', how = 'left').drop('mmdd', axis=1).assign(lt= lambda x: x['qq'] < x['q_t'])
Output:
date qq q_t lt
0 1956-01-01 685.519348 61.629342 False
1 1956-01-02 731.868500 61.409750 False
2 1956-01-03 510.579375 61.309208 False
3 1956-01-04 412.347250 61.161462 False
4 1956-01-05 358.297625 61.020508 False
5 2010-12-27 3.992000 69.065375 True
6 2010-12-28 1.099583 68.935908 True
7 2010-12-29 104.428958 68.603104 False
8 2010-12-30 4.932750 68.474458 True
9 2010-12-31 101.737292 68.209075 False
10 2013-12-27 7.992000 69.065375 True
11 2013-12-28 105.099583 68.935908 False
12 2013-12-29 104.428958 68.603104 False
13 2013-12-30 102.932750 68.474458 False
14 2013-12-31 101.737292 68.209075 False