I have a pandas dataframe with a number of thresholds and values associated with epochs. I want to compare the all of the thresholds with their associated values simultaneously to remove rows as needed. I will be doing this many times and the letter designations can change each time I create this dataframe, but there will always be a 1:1 association for threshold and value. The number of thresholds and values can change each time the dataframe is created though. The end goal is to get all of the epochs where all of the values are below their respective thresholds. The sample below does not have any that fit this requirement, but I've included it so people have a visual for what I'm working with. For now, how would I do this comparison in an efficient way?
epoch thresholdA thresholdB thresholdC thresholdD thresholdE thresholdF thresholdG valueA valueB valueC valueD valueE valueF valueG
0 1723489899000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 0.292969 0.00000 19.775391
1 1723489900000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 0.292969 0.00000 19.775391
2 1723489901000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 0.292969 0.00000 19.775391
3 1723489902000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.780273 2.929688 0.00000 31.054688
4 1723489903000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.780273 2.929688 0.00000 31.054688
5 1723489904000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.780273 2.929688 0.00000 31.054688
6 1723489905000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 2.929688 0.00000 31.347656
7 1723489906000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 2.929688 0.00000 31.347656
8 1723489907000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 2.929688 0.00000 31.347656
9 1723489908000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.541016 2.929688 0.00000 31.347656
10 1723489909000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.541016 2.929688 0.00000 31.347656
11 1723489910000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.541016 2.929688 0.00000 31.347656
12 1723489911000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.541016 2.929688 0.00000 37.500000
13 1723489912000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.541016 2.929688 0.00000 37.500000
14 1723489913000 30 120 30 30 120 30 2 299.311523 -3.828125 -2.841797 199.541016 2.929688 0.00000 37.500000
15 1723489914000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 2.929688 0.00000 40.869141
16 1723489915000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 2.929688 0.00000 40.869141
17 1723489916000 30 120 30 30 120 30 2 299.311523 -4.785156 -2.841797 199.541016 2.929688 0.00000 40.869141
18 1723489950000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 3.222656
19 1723489951000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 3.222656
20 1723489952000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 3.222656
21 1723489953000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 0.439453
22 1723489954000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 0.439453
23 1723489955000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 0.439453
24 1723489956000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 0.585938
25 1723489957000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 0.585938
26 1723489958000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 0.585938
27 1723489959000 30 120 30 30 120 30 2 199.301758 -3.828125 -2.841797 129.677750 0.292969 81.37207 0.439453
28 1723489960000 30 120 30 30 120 30 2 199.301758 -3.828125 -2.841797 129.677750 0.292969 81.37207 0.439453
29 1723489961000 30 120 30 30 120 30 2 199.301758 -3.828125 -2.841797 129.677750 0.292969 81.37207 0.439453
30 1723489962000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 1.171875
31 1723489963000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 1.171875
32 1723489964000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 1.171875
33 1723489965000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 5.566406
34 1723489966000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 5.566406
35 1723489967000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 2.929688 81.37207 5.566406
36 1723489968000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 6.005859
37 1723489969000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 6.005859
38 1723489970000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 6.005859
39 1723489971000 30 120 30 30 120 30 2 199.301758 -3.828125 -2.841797 129.677750 2.929688 81.37207 4.687500
40 1723489972000 30 120 30 30 120 30 2 199.301758 -3.828125 -2.841797 129.677750 2.929688 81.37207 4.687500
41 1723489973000 30 120 30 30 120 30 2 199.301758 -3.828125 -2.841797 129.677750 2.929688 81.37207 4.687500
42 1723489974000 30 120 30 30 120 30 2 199.301758 -4.785156 -2.841797 129.677750 0.292969 81.37207 4.833984
Here's one approach:
df.filter
for both 'value*' and 'threshold*' columns. For 'threshold*', chain df.values
to allow element-wise comparison on shape, rather than on column labels.df.all
row-wise (axis=1
) and use for boolean indexing.import pandas as pd
data = {'epoch': {0: 1723489899000, 1: 1723489900000},
'thresholdA': {0: 30, 1: 30}, 'thresholdB': {0: 120, 1: 120},
'valueA': {0: 299.311523, 1: 10}, 'valueB': {0: -4.785156, 1: -4.785156}}
df = pd.DataFrame(data)
epoch thresholdA thresholdB valueA valueB
0 1723489899000 30 120 299.311523 -4.785156
1 1723489900000 30 120 10.000000 -4.785156
# aim: delete row 0, since valueA >= thresholdA
Code
m = (df.filter(regex=r'^value') < df.filter(regex=r'^threshold').values).all(axis=1)
df[m]
epoch thresholdA thresholdB valueA valueB
1 1723489900000 30 120 10.0 -4.785156
N.B. The above assumes not just a "1:1 association for threshold and value", but also that the columns occur in the same order. If not, you should sort them first.