I have a large data frame filled with numbers and a second data frame with limits (a low and a high acceptable value) for each column. I'm wondering how I can use the low and high limits to find data that falls outside of that range for each column. I can do this with a for loop, but it is a messy solution (and, I'm sure, inefficient), so I'm wondering if there is another way.
For example:
#Create a data frame with values ranging from 0-10 (only five rows are shown below)
sampleData <- data.frame(replicate(9, sample(0:10, 10, replace = TRUE)))
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 1 7 9 0 7 3 0 0 8
2 4 8 3 4 9 6 3 2 3
3 9 7 5 2 7 5 10 9 4
4 2 6 2 1 3 9 4 3 9
5 10 2 2 6 4 7 4 9 7
#Have another data frame with our limits (row 1 = low, row 2 = high)
X1 X2 X3 X4 X5 X6 X7 X8 X9
1 1 7 3 4 7 3 0 0 3
2 4 8 9 10 9 6 3 2 8
I would like to know which rows have failed, i.e. which values fall outside the limits for their column. So the failures would be:
Col 1: 3,5
Col 2: 4,5
Col 3: 4,5
Col 4: 1,3,4
Col 5: 4,5
Col 6: 4,5
Col 7: 3,4,5
Col 8: 3,4,5
Col 9: 4
Thank you!
We can use base R mapply, assuming your limits data frame is called limits. We pass the columns of both data frames in parallel and select the indices of the values that fall outside the limits.
mapply(function(x, y) which(x < y[1] | x > y[2]) , sampleData, limits)
#$X1
#[1] 3 5
#$X2
#[1] 4 5
#$X3
#[1] 4 5
#$X4
#[1] 1 3 4
#$X5
#[1] 4 5
#$X6
#[1] 4 5
#$X7
#[1] 3 4 5
#$X8
#[1] 3 4 5
#$X9
#[1] 4
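One caveat, offered as a sketch rather than as part of the answer above: mapply simplifies its result by default (SIMPLIFY = TRUE), so if every column happened to have the same number of failures you would get a matrix back instead of a list. Map() is the same call without simplification and always returns a list. The data below are retyped from the five rows shown in the question, and the name failures is chosen here only for illustration.

```r
# Five rows from the question, entered column by column.
sampleData <- data.frame(
  X1 = c(1, 4, 9, 2, 10),
  X2 = c(7, 8, 7, 6, 2),
  X3 = c(9, 3, 5, 2, 2),
  X4 = c(0, 4, 2, 1, 6),
  X5 = c(7, 9, 7, 3, 4),
  X6 = c(3, 6, 5, 9, 7),
  X7 = c(0, 3, 10, 4, 4),
  X8 = c(0, 2, 9, 3, 9),
  X9 = c(8, 3, 4, 9, 7)
)

# Row 1 holds the lower limits, row 2 the upper limits.
limits <- data.frame(
  X1 = c(1, 4), X2 = c(7, 8), X3 = c(3, 9),
  X4 = c(4, 10), X5 = c(7, 9), X6 = c(3, 6),
  X7 = c(0, 3), X8 = c(0, 2), X9 = c(3, 8)
)

# Map() pairs each data column (x) with its limits column (y) and never
# simplifies, so the result is a named list with one element per column.
failures <- Map(function(x, y) which(x < y[1] | x > y[2]), sampleData, limits)
```

failures$X4 then gives the failing row numbers for column X4, and lengths(failures) gives the number of failures per column.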