
R: Get index of first instance in vector greater than variable, but for a whole column


I am trying to create a new variable in a data.table. For each observation, it should compare an existing variable in the data.table to a vector and return the index of the first element of that vector that is greater than the observation's value.

Example

ComparatorVector <- c(seq(1000, 200000, 1000))
Variable <- runif(10, min = 1000, max = 200000)

For each observation in Variable I'd like to know the index of the first observation in ComparatorVector that is larger than the observation of Variable.

I've played around with min(which()), but couldn't get it to just go through the ComparatorVector. I also saw the match() function, but couldn't find a way to make it return anything other than the index of an exact match.


Solution

  • One option is findInterval(), which works here because ComparatorVector is sorted in increasing order

    findInterval(Variable, ComparatorVector) +1
    #[1] 190 152  99 107  38 148 114  95  53  73
    

    Or with sapply

    sapply(Variable, function(x) which(ComparatorVector > x)[1])
    #[1] 190 152  99 107  38 148 114  95  53  73
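
The two approaches above agree except when a value is at or above the largest element of ComparatorVector: findInterval() + 1 then returns length(ComparatorVector) + 1, while which(...)[1] returns NA. A minimal sketch that wraps the findInterval() approach and returns NA in that case, assuming a sorted comparator vector, might look like this. The helper name first_greater and the test values in x are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): for each value in `x`,
# return the index of the first element of the sorted vector `vec` that is
# strictly greater than that value, or NA if no such element exists.
first_greater <- function(x, vec) {
  stopifnot(!is.unsorted(vec))           # findInterval() needs sorted breakpoints
  idx <- findInterval(x, vec) + 1L       # number of elements <= x, plus one
  idx[idx > length(vec)] <- NA_integer_  # nothing is greater: mirror which(...)[1]
  idx
}

ComparatorVector <- seq(1000, 200000, 1000)
x <- c(1500, 2000, 199999, 200000)       # illustrative values, incl. edge cases

first_greater(x, ComparatorVector)
# 2, 3, 200, NA

# Same result from the sapply() approach, element by element
sapply(x, function(v) which(ComparatorVector > v)[1])
```

Because the helper is vectorised, it should also work for a whole data.table column by reference, for example DT[, idx := first_greater(Variable, ComparatorVector)], where DT is an assumed data.table holding the column Variable.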