r indexing dplyr sequential

How to use dplyr to fill in gaps in a ranked and sorted index?


I am working on an R function that generates a ranked and sorted index from two user inputs: starting values (in a list) and a total number of slots to fill. If the list contains fewer values than the total number of slots, sequential whole numbers are inserted into the gaps. Note that in all cases the first index slot must be 1.1 if 1.1 is provided in the list, and 1 otherwise.

I used the dplyr::dense_rank function in the Reproducible Code for Example 1 at the bottom of this post to correctly fill in the gaps sequentially when the provided list elements are all < the total number of slots to fill.

Is there a way to use dplyr::dense_rank, or another function, to fill in the gaps when the list elements are all greater than 1 (or 1.1), as illustrated in Examples 2 and 3 in the images below, or when there are other gaps between the list elements, as illustrated in Example 4? The gaps I'm trying to fill are highlighted in yellow in the images. Note that the Reproducible Code at the bottom also provides the user inputs for Examples 2-4; they are commented out because I ran Example 1.

[Images: Examples 2, 3 and 4, with the gaps to fill highlighted in yellow]

Example 1 Reproducible Code output (which is correct, given the Value and totalSlots inputs):

# A tibble: 5 x 2
   Slot Value
  <int> <dbl>
1     1   1.1
2     2   1.2
3     3   2.1
4     4   2.2
5     5   3 

Reproducible Code:

library(dplyr)
library(tidyr)  # complete() comes from tidyr, not dplyr

# Example 1:
Value <- c(2.1, 1.2, 1.1, 2.2)
totalSlots <- 5

# Example 2:
# Value <- c(2.1, 2.2)
# totalSlots <- 3

# Example 3:
# Value <- c(4.1, 4.2, 4.3)
# totalSlots <- 6

# Example 4:
# Value <- c(1.1, 1.2, 3.1, 3.2, 3.3, 6.1, 6.2)
# totalSlots <- 10

tibble(Value) %>% 
  mutate(Slot = row_number()) %>% 
  complete(Slot = seq_len(totalSlots)) %>% 
  mutate(
    Value = coalesce(Value[order(Value)], Slot), 
    Value = dense_rank(as.integer(Value)) + Value - as.integer(Value)
    )

Here is Richard Berry's solution, wrapped to generate a two-column data frame:

indexDF <- data.frame(
  Slot = 1:totalSlots,
  Value = sort(c(setdiff(1:totalSlots, floor(Value)), Value))[1:totalSlots]
)
indexDF
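
Since the question asks about dplyr, the same result can also be returned as a tibble. This is only a minimal sketch using the Example 4 inputs; the names `filled` and `indexTbl` are illustrative and not part of the original code.

```r
library(dplyr)

# Example 4 inputs from the question
Value <- c(1.1, 1.2, 3.1, 3.2, 3.3, 6.1, 6.2)
totalSlots <- 10

# Whole numbers in 1..totalSlots whose integer part is not already
# used by Value, merged with Value and put in ascending order
filled <- sort(c(setdiff(seq_len(totalSlots), floor(Value)), Value))

# Keep only the first totalSlots entries and pair them with slot numbers
indexTbl <- tibble(
  Slot  = seq_len(totalSlots),
  Value = head(filled, totalSlots)
)
indexTbl
```

With these inputs the gaps at 2, 4 and 5 are filled, so the Value column runs 1.1, 1.2, 2, 3.1, 3.2, 3.3, 4, 5, 6.1, 6.2.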

Solution

  • You can achieve this with:

    sort(c(setdiff(1:totalSlots, floor(Value)), Value))[1:totalSlots]
    

    Breaking it down:

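    1:totalSlots %>%               # candidate integers for filling gaps
      setdiff(floor(Value)) %>%    # drop integers whose whole-number part already appears in Value
      c(Value) %>%                 # combine the remaining integers with Value
      sort()                       # put everything in ascending order
    

    Then take as many elements as you need with [1:totalSlots]. The combined vector always has at least totalSlots elements, so this subset never produces NA.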
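
To check the approach against the other examples in the question, the one-liner can be wrapped in a small function. This is a sketch only: `fill_index` is a hypothetical helper name, not something from dplyr or base R.

```r
# fill_index() is a hypothetical helper, not part of any package.
# Value:      numeric vector of starting index values (e.g. 2.1, 2.2)
# totalSlots: number of slots the finished index should have
fill_index <- function(Value, totalSlots) {
  # Integers 1..totalSlots not already covered by the whole-number part of Value
  candidates <- setdiff(seq_len(totalSlots), floor(Value))
  # Merge, sort, and keep only the first totalSlots entries
  sort(c(candidates, Value))[seq_len(totalSlots)]
}

ex1 <- fill_index(c(2.1, 1.2, 1.1, 2.2), 5)   # Example 1
ex2 <- fill_index(c(2.1, 2.2), 3)             # Example 2
ex3 <- fill_index(c(4.1, 4.2, 4.3), 6)        # Example 3
```

Example 1 reproduces the output shown in the question (1.1, 1.2, 2.1, 2.2, 3). Example 2 gives 1, 2.1, 2.2 and Example 3 gives 1, 2, 3, 4.1, 4.2, 4.3, so the leading gaps are filled starting from 1.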