rdataframealiasnewsequentialid

Is there an R function to sequentially assign a code to each value in a dataframe, in the order it appears within the dataset?


I have a table with a long list of aliased values like this:

> head(transmission9, 50)
# A tibble: 50 x 2
   In_Node  End_Node
   <chr>    <chr>   
 1 c4ca4238 2838023a
 2 c4ca4238 d82c8d16
 3 c4ca4238 a684ecee
 4 c4ca4238 fc490ca4
 5 28dd2c79 c4ca4238
 6 f899139d 3def184a

I would like to have R go through both columns and assign a number sequentially to each value, in the order that an aliased value appears in the dataset. I would like R to read across rows first, then down columns. For example, for the dataset above:

   In_Node  End_Node
   <chr>    <chr>   
 1  1       2
 2  1       3
 3  1       4
 4  1       5
 5  6       1
 6  7       8

Is this possible? Ideally, I'd also love to be able to generate a "key" which would match each sequential code to each aliased value, like so:

Code Value
1    c4ca4238
2    2838023a
3    d82c8d16
4    a684ecee
5    fc490ca4

Thank you in advance for the help!


Solution

  • A dplyr version

    library(tidyverse)
    
    transmission9 <- read.table(header = T, text = "   In_Node  End_Node
     1 c4ca4238 283802d3a
     2 c4ca4238 d82c8d16
     3 c4ca4238 a684ecee
     4 c4ca4238 fc490ca4
     5 28dd2c79 c4ca4238
     6 f899139d 3def184a")
    

    Do this simply

    transmission9 %>% 
      mutate(across(everything(), ~ match(., unique(c(t(cur_data()))))))
    #>   In_Node End_Node
    #> 1       1        2
    #> 2       1        3
    #> 3       1        4
    #> 4       1        5
    #> 5       6        1
    #> 6       7        8
    

    use .names argument if you want to create new columns

    transmission9 %>% 
      mutate(across(everything(), ~ match(., unique(c(t(cur_data())))),
                    .names = '{.col}_code'))
    
       In_Node End_Node In_Node_code End_Node_code
    1 c4ca4238 2838023a            1             2
    2 c4ca4238 d82c8d16            1             3
    3 c4ca4238 a684ecee            1             4
    4 c4ca4238 fc490ca4            1             5
    5 28dd2c79 c4ca4238            6             1
    6 f899139d 3def184a            7             8