r dendrogram

Using a dendrogram in R (or another method), how can I plot the IDs of individuals that share the same number?


I would like to use a dendrogram to visualize which individuals have the same number.

dat <- read.table(text="Id  Resistance.Number    
A   0405611540527610     
B   6607734653564010  
C   0405611540527610 
D   030402066763231     
E   0405611540527610  
F   4417610555422211
G   0043551103500252     
H   1043453103501272    
I   030402066763231", header=TRUE)

Solution

  • You could build a dendrogram from string similarities using stringsimmatrix from the stringdist package. Set the row and column names of the similarity matrix to the Ids so the leaves are labelled, then pass it to dist (which computes Euclidean distances between the rows of the similarity matrix) and hclust to create the dendrogram like this:

    library(stringdist)
    
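    # Pairwise string similarities (1 = identical strings)
    similarities <- stringsimmatrix(dat$Resistance.Number)
    colnames(similarities) <- dat$Id
    rownames(similarities) <- dat$Id
    
    # Cluster on distances between the rows of the similarity matrix, then plot
    d <- dist(similarities)
    hc <- hclust(d)
    dend <- as.dendrogram(hc)
    plot(dend)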
    

    Created on 2023-02-01 with reprex v2.0.2


    Data:

    I created the data like this to make sure the numbers are stored as character strings (so leading zeros are kept):

    dat <- data.frame(Id = LETTERS[1:9],
                      Resistance.Number = c("0405611540527610", "6607734653564010", "0405611540527610", "030402066763231", "0405611540527610",
                                            "4417610555422211", "0043551103500252", "1043453103501272", "030402066763231"))
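
As a possible variation on the approach above, here is a minimal sketch that uses only base R. It is an assumption on my part that edit (Levenshtein) distance via adist is an acceptable measure here, and the names edit_dist, hc and groups are just illustrative. It passes the distances straight to hclust with as.dist, and then cuts the tree at height 0 so that only individuals with identical numbers end up in the same group:

```r
# Same data as above, with the numbers stored as character strings
dat <- data.frame(
  Id = LETTERS[1:9],
  Resistance.Number = c("0405611540527610", "6607734653564010", "0405611540527610",
                        "030402066763231", "0405611540527610", "4417610555422211",
                        "0043551103500252", "1043453103501272", "030402066763231"),
  stringsAsFactors = FALSE
)

# Pairwise Levenshtein (edit) distances; identical strings have distance 0
edit_dist <- adist(dat$Resistance.Number)
dimnames(edit_dist) <- list(dat$Id, dat$Id)

# as.dist() uses the matrix entries as distances instead of recomputing them
hc <- hclust(as.dist(edit_dist), method = "average")
plot(as.dendrogram(hc))

# Cutting at height 0 keeps together only individuals with identical numbers
groups <- cutree(hc, h = 0)
split(names(groups), groups)
```

With this data, A, C and E should fall into one group and D and I into another, since those are the only exact duplicates; every other individual ends up alone. In the dendrogram these duplicates are joined at height 0, which makes them easy to spot.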