r string heatmap visualize

Visualize numerical strings as a matrix heatmap


I'm trying to visualize a matrix of numerical strings as a heatmap. Take this example of "History" numerical strings, each 36 characters long, and say I have 6 rows (I actually have 500). I want to visualize a heatmap of a 6x36 matrix of "pixels" or cells. Additionally, it would be great to sort or split the rows visually by TRUE/FALSE on the "Survive" variable.

    testdata=                   
       History                                Survive
    1  111111111111111211111111111111111111   FALSE
    2  111111111111111110000000000000000000   TRUE
    3  000111222111111111111111111111110000   FALSE
    4  111111111111111111111111100000000000   TRUE
    5  011231111111111111111111111111111111   FALSE
    6  111111234111111111111111110000000000   TRUE

Solution

  • Here is one idea. We can split the History column into single characters, then create a rowid column (one value per original row) and an ID column (position within the string) to plot the data as a heatmap.

    library(tidyverse)
    
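    # Split each History string into a list of single characters
    testdata2 <- testdata %>% mutate(History = str_split(History, pattern = ""))
    
    # One row per character; ID is the position within each original string.
    # unnest() needs an explicit `cols` argument in tidyr >= 1.0.
    testdata3 <- testdata2 %>%
      rowid_to_column() %>%
      unnest(cols = History) %>%
      group_by(rowid) %>%
      mutate(ID = row_number()) %>%
      ungroup()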
    
    p <- ggplot(testdata3, aes(x = ID, y = rowid, fill = History)) +
      geom_tile(color = "black") +
      scale_fill_brewer() +
      scale_y_reverse() +
      labs(x = "", y = "") +
      theme_minimal()
    
    print(p)
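
For comparison, the same split step can be sketched in base R without the tidyverse. This is only an illustration of the idea, not part of the answer above: the name `history_mat` is made up here, and the data frame is rebuilt inline from the question's example so the snippet runs on its own.

```r
# Example data from the question; History is kept as character
# so that leading zeros are preserved.
testdata <- data.frame(
  History = c("111111111111111211111111111111111111",
              "111111111111111110000000000000000000",
              "000111222111111111111111111111110000",
              "111111111111111111111111100000000000",
              "011231111111111111111111111111111111",
              "111111234111111111111111110000000000"),
  Survive = c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Split each string into single characters: one list element per row
chars <- strsplit(testdata$History, split = "")

# Stack into an integer matrix (hypothetical name):
# rows = observations, columns = positions within the string
history_mat <- do.call(rbind, lapply(chars, as.integer))

dim(history_mat)
```

A matrix like this could then be handed to base graphics, for example `image(t(history_mat))`, although the ggplot2 version above gives more control over tiles and legends.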
    


    If we want to facet the plot by TRUE and FALSE in the Survive column, we need to number rowid separately within each Survive group, so that the rows in each facet run 1, 2, 3, ... without gaps.

    # Number rows within each Survive group so every facet starts at 1
    testdata4 <- testdata2 %>%
      group_by(Survive) %>%
      mutate(rowid = row_number()) %>%
      unnest(cols = History) %>%
      group_by(Survive, rowid) %>%
      mutate(ID = row_number()) %>%
      ungroup()
    
    p2 <- ggplot(testdata4, aes(x = ID, y = rowid, fill = History)) +
      geom_tile(color = "black") +
      scale_fill_brewer() +
      scale_y_reverse() +
      labs(x = "", y = "") +
      theme_minimal() +
      facet_grid(~ Survive)
    
    print(p2)
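
The question also asks about sorting rather than splitting. A minimal base R sketch of that, assuming the goal is to place all FALSE rows above all TRUE rows in a single panel (the name `testdata_sorted` is made up for this example):

```r
# Example data from the question, rebuilt inline so the snippet is self-contained
testdata <- data.frame(
  History = c("111111111111111211111111111111111111",
              "111111111111111110000000000000000000",
              "000111222111111111111111111111110000",
              "111111111111111111111111100000000000",
              "011231111111111111111111111111111111",
              "111111234111111111111111110000000000"),
  Survive = c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# order() on a logical vector puts FALSE before TRUE;
# tied rows keep their original relative order.
testdata_sorted <- testdata[order(testdata$Survive), ]

testdata_sorted$Survive
rownames(testdata_sorted)
```

In the tidyverse pipeline, the equivalent would presumably be a `dplyr::arrange(Survive)` step placed before `rowid_to_column()`, so that rowid follows the sorted order.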
    


    Data

    testdata <- read.table(text =                    
        "  History                                Survive
        1  111111111111111211111111111111111111   FALSE
        2  111111111111111110000000000000000000   TRUE
        3  000111222111111111111111111111110000   FALSE
        4  111111111111111111111111100000000000   TRUE
        5  011231111111111111111111111111111111   FALSE
        6  111111234111111111111111110000000000   TRUE",
        header = TRUE, stringsAsFactors = FALSE,
        colClasses = c("numeric", "character", "logical"))
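
The `colClasses` argument matters here: without it, `read.table()` would guess that History is numeric, dropping leading zeros and losing digits. The sketch below compares the two reads on the same text; `raw_text` and `testdata_guess` are names invented for this illustration.

```r
# Same text as in the Data block above
raw_text <- "  History                                Survive
1  111111111111111211111111111111111111   FALSE
2  111111111111111110000000000000000000   TRUE
3  000111222111111111111111111111110000   FALSE
4  111111111111111111111111100000000000   TRUE
5  011231111111111111111111111111111111   FALSE
6  111111234111111111111111110000000000   TRUE"

# Explicit column classes; the first entry covers the row-name column
testdata <- read.table(text = raw_text, header = TRUE,
                       stringsAsFactors = FALSE,
                       colClasses = c("numeric", "character", "logical"))

# Letting read.table() guess the types instead (for comparison only)
testdata_guess <- read.table(text = raw_text, header = TRUE,
                             stringsAsFactors = FALSE)

class(testdata$History)        # character: strings survive intact
class(testdata_guess$History)  # numeric: no longer usable for splitting
substr(testdata$History[3], 1, 3)
```

Checking `nchar(testdata$History)` after loading is a cheap way to confirm that every string kept its full length before splitting.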