rplotpercentagerectanglesplotrix

Plotting Rectangles with Fill Gradients Determined by Group and Proportions


I am trying to plot a bunch of rectangles with the fill being color gradients determined by a proportion.

To start with, I have a Data Frame like so:

sampleID 1 2 3 4 5 ... 100
sample1  1 1 1 1 1 ... 1
sample2  1 1 1 1 1 ... 1
sample3  2 2 2 2 2 ... 2
sample4  2 2 1 1 2 ... 2
...

where the integers correspond to group assignments for 100 runs of an analysis I did. I want samples with mixed group assignments to have a color gradient (e.g., sample4 is 25% blue and 75% red).

Here's my code:


library("RColorBrewer")
library("plotrix")

# Make sampleID column rownames and remove from df
col2rownames <- function(df){
  rownames(df) <- df$sampleID
  df$sampleID <- NULL
  return(df)
}

df <- col2rownames(df)

# Get a list of frequency tables corresponding to each row in df.
df.freq <- apply(df, 1, table)

# Convert list of table() objects to list of data.frames
df.freq <- lapply(df.freq, function(x) { as.data.frame(x, stringsAsFactors = F) } )

# Make color vector
colors <- c(
  "1" = "#808080", # BXCH
  "2" = "purple4", # BXON
  "3" = "yellow3", # BXFL
  "4" = "orange1", # BXEA
  "5" = "mediumaquamarine", # BXFL second cluster
  "6" = "magenta3", # GUFL
  "7" = "blue", # GUMS
  "8" = "red", # BXMX
  "9" = "green2", # BXTT
  "10" = "#00ffff" # BXDS
)

# Subset only colors present in each df.freq data.frame
collist <- list()
collist <- lapply(df.freq, function(x) { 
  colors[x[, 1]]
})

# Convert to list of vectors
collist <- lapply(collist, as.vector)

# Get number of data frames in list
mylen <- length(df.freq)

# Plot an empty box
plot(1:mylen, type="n", axes=F)

# Initialize counters
counter_min <- 0
counter_max <- 1

# Initialize newcollist
newcollist <- list()

# Plot rectangles with color gradient
for (i in 1:length(collist)){ 
  colsubset <- c(collist[[i]])
  newcollist[[i]] <- colsubset
  gradient.rect(xleft = 0, ybottom = counter_min, xright = 5, ytop = counter_max, col = colsubset, gradient = "x")
  counter_min <- counter_max
  counter_max <- counter_min + 1
}

And here's my current output:

Rectangle Gradient Plot

However, the rectangles with >1 color are shown as 50/50, which is not the correct proportions. For example, the purple and light blue near the top are actually supposed to be 88% and 12%, respectively.

I am stuck here. Does anyone know a way to plot the color fills by proportions?

Thanks very much for your time.


Solution

  • Ok, so this is more an issue of the gradient.rect() function. It's simply not made for what you want. No matter what, it will always yield equally split rectangles.

    However, that doesn't mean you can't make your plot. You'll just have to use the good'ol rect() function and compute the splits yourself.

    I made what dummy data I could from your post...

    df <- "sampleID,1,2,3,4,5
    sample1,1,1,1,1,1
    sample2,1,1,1,1,1
    sample3,2,2,2,2,2
    sample4,2,2,1,1,2
    sample5,3,2,1,1,2
    sample6,4,4,4,1,2
    sample7,2,2,1,2,2"
    df <- read.table(text = df, h = T, sep = ",", row.names = 1)
    

    This is all unchanged :

    col2rownames <- function(df){
      rownames(df) <- df$sampleID
      df$sampleID <- NULL
      return(df)
    }
    df <- col2rownames(df)
    df.freq <- apply(df, 1, table)
    df.freq <- lapply(df.freq, function(x) { as.data.frame(x, stringsAsFactors = F) } )
    colors <- c(
      "1" = "#808080", # BXCH
      "2" = "purple4", # BXON
      "3" = "yellow3", # BXFL
      "4" = "orange1", # BXEA
      "5" = "mediumaquamarine", # BXFL second cluster
      "6" = "magenta3", # GUFL
      "7" = "blue", # GUMS
      "8" = "red", # BXMX
      "9" = "green2", # BXTT
      "10" = "#00ffff" # BXDS
    )
    collist <- list()
    collist <- lapply(df.freq, function(x) { 
      colors[x[, 1]]
    })
    collist <- lapply(collist, as.vector)
    mylen <- length(df.freq)
    

    Here is the new stuff :

    # Plot an empty box
    plot(c(0,1), c(0, mylen), type="n", axes=F)
    
    # Initialize counter (you don't really need 2 for this...)
    counter <- 0
    
    # Plot rectangles of given colors, split by given freqs
    rect_split <- function(freqs, colors, ybot, ytop, xleft = 0, xright = 1, ...){
      freqs <- freqs/sum(freqs) # norm to 1
      xpos <- c(0, cumsum(freqs)) # get splits for colors
      xpos <- (xpos - xleft)/(xright - xleft) # scale between xleft and xright
      sapply(seq_along(freqs), function(i){
        rect(xleft = xpos[i],  xright = xpos[i+1], ybottom = ybot, ytop = ytop, col = colors[i], ...)
      })
    }
    
    for (i in 1:length(collist)){ 
      cols <- c(collist[[i]])
      freqs <- df.freq[[i]][, 2] # assuming the freqs are in the order of the colors
    
      rect_split(freqs, cols, ybot = counter, ytop = counter + 1)
      counter <- counter + 1
    }
    

    This plots :

    enter image description here