r boxplot rnw

How to change the color of outliers of certain category in boxplot()?


Put simply, I want to color outliers, but only if they belong to a specific category, i.e. I want

boxplot(mydata[,2:3], col=c("chartreuse","gold"), outcol="red")

but with red used only for those elements for which mydata[,1] is M.


Solution

  • It appears that outcol accepts only one color per variable (box), so it cannot distinguish outliers by category. However, you can use points to overplot individual points in any way that you want; you only need the x and y coordinates of the points to redraw. When you make a boxplot with a statement like boxplot(mydata[,2:3]), the first variable (column 2) is plotted at x=1 and the second variable (column 3) at x=2. The y values can be recovered from the return value of boxplot: its out component holds the outlier values and its group component records which box each outlier belongs to. Since you do not provide any data, I will illustrate with randomly generated data.

    ## Data
    set.seed(42)
    NumPts = 400
    a = rnorm(NumPts)
    b = rnorm(NumPts)
    c = rnorm(NumPts)
    CAT = sample(c("M", "N"), NumPts, replace=TRUE)
    mydata = data.frame(a,b,c, CAT)
    
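    ## Find outliers
    ## BP$out pools the outliers of both boxes; BP$group says which box each came from
    BP = boxplot(mydata[,2:3], col=c("chartreuse","gold"))
    OUT2 = which(mydata[,2] %in% BP$out[BP$group == 1])
    OUT3 = which(mydata[,3] %in% BP$out[BP$group == 2])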
    
    ## Find outliers with category == M
    M_OUT2 = OUT2[which(mydata$CAT[OUT2] == "M")]
    M_OUT3 = OUT3[which(mydata$CAT[OUT3] == "M")]
    
    ## Plot desired points
    points(rep(1, length(M_OUT2)),mydata[M_OUT2, 2], col="red")
    points(rep(2, length(M_OUT3)),mydata[M_OUT3, 3], col="red")
    

    Boxplot with selected points colored
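
The same steps can be wrapped in a small function so that they work for any number of boxes. This is only a sketch: color_outliers_by_category() and its arguments (vars, category, level, highlight) are hypothetical names chosen for illustration, not part of base R. It assumes data is a data frame, vars names or indexes its numeric columns, and category is a vector with one entry per row.

```r
## Hypothetical helper (not part of base R): draws the boxplot, then
## overplots the outliers whose category equals `level` in `highlight`.
color_outliers_by_category <- function(data, vars, category, level,
                                       highlight = "red", ...) {
  ## Extra arguments such as col= are passed on to boxplot()
  bp <- boxplot(data[, vars], ...)
  flagged <- vector("list", length(vars))
  for (i in seq_along(vars)) {
    y <- data[[vars[i]]]
    ## Outlier values belonging to box i only
    out_i <- bp$out[bp$group == i]
    ## Rows that are outliers in this box and have the requested category
    idx <- which(y %in% out_i & category == level)
    if (length(idx) > 0) {
      ## Box i is drawn at x = i
      points(rep(i, length(idx)), y[idx], col = highlight)
    }
    flagged[[i]] <- idx
  }
  ## Return the row indices of the recolored points, one vector per box
  invisible(flagged)
}
```

With the data above, a call such as color_outliers_by_category(mydata, 2:3, mydata$CAT, "M", col=c("chartreuse","gold")) should reproduce the plot. Returning the row indices invisibly makes it easy to check which points were recolored.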