r plot colors openair

Plot different colours based on the conditions


This is the first 10 rows of my data frame:

head(test.data,10)
# A tibble: 10 x 5
       date             o2.permeg co2.ppm  apo        o2.spike
      <time>              <dbl>   <dbl>    <dbl>       <chr>
1  2015-01-01 00:00:00   -685.09 413.023 -354.1816        N
2  2015-01-01 00:02:00   -695.10 412.894 -364.8690        N
3  2015-01-01 00:04:00   -687.84 412.979 -357.1627        N
4  2015-01-01 00:06:00   -683.23 412.866 -353.1460        N
5  2015-01-01 00:08:00   -683.28 412.755 -353.7788        N
6  2015-01-01 00:10:00   -685.40 412.647 -356.4659        N
7  2015-01-01 00:12:00   -687.80 412.659 -358.8029        N
8  2015-01-01 00:14:00   -662.79 412.665        NA        Y
9  2015-01-01 00:16:00   -684.17 412.762 -354.6321        N
10 2015-01-01 00:18:00   -680.37 412.720 -351.0526        N

As you can see, the last column, o2.spike, holds the characters N and Y. N means that the data point is not a spike, and Y means that it is. This sample contains only one Y, but the real data frame contains many, scattered at random.

I want to plot all the data points in one plot, with those marked Y drawn in a different colour.

For reference, this is the code I currently use to plot everything. The first 3 variables are plotted in red, green, and blue, and I want the "Y" rows to be plotted in, for example, pink.

library(openair)

# Day label (YYYY-MM-DD) used to split the data into one plot per day
test.data$yr_day <- format(as.Date(test.data$date), "%Y-%m-%d")

dir.create(daily) # where "daily" is the path of the folder I want to save the plots into

for (d in unique(test.data$yr_day)) {
  # "name" is the file-name prefix, defined elsewhere
  mypath <- file.path(daily, paste(name, d, ".png", sep = ""))
  png(filename = mypath, width = 963, height = 690)
  timePlot(subset(test.data, yr_day == d),
           plot.type = "p",
           pollutant = c("co2.ppm", "o2.permeg", "apo"),
           y.relation = "free",
           date.pad = TRUE,
           pch = c(19, 19, 19),
           cex = 0.2,
           xlab = paste("Time of day in hours on", d),
           ylab = "CO2, O2, and APO concentrations",
           name.pol = c("CO2 (ppm)", "O2 (per meg)", "APO (per meg)"),
           date.breaks = 24,
           date.format = "%H:%M"
  )
  dev.off()
}

An example plot, in which the spikes have the same colour as the non-spike points, was attached as an image.

So how do I plot the spikes in a different colour from the others? Thank you very much!

Edit: As Sebastian asked, I have added the output below (I'm not sure how you will be able to extract the data from it).

dput(head(test.data,20))

structure(list(date = structure(c(1420070400, 1420070520, 1420070640, 
1420070760, 1420070880, 1420071000, 1420071120, 1420071240, 1420071360, 
1420071480, 1420071600, 1420071720, 1420071840, 1420071960, 1420072080, 
1420072200, 1420072320, 1420072440, 1420072560, 1420072680), class = c("POSIXct", 
"POSIXt"), tzone = "GMT"), o2.permeg = c(-685.09, -695.1, -687.84, 
-683.23, -683.28, -685.4, -687.8, -662.79, -684.17, -680.37,     
-684.66, -686.13, -683.27, -680.77, -682.16, -692.54, NA, NA, 
NA, NA), co2.ppm = c(413.023, 412.894, 412.979, 412.866, 412.755, 
412.647, 412.659, 412.665, 412.762, 412.72, 412.692, 412.71, 
412.757, 412.838, 412.922, 413.019, NA, NA, NA, NA), apo = c(-354.181646778043, 
-364.868973747017, -357.162673031026, -353.145990453461, -353.778806682578, 
-356.465871121718, -358.802863961814, NA, -354.632052505966, 
-351.052577565632, -355.489594272076, -356.86508353222, -353.75830548926, 
-350.833007159904, -351.781957040573, -361.652649164678, NA, 
NA, NA, NA), o2.spike = c("N", "N", "N", "N", "N", "N", "N", 
"Y", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N", "N"
)), .Names = c("date", "o2.permeg", "co2.ppm", "apo", "o2.spike"
), row.names = c(NA, -20L), class = c("tbl_df", "tbl", "data.frame"
))

Solution

  • Without a reproducible sample of the data it is hard to give a tested answer, but a ggplot2 solution could be:

    library(ggplot2)
    # Colour each point by the value of the o2.spike column ("N" or "Y")
    g1 <- ggplot(data=test.data, aes(x=date, y=o2.permeg, col=o2.spike)) + geom_point()
    g1
    

    Passing a column of the data frame to the "col" argument inside "aes" maps each distinct value in that column to its own colour. ggplot2 also adds a legend that pairs each value with its colour.

    I tried this with another data frame ("iris", which ships with R in the datasets package) and it worked. I hope it helps.
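
    The question asks for a specific colour (pink) for the "Y" rows, whereas the mapping above uses ggplot2's default palette. One way to choose the colours yourself is scale_colour_manual() with a named vector. The sketch below is only an illustration: the data frame "demo" is a small stand-in for test.data with simplified timestamps, and the colour names "blue" and "deeppink" are arbitrary choices.

    ```r
    library(ggplot2)

    # Small stand-in for test.data (hypothetical; timestamps simplified)
    demo <- data.frame(
      date = as.POSIXct("2015-01-01 00:00:00", tz = "GMT") +
        seq(0, by = 120, length.out = 6),
      o2.permeg = c(-685.09, -695.10, -687.84, -662.79, -684.17, -680.37),
      o2.spike = c("N", "N", "N", "Y", "N", "N"),
      stringsAsFactors = FALSE
    )

    # Named vector: names are the values of o2.spike, entries are the colours
    spike_cols <- c(N = "blue", Y = "deeppink")

    g <- ggplot(demo, aes(x = date, y = o2.permeg, colour = o2.spike)) +
      geom_point(size = 0.8) +
      scale_colour_manual(values = spike_cols, name = "Spike?")
    print(g)
    ```

    Because the vector is named, the colours stay attached to "N" and "Y" even on a day where only one of the two values occurs.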

    Edit:

    To show the three variables together, you can create 3 plots with ggplot and then stack them with the function plot_grid() provided by the "cowplot" package.

    library(ggplot2)
    library(cowplot)
    # One plot per variable, each coloured by o2.spike
    g1 <- ggplot(data=test.data, aes(x=date, y=o2.permeg, col=o2.spike)) + geom_point()
    g2 <- ggplot(data=test.data, aes(x=date, y=co2.ppm, col=o2.spike)) + geom_point()
    g3 <- ggplot(data=test.data, aes(x=date, y=apo, col=o2.spike)) + geom_point()
    # Stack the three plots in a single column
    plot_grid(g1, g2, g3, nrow=3, ncol=1)
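
    An alternative to building three separate plots is to reshape the data to long format and let facet_wrap() draw one panel per variable with free y-axes, which is roughly what y.relation = "free" does in the question's timePlot call. The sketch below also mirrors the question's one-file-per-day loop using ggsave(). It is a rough outline under assumptions: "demo" is a made-up stand-in for test.data, "out_dir" and the "demo_" file prefix are hypothetical, and the colours are arbitrary.

    ```r
    library(ggplot2)

    # Stand-in for test.data covering two days (illustrative values only)
    demo <- data.frame(
      date = as.POSIXct(c("2015-01-01 00:00:00", "2015-01-01 00:02:00",
                          "2015-01-01 00:04:00", "2015-01-02 00:00:00",
                          "2015-01-02 00:02:00", "2015-01-02 00:04:00"),
                        tz = "GMT"),
      o2.permeg = c(-685.09, -662.79, -687.84, -683.23, -683.28, -685.40),
      co2.ppm   = c(413.023, 412.665, 412.979, 412.866, 412.755, 412.647),
      apo       = c(-354.18, NA, -357.16, -353.15, -353.78, -356.47),
      o2.spike  = c("N", "Y", "N", "N", "N", "N"),
      stringsAsFactors = FALSE
    )
    demo$yr_day <- format(demo$date, "%Y-%m-%d", tz = "GMT")

    # Column name -> panel label (labels taken from the question's name.pol)
    vars <- c(co2.ppm = "CO2 (ppm)", o2.permeg = "O2 (per meg)",
              apo = "APO (per meg)")

    # Reshape to long format: one row per timestamp and variable
    long <- do.call(rbind, lapply(names(vars), function(v) {
      data.frame(date = demo$date, yr_day = demo$yr_day,
                 o2.spike = demo$o2.spike, variable = vars[[v]],
                 value = demo[[v]], stringsAsFactors = FALSE)
    }))

    out_dir <- tempdir()  # hypothetical output folder ("daily" in the question)
    files <- character(0)

    for (d in unique(long$yr_day)) {
      # Keep this day's rows and drop missing values before plotting
      p <- ggplot(subset(long, yr_day == d & !is.na(value)),
                  aes(x = date, y = value, colour = o2.spike)) +
        geom_point(size = 0.8) +
        facet_wrap(~ variable, ncol = 1, scales = "free_y") +
        scale_colour_manual(values = c(N = "grey40", Y = "deeppink")) +
        scale_x_datetime(date_labels = "%H:%M") +
        labs(x = paste("Time of day in hours on", d), y = NULL,
             colour = "Spike?")
      f <- file.path(out_dir, paste0("demo_", d, ".png"))
      # 9.63 x 6.9 inches at 100 dpi gives 963 x 690 pixels, as in the question
      ggsave(f, plot = p, width = 9.63, height = 6.9, dpi = 100)
      files <- c(files, f)
    }
    ```

    Note that apo is NA on the spike row in the sample data, so the pink point appears only in the O2 and CO2 panels.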