r select dplyr purrr multiple-matches

Dplyr select_ and starts_with on multiple values in a variable list part 2


This is a continuation from my question earlier: Dplyr select_ and starts_with on multiple values in a variable list

I am collecting data from different sensors in various locations; the data output looks something like:

df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

The problem is (I think) similar to: Using select_ and starts_with R or select columns based on multiple strings with dplyr

I want to select sensors, for example by location, so I keep a list of names to search for in the data frame, together with the timestamp column. The search falls apart when I look for more than one sensor (or type of sensor, etc.). Is there a way to achieve this with dplyr (NSE or SE)?

FindLocation = c("date", "Sensor1", "Sensor2")
df %>% select(matches(paste(FindLocation, collapse="|"))) # works but picks up "Sensor1a" and "DewPoint" and "Humidity" data from Sensor2 

Also I want to add mixed searches such as:

 FindLocation = c("Sensor1", "Sensor2") # without selecting "Sensor1a"
 FindSensor = c("Temp", "Pressure") # without selecting "DewPoint" or "Humidity"

I am hoping the select can combine FindSensor with FindLocation and pick the Temp and Pressure data for Sensor1 and Sensor2 (without selecting Sensor1a), returning a data frame with the data and these column headings:

date, Sensor1 Temp, Sensor1 Pressure, Sensor2 Temp, Sensor2 Pressure

Many thanks again!


Solution

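  • Some functions from purrr are going to be useful. First, use cross2 to compute the Cartesian product of FindLocation and FindSensor, which gives a list of (location, sensor) pairs. Then use map_chr to apply paste to each pair, joining the location and sensor strings with a dot (.); the dot is needed because data.frame() converts the space in a name like "Sensor1 Temp" into a dot, so the actual column is Sensor1.Temp. Finally, use the one_of helper to select the columns by their exact names. Note that cross2 is deprecated as of purrr 1.0.0, and one_of has been superseded by any_of and all_of in recent dplyr versions.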

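    library(dplyr)  # select() and %>%
    library(purrr)  # cross2() and map_chr()
    
    FindLocation = c("Sensor1", "Sensor2")
    FindSensor = c("Temp", "Pressure")
    
    # Every location/sensor pair, pasted into an exact column name such as "Sensor1.Temp"
    columns = cross2(FindLocation, FindSensor) %>%
      map_chr(paste, collapse = ".")
    
    # Keep the timestamp column alongside the matched sensor columns
    df %>% select(date, one_of(columns))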
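
    If you would rather not depend on purrr, the same idea can be sketched with base R's expand.grid, or with a single anchored regular expression. This is only a sketch under some assumptions: it assumes dplyr 1.0.0 or later (for all_of), the intermediate names grid, pattern, by_name and by_regex are made up for illustration, and the sample data is copied from the question.

```r
library(dplyr)

# Sample data from the question. data.frame() converts the spaces in the
# column names to dots, so "Sensor1 Temp" becomes Sensor1.Temp.
df <- data.frame(
  date = c(2011, 2012, 2013, 2014, 2015),
  "Sensor1 Temp" = c(15, 18, 15, 14, 19),
  "Sensor1 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor1a Temp" = c(15, 18, 15, 14, 19),
  "Sensor1a Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 Temp" = c(15, 18, 15, 14, 19),
  "Sensor2 Pressure" = c(1001, 1000, 1002, 1004, 1000),
  "Sensor2 DewPoint" = c(10, 11, 10, 9, 12),
  "Sensor2 Humidity" = c(90, 100, 90, 100, 80)
)

FindLocation <- c("Sensor1", "Sensor2")
FindSensor <- c("Temp", "Pressure")

# Option 1: build the exact column names from the Cartesian product.
# expand.grid() varies its first argument fastest, so putting sensor first
# keeps the columns grouped by location.
grid <- expand.grid(sensor = FindSensor, location = FindLocation,
                    stringsAsFactors = FALSE)
columns <- paste(grid$location, grid$sensor, sep = ".")

# all_of() raises an error if a name is missing; any_of() would skip it silently.
by_name <- df %>% select(date, all_of(columns))

# Option 2: one regular expression anchored at both ends, so "Sensor1"
# cannot match "Sensor1a" and "Temp" cannot match a longer suffix.
pattern <- paste0(
  "^(", paste(FindLocation, collapse = "|"), ")\\.(",
  paste(FindSensor, collapse = "|"), ")$"
)
by_regex <- df %>% select(date, matches(pattern))

names(by_name)
names(by_regex)
```

    Building exact names is the stricter of the two, since nothing outside the list can slip through. The regex version is shorter, but it relies on the location and sensor strings containing no regex metacharacters.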