rarraysstringstring-conversion

String conversion to array: Opening hours (over a week)


I've done an OSM-extraction and here you can see the column "osm_openin" for the opening hours for each object in R. It has the following structure:

enter image description here

I would love to have new columns for each day of the week, with a symbol "X" - if it is not open all day - or the according opening hours for the day "07:00 - 21:00".

My solution:

Firstly, I am thinking of using representative values for the week days "Mo = 1", "Tu = 2"..."Su = 7". It is important, if the day/value itself is not explicitly mentioned, but is exisiting in an intervall.

For each value, I am searching its existence in the column. If it finds the value, I'll take the opening hours following directly after (don't know which R command to use for that) If not, then the value has to be in an intervall. For example "2" (Tuesday) is not existing, then the script needs to realize Tuesday is between Mo-Sa. (don't know which method to use for that).

Public Holiday is not important.

Any suggestion for a solution?

Thanks.


Solution

  • I don't know the best way, but may be I can help you. Firstly we need to create array of weekdays:

    wdays <- c("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")
    

    Now let's write code for converting text from "Mo,We-Fr" to vector c(1, 3, 4, 5). Algorithm:

    1. Delete information about holidays ("PH", "SH");
    2. Replace name of weekday with number ("Mo" --> 1, "Tu" --> 2, etc.);
    3. Replace - with :. For example, 3-5 will be 3:5 and it is R-style code;
    4. Add c( to the beginning and ) to the end. For example, 1,3:5 will be c(1, 3:5);
    5. c(1, 3:5) is R-style vector and we can create vector by text (eval(parse(text = "c(1, 3:5)"))).

    Full code:

    GetWDays <- function(x, wdays) {
        holi <- c("PH", "SH")
        x <- gsub(paste0("(,|^)", holi, collapse = "|"), "", x) #delete holidays
        
        for (i in 1:7) {
            x <- gsub(wdays[i], i, x)
        }
        
        x <- gsub("-", ":", x)
        x <- paste0("c(", x, ")")
        
        wday_idx <- eval(parse(text = x))
        return(wday_idx)
    }
    

    Let's create function that has opening hours (like "Mo-Fr 6:30-19:00;Sa 09:00-17:00;Su,PH 09:00-15:00") as input and returns data.frame with 7 columns (for each weekday). Algorithm:

    1. Split text by ;; Now we will work with one part of text (for example, "Mo-Fr 6:30-19:00");
    2. Split text by (space); "Mo-Fr 6:30-19:00" --> "Mo-Fr" and "6:30-19:00"
    3. First part ("Mo-Fr") we put into GetWDays and we make vector from second part (it's size will be like as first part size). Example: "Mo-Fr" --> c(1,2,3,4,5), "6:30-19:00" --> rep("6:30-19:00", 5);
    4. Make data.frame from 2 vectors (Day and Time);
    5. Use bind_rows for each part from first step. Now we have big data.frame, but some weekdays may be missing, and some weekdays may have "Off" in column Time;
    6. So add rows for missing weekdays (by merge) and replace "Off" and NA with "X" (as you want);
    7. Transpose data.frame and return

    Full code:

    GetTimetable <- function(x) {
        wdays <- c("Mo", "Tu", "We", "Th", "Fr", "Sa", "Su")
        
        
        tmp <- strsplit(strsplit(x, ";")[[1]], " ")
        tmp <- lapply(tmp, function(x) {Day <- GetWDays(x[1], wdays); data.frame(Day, Time = rep(x[2], length(Day)))})
        tmp <- bind_rows(tmp) %>% arrange(Day) %>% as.data.frame()
        tmp <- merge(data.frame(Day = 1:7), tmp, all.x = T, by = "Day")
        tmp$Time[is.na(tmp$Time) | tmp$Time == "Off"] = "X"
        
        tmp <- tmp %>% t() %>% "["(2, ) %>% as.list() %>% setNames(wdays) %>% bind_cols()
        return(tmp)
    }
    

    If you want to apply GetTimetable for each row you can use this code:

    df_time <- df$osm_openning %>% lapply(GetTimetable) %>% bind_rows()
    

    And if you want to add this data.frame to your data you can do something like this:

    df <- bind_cols(df, df_time)