I have weather data recorded at 15-minute intervals in R, and I need to format the data such that each day's
data is followed by two blank rows
with the corresponding date
written on the top
. Here's a glimpse of the required
output:
Desired export
file format
I have a dataframe (df) with columns date, temperature, precipitation, and wind_speed and looks like this. How can I achieve this formatting in R?
Here is the reproducible example:
df <- structure(list(date = structure(c(1401104700, 1401105600, 1401106500,
1401107400, 1401108300, 1401148800, 1401149700, 1401150600, 1401151500,
1401152400, 1401235200, 1401236100, 1401237000, 1401237900, 1401238800
), class = c("POSIXct", "POSIXt"), tzone = "UTC"), temperature = c(25,
25.2, 25.3, 25.1, 25.4, 18.6, 18.3, 18.2, 18.2, 18.2, 19.7, 19.1,
18.7, 18.5, 18.3), precipitation = c(0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0), wind_speed = c(1.006, 1.006, 0.9054, 0.6036,
0.4024, 0.1006, 0.2012, 0.503, 0.1006, 0, 0, 0.1006, 0.2012,
0.1006, 0.2012)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-15L))
I first create a function that can take the data for one day from your data frame and add the desired rows at the beginning of that block. It also converts the dates to the format you show in your question. I use the function paste_comma()
to add commas after all the values.
paste_comma <- function(x) paste0(x, ",")
add_header <- function(df) {
day <- format(df$date[1], "%d_%m_%y")
header <- tibble(
date = c("", "", day, "Time,"),
temperature = c("", "", "", "Temp. (f),"),
precipitation = c("", "", "", "Precipitation,"),
wind_speed = c("", "", "", "Wind Speed,")
)
df <- df %>%
mutate(date = format(date, format = "%Y%m%d%H%M PST")) %>%
mutate(across(everything(), paste_comma))
bind_rows(header, df)
}
Next, I group the data frame by day and apply the function to each group. I use reframe()
because I want to return multiple rows per group.
df_new <- df %>%
group_by(day = floor_date(date, "day")) %>%
reframe(add_header(pick(everything()))) %>%
select(-day)
When you write the tsv file, make sure to omit the headers:
write_tsv(df_new, "data.tsv", col_names = FALSE)
This is how the output looks in Excel: