I have a data frame with columns of years, months, days, and hours. I want to add a column in which each row holds the POSIXlt object defined by the years, months, days, and hours on that row.
The conversion for each row is straightforward, for example:
library(dplyr)
library(stringr)
library(lubridate)
years <- 2022
months <- 2
day <- 25
hours <- 13
pos_times <- c(years, months, day, hours) %>%
  str_c(collapse = " ") %>%
  ymd_h
pos_times %>%
  str
pos_times
And that yields the following correct output:
> pos_times %>%
+ str
POSIXct[1:1], format: "2022-02-25 13:00:00"
> pos_times
[1] "2022-02-25 13:00:00 UTC"
To produce the new column as a vector for more than one set of years, months, days, and hours, I use the following code:
df <- data.frame(years = c(2010, 2011),
                 month = c(11, 12),
                 day = c(1, 2),
                 hour = c(3, 5))
N <- nrow(df)
vec_time <- rep(NA, N)
for(i in 1:N){
  pos_time <- (df[i, 1:4]) %>%
    str_c(collapse = " ") %>%
    ymd_h
  print(paste("Structure of calculated object for row number", i))
  pos_time %>% str
  vec_time[i] <- pos_time
}
print("Structure of vector of calculated objects")
vec_time %>%
  str
Its output is wrong:
[1] "Structure of calculated object for row number 1"
POSIXct[1:1], format: "2010-11-01 03:00:00"
[1] "Structure of calculated object for row number 2"
POSIXct[1:1], format: "2011-12-02 05:00:00"
> print("Structure of vector of calculated objects")
[1] "Structure of vector of calculated objects"
> vec_time %>%
+ str
num [1:2] 1.29e+09 1.32e+09
In the calculation for each row, pos_time is again correctly shown as a POSIXlt object, but the values of the vector vec_time are numeric.
I realise that a POSIXlt object is made from just a number, but I want my data frame to show the POSIXlt objects as such.
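The ymd_h() function returns date-time objects of class POSIXct (not POSIXlt), which are stored as numeric values representing the number of seconds since the Unix epoch (January 1, 1970, 00:00:00 UTC). Because vec_time is created with rep(NA, N), it is a plain vector with no date-time class, so assigning pos_time into one of its elements keeps only the underlying number. If you store the result of format() instead, the column shows readable date-times, although as character strings rather than date-time objects.
Change your code to this: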
N <- nrow(df)
vec_time <- rep(NA_character_, N)
for(i in 1:N){
  pos_time <- ymd_h(paste(df[i, 1:4], collapse = "-"), tz = "UTC")
  vec_time[i] <- format(pos_time, format = "%Y-%m-%d %H:%M:%S")
}
df$datetime <- vec_time
df
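If the column should hold date-time objects rather than character strings, one option is to keep the loop but preallocate vec_time as a POSIXct vector, so that element-wise assignment does not drop the class. A minimal sketch, assuming the example data from the question and UTC as the time zone:

```r
library(lubridate)

# Same example data as in the question
df <- data.frame(years = c(2010, 2011),
                 month = c(11, 12),
                 day   = c(1, 2),
                 hour  = c(3, 5))

N <- nrow(df)

# Preallocate a POSIXct vector of missing values instead of rep(NA, N),
# so that assigning into vec_time[i] keeps the date-time class
vec_time <- as.POSIXct(rep(NA, N), tz = "UTC")

for (i in seq_len(N)) {
  # Build a string such as "2010-11-1-3" and parse it as year-month-day-hour
  stamp <- paste(df$years[i], df$month[i], df$day[i], df$hour[i], sep = "-")
  vec_time[i] <- ymd_h(stamp, tz = "UTC")
}

df$datetime <- vec_time
str(df$datetime)
```

With this change str() should report the column as POSIXct rather than num, and the data frame prints the values as date-times.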
I would suggest this:
library(dplyr)
library(lubridate)
df %>%
  mutate(datetime = ymd_h(paste(years, month, day, hour, sep = "-")))
  years month day hour            datetime
1  2010    11   1    3 2010-11-01 03:00:00
2  2011    12   2    5 2011-12-02 05:00:00
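The same column can also be built directly from the numeric parts, without pasting strings together first. The sketch below assumes lubridate's make_datetime() and UTC as the time zone. Its last lines use as.POSIXlt() in case the list-based POSIXlt representation is what is actually needed:

```r
library(lubridate)

# Same example data as in the question
df <- data.frame(years = c(2010, 2011),
                 month = c(11, 12),
                 day   = c(1, 2),
                 hour  = c(3, 5))

# make_datetime() is vectorised: it takes whole columns and returns POSIXct
df$datetime <- make_datetime(year = df$years, month = df$month,
                             day = df$day, hour = df$hour, tz = "UTC")

str(df$datetime)

# Convert to POSIXlt only where the individual components are needed
lt <- as.POSIXlt(df$datetime)
lt$hour
```

Keeping the column as POSIXct and converting to POSIXlt on demand is usually the simpler choice, since POSIXct is a plain numeric vector with a class attribute and fits naturally into a data frame column.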