I have one data frame with more than 1000 rows, and I have one column duracao
which I want to sum the values that have there HH:MM
, and resulting also the value as HH:MM
.
df <- data.frame(duracao = c("00:29", "00:26", "00:38"))
#The sum should be 01:33
First I tried:
class(df$duracao)
>character
df$duracao<-as.numeric(df$duracao)
df <- df %>%
mutate(duracao = seconds_to_period(duracao * 86400))
df <- df %>% #aqui
mutate(duracao = sprintf("%02d:%02d", hour(duracao), minute(duracao)))
total_duracao <- sum(df$duracao)
Then second try:
df$duracao <- as.POSIXct(df$duracao, format = "%H:%M", tz = "UTC")
df <- df %>%
mutate(duracao_sec = as.numeric(duracao - as.POSIXct("1970-01-01", tz = "UTC")))
total_seconds <- sum(df$duracao_sec)
total_hours <- total_seconds %/% 3600
total_minutes <- (total_seconds %% 3600) %/% 60
total_duracao <- sprintf("%02d:%02d", total_hours, total_minutes)
print(total_duracao)
Update
I found these weird values in the column
I ran this code
which(!grepl("\\d{2}:\\d{2}", df$duracao))
And the values were:
-1:-20 , -1:-00 and -1:-11
it should be 1:20, 1:00 and 1:11, how can I change it, please?
library(dplyr)
library(lubridate)
df |>
mutate(duracao = gsub("-", "", duracao, fixed = TRUE),
period = as.duration(hm(duracao))) |>
summarize(period = seconds_to_period(sum(period)),
hh_mm = paste0(sum(day(period * 24) + hour(period)), ":", minute(period)))
# period hh_mm
# 1 1H 33M 0S 1:33
If you don't want the values in a data frame then you can simply use this pipe chain (and ignore the library(dplyr)
statement):
library(lubridate)
df$duracao |>
gsub("-", "", x = _, fixed = TRUE) |>
hm() |>
as.duration() |>
sum() |>
seconds_to_period()
Note that this returns a Period object. This makes it easy to extract components if you need:
total_duracao <- df$duracao |>
gsub("-", "", x = _, fixed = TRUE) |>
hm() |>
as.duration() |>
sum() |>
seconds_to_period()
hour(total_duracao)
# [1] 1
minute(total_duracao)
# [1] 33
second(total_duracao)
# [1] 0