I am fairly new to R and currently working on visualizing some health data (BPM, intervals and so on). I have roughly 5 days' worth of data. So far I have had no problem plotting graphs of the progress throughout the day using ggplot (see fig. 1). Here's my code for that:
twentyeight %>%
  ggplot(aes(x = Date28, y = as.numeric(BPM28), group = 1)) +
  geom_line() +
  scale_x_datetime(
    date_breaks = "1 hour",
    date_labels = "%d.%m.%Y %H:%M",
    name = "Date"
  ) +
  scale_y_continuous(name = "BPM")
As I have ~1 data entry per second, and therefore thousands over a day, I am struggling to make the graph more visually pleasing. I'd like to stretch the x-axis wider to show the finer details of the plot (even though I know this affects the ratio of the plot). I have tried using aspect.ratio like this:
+ theme(axis.text.x = element_text(angle = 90, vjust = 0.5), aspect.ratio = 0.2)
but this only results in a smaller plot. I have also tried using coord_fixed(ratio = ...), but I'm getting an error of non-compatible types, ggplot and date. Is choosing a smaller timeframe the only way?
Up front: the question is very much about data visualization, which is inherently more art than strict programming and therefore prone to opinion. I offer this answer merely as a very short list of data-vis suggestions; I don't think they are necessarily the best or the only options.
Here are some thoughts. For reference, I'm using my heart rate over 5 days as captured by my Garmin watch. (I apologize for not sharing the data.) Most of the data is every second, but there are a few instances of HR=0 (though I'm still alive) and some gaps; I'm filtering out the former and keeping the latter for demonstration.
str(quux)
# tibble [5,349 × 2] (S3: tbl_df/tbl/data.frame)
# $ timestamp : POSIXct[1:5349], format: "2025-06-01 16:59:00" "2025-06-01 17:00:00" "2025-06-01 17:01:00" "2025-06-01 17:02:00" ...
# $ heart_rate: int [1:5349] 52 54 50 49 50 51 47 49 46 47 ...
A quick plot without any effort:
library(dplyr)    # mutate(), reframe(), tibble(), consecutive_id()
library(ggplot2)
ggplot(quux, aes(timestamp, heart_rate)) +
  geom_line()
As mentioned, there are a few things we can try to pretty-up here. I'll add two forms of smoothing: one using a rolling mean and plotting it literally, one using geom_smooth(). Some of the parameters are just "from the hip"; feel free to find values that work well with your data.
quux |>
  mutate(smooth_HR = zoo::rollapply(heart_rate, 20, FUN = mean, align = "center", fill = NA)) |>
  ggplot(aes(timestamp, heart_rate)) +
  geom_line(color = "gray80") +
  geom_line(aes(y = smooth_HR), na.rm = TRUE) +
  geom_smooth(method = "gam", formula = y ~ splines::ns(x, 25))
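To make the rolling-mean step concrete, here is a toy sketch of a centered moving average. It uses base R's stats::filter() as a stand-in for the zoo::rollapply() call above (an assumption, made only to keep the example free of extra packages), with a made-up vector and a window of 3 rather than 20. The ends come back as NA because there is no full window there, which is why the smoothed line is drawn with na.rm = TRUE.

```r
# Made-up heart-rate readings with one spike in the middle
hr <- c(50, 52, 54, 80, 54, 52, 50)

# Centered moving average over 3 points; sides = 2 centers the window,
# and positions without a full window are returned as NA
smooth_hr <- as.numeric(stats::filter(hr, rep(1 / 3, 3), sides = 2))

round(smooth_hr, 1)
# values: NA 52.0 62.0 62.7 62.0 52.0 NA
```

The single spike of 80 is spread over its neighbours, so a wider window trades detail for a calmer line.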
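There are a couple of gaps in the data (e.g., June 6). Let's break the lines wherever the gap exceeds 30 seconds so that they aren't bridged:
quux |>
  # cumsum() bumps the group id once per gap, so the first reading after a gap
  # stays with the readings that follow it. diff() on POSIXct picks its units
  # automatically; check units(diff(timestamp)) against your threshold.
  mutate(grp = cumsum(c(FALSE, diff(timestamp) > 30))) |>
  mutate(.by = grp, smooth_HR = zoo::rollapply(heart_rate, 20, FUN = mean, align = "center", fill = NA)) |>
  ggplot(aes(timestamp, heart_rate, group = grp)) +
  geom_line(color = "gray80") +
  geom_line(aes(y = smooth_HR), na.rm = TRUE) +
  geom_smooth(method = "gam", formula = y ~ splines::ns(x, 25))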
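A toy check of the gap-splitting idea, in base R with made-up timestamps. This sketch rests on two choices of mine: it converts the gaps to seconds explicitly (diff() on POSIXct chooses its units automatically, so a bare comparison against 30 may mean seconds or minutes depending on the data), and it builds the group id with cumsum(), which increments once per gap.

```r
# Made-up timestamps: three readings one second apart, a 45-second gap, two more
stamps <- as.POSIXct("2025-06-01 12:00:00", tz = "UTC") + c(0, 1, 2, 47, 48)

# Pin the unit so the threshold always means seconds
gap_secs <- as.numeric(diff(stamps), units = "secs")

# TRUE marks the first reading after a gap; cumsum() turns that into group ids
grp <- cumsum(c(FALSE, gap_secs > 30)) + 1L
grp
# 1 1 1 2 2
```

Each group is then drawn as its own line segment when grp is mapped to the group aesthetic.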
I can apply your theme for the x-axis, though (1) I think having the date in there may be a waste of screen real estate, and (2) even ignoring the vertical compression it causes, I don't think it adds much value for picking out values and/or trends. Faceting by day may help break things out. I'm also introducing alldays as a not-to-be-plotted layer (via geom_blank()) so that the days with partial data still span full days. I'm also changing the theme a bit; this can take more attention depending on your preferences and your data.
alldays <- tibble(date = seq(min(as.Date(quux$timestamp)), max(as.Date(quux$timestamp)), by = "1 day")) |>
  reframe(.by = date, timestamp = as.POSIXct(paste(date, c("00:00:00", "23:59:59")), tz = "UTC"),
          heart_rate = NA_real_)
quux |>
  mutate(
    date = as.Date(timestamp),
    # cumsum() bumps the group id once per gap (see units(diff(timestamp)) for the unit)
    grp = cumsum(c(FALSE, diff(timestamp) > 30))
  ) |>
  mutate(.by = grp, smooth_HR = zoo::rollapply(heart_rate, 20, FUN = mean, align = "center", fill = NA)) |>
  ggplot(aes(timestamp, heart_rate, group = grp)) +
  facet_wrap(date ~ ., scales = "free_x", ncol = 1, strip.position = "left") +
  geom_line(color = "gray80") +
  geom_line(aes(y = smooth_HR), na.rm = TRUE) +
  geom_smooth(method = "gam", formula = y ~ splines::ns(x, 25)) +
  geom_blank(aes(group = NULL), data = alldays) +
  scale_x_datetime(name = NULL, date_labels = "%H:%M:%S") +
  scale_y_continuous(name = "BPM") +
  theme_bw() +
  theme(panel.grid.minor.y = element_blank())
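On the original wish for a physically wider plot: as far as I can tell, aspect.ratio only constrains the panel within whatever space the graphics device offers, which would explain why it shrank the plot instead of stretching it. One option is to make the device itself wide when saving. A minimal sketch with ggsave(), using made-up stand-in data; the file name and the 20 x 4 inch size are arbitrary assumptions to adjust to taste.

```r
library(ggplot2)

# Hypothetical stand-in data: one reading per minute for a single day
toy <- data.frame(
  timestamp  = as.POSIXct("2025-06-01 00:00:00", tz = "UTC") + 60 * (0:1439),
  heart_rate = 60 + 10 * sin((0:1439) / 60)
)

p <- ggplot(toy, aes(timestamp, heart_rate)) +
  geom_line() +
  scale_x_datetime(name = NULL, date_breaks = "1 hour", date_labels = "%H:%M")

# width and height are in inches by default; a wide, short canvas
# gives the x-axis more room without touching the theme
out <- file.path(tempdir(), "bpm_wide.png")
ggsave(out, plot = p, width = 20, height = 4, dpi = 150)
```

The same width and height arguments apply to the faceted plot, where a taller canvas may be needed to fit one panel per day.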
Some wrap-up notes: the two smoothers shown here are not the only options. There are others (e.g., stats::smooth), and I'm confident there may be more appropriate methods to match the variability of this data and/or to achieve the results you want.