I downloaded discharge data for a river from a government website. The date and time are formatted as shown in the data sample below.
This is my code:
library(ggplot2)
ggplot(CHEM_RESULTS, aes(x = `Date and Time`, y = `Discharge (cumec)`, group = 1)) +
  geom_line(color = "powderblue", size = 1, alpha = 0.9, linetype = 1)
It produced this graph (see graph).
DATA SAMPLE:
head(CHEM_RESULTS)
Date and Time                    Discharge (cumec)
<chr>                            <dbl>
2024-03-05T00:00:01.000+10:00    3.202
2024-03-05T00:35:01.000+10:00    3.124
2024-03-05T01:00:01.000+10:00    3.040
2024-03-05T01:30:01.000+10:00    2.956
2024-03-05T02:00:01.000+10:00    2.919
2024-03-05T03:00:01.000+10:00    2.867
I think that because the date and time strings are so long and there are so many entries (1896), the x axis shows a solid bar of overlapping labels instead of readable time stamps. I do not need every time stamp shown, but some date/time points are needed to provide context. I expect it may be difficult to reformat the date/time data as supplied by the government, again given how many entries there are.
I also need to overlay other water quality data onto the graph, e.g. pH at 4 sites and 4 different time periods. Once I put these points onto the graph, will it highlight them? That would be useful for showing only the necessary date and time information.
Any help on how to approach this is greatly appreciated.
Thank you!
In short: I tried to make a line graph of river discharge and got a bar on the x axis instead of time stamps.
I've generated a similar data structure using discharge data from the Montjean station between 01-01-2022 and 12-31-2023 (source: GRDC).
First lines (head(data)):
# A tibble: 6 × 2
`Date and Time` `Discharge (cumec)`
<chr> <dbl>
1 2022-01-02T09:00:00.000+10:00 1928.
2 2022-01-03T09:00:00.000+10:00 2042.
3 2022-01-04T09:00:00.000+10:00 2161.
4 2022-01-05T09:00:00.000+10:00 2274.
5 2022-01-06T09:00:00.000+10:00 2227.
6 2022-01-07T09:00:00.000+10:00 2052.
To reproduce your error:
### Packages
library(dplyr)
library(lubridate)
library(ggplot2)
### Plot the graph without specifying breaks for the x axis
ggplot(data, aes(x = `Date and Time`, y = `Discharge (cumec)`, group = 1)) +
  geom_line(color = "powderblue", linewidth = 1, alpha = 0.9, linetype = 1)
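The bar appears because the column is stored as character, so ggplot2 falls back to a discrete x scale and tries to print one label per distinct string. A minimal check, using a few rows copied from the question (the name `sample_data` is only for illustration):

```r
# A few rows copied from the question; `sample_data` is a hypothetical name
sample_data <- data.frame(
  `Date and Time` = c("2024-03-05T00:00:01.000+10:00",
                      "2024-03-05T00:35:01.000+10:00",
                      "2024-03-05T01:00:01.000+10:00"),
  `Discharge (cumec)` = c(3.202, 3.124, 3.040),
  check.names = FALSE  # keep the column names with spaces as they are
)

# The time stamps are plain text, not a date-time class
column_class <- class(sample_data$`Date and Time`)
print(column_class)

# Each distinct string would become its own axis label
n_labels <- length(unique(sample_data$`Date and Time`))
print(n_labels)
```

With 1896 rows, that means up to 1896 overlapping labels, which is what shows up as a solid bar.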
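To fix this, convert the character column to a date-time class so that ggplot2 uses a continuous date-time scale instead of one label per string:
### Transform the first column to POSIXct date-time
### ("Etc/GMT-10" is the fixed UTC+10 zone; the sign is inverted in these zone names)
data <- data %>%
  mutate(`Date and Time` = ymd_hms(`Date and Time`, tz = "Etc/GMT-10"))
### Plot the graph with ggplot2 using `scale_x_datetime` and `date_breaks`
### (for a shorter record, use a finer interval such as date_breaks = "1 week")
ggplot(data, aes(x = `Date and Time`, y = `Discharge (cumec)`, group = 1)) +
  scale_x_datetime(date_breaks = "3 months", date_labels = "%b %Y",
                   limits = c(min(data$`Date and Time`), max(data$`Date and Time`)),
                   expand = c(0, 0)) +
  geom_line(color = "powderblue", linewidth = 1, alpha = 0.9, linetype = 1)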
Output :
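For the overlay mentioned in the question, one possible approach is sketched below. It assumes the pH samples sit in a separate data frame with their own time stamps; `ph_samples`, its column names, the sites and the pH values are all made up for illustration, and only the discharge rows come from the question. Each sample is placed on the discharge line by linear interpolation, and the axis is labelled only at the sampling times.

```r
library(ggplot2)
library(lubridate)

# Discharge rows copied from the question
discharge <- data.frame(
  time = ymd_hms(c("2024-03-05T00:00:01.000+10:00",
                   "2024-03-05T00:35:01.000+10:00",
                   "2024-03-05T01:00:01.000+10:00",
                   "2024-03-05T01:30:01.000+10:00",
                   "2024-03-05T02:00:01.000+10:00",
                   "2024-03-05T03:00:01.000+10:00"),
                 tz = "Etc/GMT-10"),
  discharge = c(3.202, 3.124, 3.040, 2.956, 2.919, 2.867)
)

# Hypothetical pH samples: times, sites and values are illustrative only
ph_samples <- data.frame(
  time = ymd_hms(c("2024-03-05T00:45:00.000+10:00",
                   "2024-03-05T02:15:00.000+10:00"),
                 tz = "Etc/GMT-10"),
  site = c("Site A", "Site B"),
  pH   = c(7.1, 6.8)
)

# Find the discharge at each sampling time by linear interpolation
ph_samples$discharge <- approx(
  x    = as.numeric(discharge$time),
  y    = discharge$discharge,
  xout = as.numeric(ph_samples$time)
)$y

p <- ggplot(discharge, aes(x = time, y = discharge)) +
  geom_line(color = "powderblue", linewidth = 1) +
  # Points inherit x and y from the main aes() and are coloured by site
  geom_point(data = ph_samples, aes(color = site), size = 3) +
  # Put axis labels only at the sampling times
  scale_x_datetime(breaks = ph_samples$time, date_labels = "%d %b %H:%M") +
  labs(x = "Date and Time", y = "Discharge (cumec)", color = "pH site")

p
```

The points are not highlighted automatically; it is `geom_point()` plus the `breaks` argument that makes them stand out. If the pH values themselves need to be read off the graph, a label layer or a separate panel may be easier to read than a second y axis.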