I am trying to reduce the size of my RMarkdown HTML reports and, more importantly, make them faster to open. Each report consists of a large number of R Plotly plots, each containing a large number of data points (1000+). Since R Plotly stores all of the raw data for each plot within the HTML file, I thought a good way to reduce the file size would be to round the data to fewer decimal places. However, I found that even when the input data is rounded, R Plotly still writes a large number of decimal places to the HTML file. Consequently, the file size is not reduced by rounding the data.
See below for two cases: the base case containing raw data, and the rounding case containing rounded data. The file size is the same for both cases.
Base Case HTML
library(plotly)
library(htmlwidgets)  # provides saveWidget()

RawData <- data.frame(Date = seq(as.Date("2024/1/1"), by = "month", length.out = 12),
                      PreciseValue = c(0.1516270, 0.3542629, 0.8339342, 0.5796813, 0.3933472, 0.2937137, 0.1779205, 0.4285533, 0.6841885, 0.3399411, 0.99476560, 0.42941527))
RawData$RoundValue <- round(RawData$PreciseValue,2)
fig <- plot_ly(RawData, type = 'scatter', mode = 'lines')%>%
add_trace(x = ~Date, y = ~PreciseValue, name = 'PreciseValue')
saveWidget(fig, "plotly_base.html", selfcontained = TRUE)
The html file size is 3780kb. If I open the html file and look at the underlying R Plotly data, the stored y data is:
"y":[0.15162704353000001,0.35426295622999998,0.83393426323999997,0.57968136341999998,0.39334726234,0.29371352347000002,0.17792423404999999,0.44352285533000002,0.68418423485000002,0.36623994110000002,0.99476432455999997,0.42941523452699998]
Notice that there are more decimal places than in the original data.
Rounding Values Case
RawData$RoundValue <- round(RawData$PreciseValue,2)
fig <- plot_ly(RawData, type = 'scatter', mode = 'lines')%>%
add_trace(x = ~Date, y = ~RoundValue, name = 'RoundValue')
saveWidget(fig, "plotly_round.html", selfcontained = TRUE)
The html file size for the round case is also 3780kb. The underlying data for this case is
"y":[0.14999999999999999,0.34999999999999998,0.82999999999999996,0.57999999999999996,0.39000000000000001,0.28999999999999998,0.17999999999999999,0.44,0.68000000000000005,0.37,0.98999999999999999,0.42999999999999999]
The stored y data should be something like
"y":[0.15, 0.35, 0.83, 0.58, 0.39, 0.29, 0.18, 0.44, 0.68, 0.37, 0.99, 0.43]
Does anyone know how to configure R Plotly to only store the configured number of decimal places in html output?
plotly uses htmlwidgets methods to save its data. Part of that is that the object being plotted contains a function to save its data as JSON. The function used in your example is
> attr(fig$x, "TOJSON_FUNC")
function (x, ...)
{
jsonlite::toJSON(x, digits = 50, auto_unbox = TRUE, force = TRUE,
null = "null", na = "null", time_format = "%Y-%m-%d %H:%M:%OS6",
...)
}
<bytecode: 0x12084ae50>
<environment: namespace:plotly>
You can replace that with a different function, e.g. one that looks just the same, but asks jsonlite to write at most 2 decimal digits instead of 50 (in jsonlite::toJSON, digits is the maximum number of decimal digits, not significant digits):
attr(fig$x, "TOJSON_FUNC") <- function (x, ...)
{
    jsonlite::toJSON(x, digits = 2, auto_unbox = TRUE, force = TRUE,
        null = "null", na = "null", time_format = "%Y-%m-%d %H:%M:%OS6",
        ...)
}
saveWidget(fig, "plotly_base.html", selfcontained = TRUE)
When I do that, it saves the data with just 2 decimal places.
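To see the effect of the digits argument in isolation, here is a minimal sketch that serializes a single rounded value with jsonlite directly, outside of plotly. The variable names (rounded, json_long, json_short) are just illustrative. It also suggests why round() alone does not help: the rounded value is still a double, and a large digits setting writes out its binary approximation.

```r
library(jsonlite)

# Rounding in R still yields a double; 0.15 has no exact binary representation.
rounded <- round(0.1516270, 2)

# Print 17 decimal places to reveal the stored value behind the rounded number.
print(sprintf("%.17f", rounded))

# Serialize the same value with a high and a low `digits` setting.
json_long  <- as.character(toJSON(rounded, digits = 50))
json_short <- as.character(toJSON(rounded, digits = 2))

print(json_long)
print(json_short)
```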
Reducing the digits doesn't make a huge difference to the file size here, since most of the file is the plotly JavaScript code, but on a larger dataset (maybe your real one) it should help a bit.
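Since your report has many plots, it may be convenient to wrap the replacement in a small function. The sketch below assumes the same TOJSON_FUNC attribute shown above; set_json_digits is a hypothetical helper name, not something provided by plotly or htmlwidgets. The stand-in list at the end only imitates the shape of a widget so the effect can be checked without drawing a plot.

```r
library(jsonlite)

# Hypothetical helper (not part of plotly or htmlwidgets): returns the widget
# with a serializer that writes at most `digits` decimal places.
set_json_digits <- function(widget, digits = 2) {
  force(digits)  # evaluate now so the closure below captures the value
  attr(widget$x, "TOJSON_FUNC") <- function(x, ...) {
    jsonlite::toJSON(x, digits = digits, auto_unbox = TRUE, force = TRUE,
                     null = "null", na = "null",
                     time_format = "%Y-%m-%d %H:%M:%OS6", ...)
  }
  widget
}

# Minimal stand-in for a widget, just to show the effect without plotting.
demo <- set_json_digits(list(x = list(y = c(0.1516270, 0.3542629))), digits = 2)
demo_json <- as.character(attr(demo$x, "TOJSON_FUNC")(demo$x))
print(demo_json)
```

With a real plot, the call would be fig <- set_json_digits(fig, 2) before saveWidget(). Keep in mind that the serializer handles the whole widget payload, so the digit limit applies to every numeric value in it, not only to the y data.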