I have a data frame as below with 3 columns, each representing the proportion of time spent in a single activity.
df <- data.frame(ID = c(1, 2, 3, 4),
                 time_1 = c(0.2500, 0.2501, 0.2499, 0.2500),
                 time_2 = c(0.5000, 0.5000, 0.5001, 0.5001),
                 time_3 = c(0.2501, 0.2499, 0.5001, 0.2498),
                 sum_time = c(1.0001, 1.0000, 1.0001, 0.9999))
ID time_1 time_2 time_3 sum_time
1 0.2500 0.5000 0.2501 1.0001
2 0.2501 0.5000 0.2499 1.0000
3 0.2499 0.5001 0.5001 1.0001
4 0.2500 0.5001 0.2498 0.9999
I intend to extract the compositional means of this data, but cannot do so unless every value of sum_time equals exactly 1.
I have attempted to round to fewer decimal places using round(df$time_1, digits = 3), however this returns values of 0.999 and 1.001 in the instances that do not already equal 1.
I have also attempted to write a function so that, if the sum is either 1.0001 or 0.9999, I subtract or add 0.0001 to one of the variables, as the time difference in minutes is insignificant. However, I cannot get this function to work.
scale_compositions <- function(x){
if(df$sum_time== 1.0001) {df$time_1 - 0.0001}
if(df$sum_time == 0.9999) {df$time_1 + 0.0001}
}
scale_compositions(x)
Ideally I would be able to rescale the rows that sum to 1.0001 or 0.9999 so that each of the time_ columns is increased or reduced by an appropriate amount and the proportions remain as accurate as possible, but I have been unable to figure this out so far. I have been experimenting with the rescale functions in various R packages, so far to no avail.
Given the insignificance of the 0.0001 to the overall time being investigated, it is unlikely that removing or adding that value to ensure every row sums to 1 will meaningfully impact results (although this will be tested), and I am happy to do that for the time being.
Any assistance would be greatly appreciated.
I hope I haven't misunderstood your question, but could this work?
df <- data.frame(
  ID = c(1, 2, 3, 4),
  time_1 = c(0.2500, 0.2501, 0.2499, 0.2500),
  time_2 = c(0.5000, 0.5000, 0.5001, 0.5001),
  time_3 = c(0.2501, 0.2499, 0.2500, 0.2498)
)
# Compute the row totals from the three time columns, rounded to 3 decimal places.
# Note that this only changes sum_time; the time_ columns themselves are left as they are.
df$sum_time <- round(rowSums(df[, c("time_1", "time_2", "time_3")]), 3)
df
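If you need the three components themselves to sum to 1, rather than only the stored total, one option is to divide each row by its own sum (often called closure in compositional data analysis). Below is a minimal sketch, assuming the same column names and example values as above; time_cols and row_totals are just names I picked for illustration. It also checks the result with a tolerance instead of ==, since exact equality on floating-point sums can fail even when the numbers print as 1.

```r
# Example data copied from the answer above.
df <- data.frame(
  ID = c(1, 2, 3, 4),
  time_1 = c(0.2500, 0.2501, 0.2499, 0.2500),
  time_2 = c(0.5000, 0.5000, 0.5001, 0.5001),
  time_3 = c(0.2501, 0.2499, 0.2500, 0.2498)
)

# Hypothetical helper name: the columns that make up each composition.
time_cols <- c("time_1", "time_2", "time_3")

# Row totals before rescaling.
row_totals <- rowSums(df[, time_cols])

# Divide every component by its row total. A vector of length nrow(df)
# is recycled down each column, so element [i, j] is divided by row_totals[i].
df[, time_cols] <- df[, time_cols] / row_totals

# Recompute the totals from the rescaled components.
df$sum_time <- rowSums(df[, time_cols])

# Compare against 1 with a small tolerance rather than using ==.
all(abs(df$sum_time - 1) < 1e-9)
```

Because every component in a row is scaled by the same factor, the ratios between time_1, time_2 and time_3 are preserved, which is usually what you want before computing compositional means.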