How can I loop through the columns of a data frame and cap the values at the 97.5th percentile of each column?
E.g., if a particular column contains the values 1 to 100, the values greater than 97.5 (i.e. 98, 99 and 100) should all be set to 97.5.
Note that I want to do this for column 4 through the last column of the data frame.
You can do this in one line in base R:
# set up the data
df <- data.frame(a = sample(100, replace = TRUE),
                 b = sample(100, replace = TRUE),
                 c = sample(100, replace = TRUE))
# cap every column at its own 97.5th percentile
df2 <- as.data.frame(lapply(df, function(x) pmin(x, quantile(x, 0.975))))
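As a quick check against the example in the question, here is a small sketch on the values 1 to 100. One thing to be aware of: with R's default quantile method (type 7, which interpolates between neighbouring values), the cap for this vector comes out as 97.525 rather than exactly 97.5. The variable names `x`, `cap` and `capped` are just for illustration.

```r
x <- 1:100

# 97.5th percentile using quantile()'s default (type 7) interpolation
cap <- quantile(x, 0.975)

# pmin() takes the element-wise minimum, so anything above the cap is replaced by it
capped <- pmin(x, cap)

# the last four values: 97 is untouched, while 98, 99 and 100 are set to the cap
tail(capped, 4)
```

If you need a different percentile definition, `quantile()` has a `type` argument that selects among its nine documented methods.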
To modify only columns 4 to 10 (for example) of your data frame, you could do:
data[,4:10] <- as.data.frame(lapply(data[,4:10], function(x) pmin(x, quantile(x, 0.975))))
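Since the question asks for column 4 through the last column, the same idea can be sketched with `ncol()` instead of a hard-coded upper bound. This is a minimal example under a few assumptions: `data` here is made-up sample data with six numeric columns, the data frame has at least four columns (otherwise `4:ncol(data)` counts downwards), and there are no missing values (with `NA`s present, `quantile()` needs `na.rm = TRUE`).

```r
# hypothetical example data: 10 rows, 6 numeric columns (V1 to V6)
set.seed(1)
data <- as.data.frame(matrix(sample(100, 60, replace = TRUE), ncol = 6))
original <- data  # copy kept only so the result can be compared afterwards

# column 4 through the last column
cols <- 4:ncol(data)

# cap each selected column at its own 97.5th percentile;
# assigning a list back into data[cols] replaces those columns in place
data[cols] <- lapply(data[cols], function(x) pmin(x, quantile(x, 0.975)))
```

Columns 1 to 3 are left as they were, and assigning the list returned by `lapply()` directly into `data[cols]` avoids the extra `as.data.frame()` call.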