
How to use Chebyshev's inequality in R


I have a statistics problem in R and was hoping to use Chebyshev's inequality, but I don't know how to implement it.

Example: imagine a dataset with a non-normal distribution. I need to use Chebyshev's inequality to assign NA to any data point that falls in the lower tail of that distribution, say the lowest 5%. The distribution is one-tailed, with an absolute zero as its lower bound.

I am unfamiliar with how to go about this, as well as with what sort of example might help.

If it helps, the problem comes from a large number of datasets with many different types of distribution, all non-normal. I need to select a certain lower percentage of each distribution and assign NA to those values so that they are excluded from the rest of the analysis. I would appreciate any help!

Thanks!


Solution

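  • From the description "I need to be able to select a certain lower percentage of that distribution and assign NA values to them to discount them from the rest of the analysis," it sounds pretty simple, and you don't need Chebyshev's inequality at all. The inequality only bounds how much probability can lie far from the mean; to drop an exact lower percentage, take the empirical quantile of the data and mask everything below it: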

    x <- runif(1000)                     # simulate some data
    cutpt <- quantile(x, probs = 0.05)   # empirical 5th percentile
    x[x < cutpt] <- NA                   # set everything below the cutoff to NA
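
Since the question mentions many datasets with different distributions, the same idea can be wrapped in a small function and applied to each one. This is only a sketch: `mask_lower_tail`, the `datasets` list, and the simulated distributions are made-up names and data for illustration, and it assumes each dataset is a plain numeric vector.

```r
# Hypothetical helper: set the lowest fraction `p` of a numeric vector to NA.
mask_lower_tail <- function(x, p = 0.05) {
  # na.rm = TRUE so that NAs already in the data do not make quantile() fail
  cutpt <- quantile(x, probs = p, na.rm = TRUE)
  # which() ignores NA comparisons, so existing NAs are simply left as they are
  x[which(x < cutpt)] <- NA
  x
}

# Simulated stand-ins for real datasets (all non-normal, bounded below at zero)
set.seed(1)
datasets <- list(
  uniform     = runif(1000),
  exponential = rexp(1000, rate = 2),
  lognormal   = rlnorm(1000)
)

# Apply the same cutoff rule to every dataset
masked <- lapply(datasets, mask_lower_tail, p = 0.05)

# Fraction of each dataset that is now NA
sapply(masked, function(x) mean(is.na(x)))
```

Because the cutoff is computed from each dataset's own empirical quantile, no assumption about the shape of the distribution is needed. With heavily tied data (for example, many exact zeros), the fraction masked can differ from `p`, since only values strictly below the cutoff are removed.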