r, kernel-density

In R, how do I map a boundary-corrected KDE generated using evmix::dbckden() to the values of the input data or the uncorrected KDE?


Introduction

I am trying to generate a boundary-corrected kernel density estimate (KDE) for a set of values that contains many zeroes but cannot go below zero (percent cover of the land surface by trees; a negative percent cover is not possible). I have successfully used evmix::dbckden() to generate what appears to be an appropriate series of Y values, but the X values associated with them are unrelated to the X values of the initial data set.

Non-reproducible example, using my own data:

> min(land_coverage);max(land_coverage) # raw input data ranges from 0 to 71.78746
[1] 0
[1] 71.78746
> d.land_coverage <- density(land_coverage)
> plot(d.land_coverage, main = "KDE of values in numeric \nvector 'land_coverage'")


> xp <- seq(0, 100, length.out = 512) # I think the problem may lie in the choice of values for this object
> d.land_coverage_bounded <- evmix::dbckden(xp, land_coverage, lambda = 1, bcmethod = "simple")
> plot(d.land_coverage_bounded) # Y values are transformed to be greater than or equal to 0


> str(d.land_coverage) # includes X and Y values
List of 7
 $ x        : num [1:512] -7.73 -7.56 -7.39 -7.22 -7.05 ...
 $ y        : num [1:512] 9.82e-05 1.22e-04 1.49e-04 1.83e-04 2.22e-04 ...
 $ bw       : num 2.58
 $ n        : int 636
 $ call     : language density.default(x = land_coverage)
 $ data.name: chr "land_coverage"
 $ has.na   : logi FALSE
 - attr(*, "class")= chr "density"
> str(d.land_coverage_bounded) # Numeric vector of Y values only, X have been lost
 num [1:512] 0.083 0.0733 0.0668 0.0626 0.0601 ...

Reproducible example, with dummy data:

> data_dummy <- rnorm(1000, 5, 2) 
> data_dummy[data_dummy < 0] <- 0 # ensure no values are < 0
> min(data_dummy);max(data_dummy) # print minimum and maximum values
[1] 0
[1] 10.72429
> plot(density(data_dummy)) # plot KDE, with some density below 0


> xp <- seq(0, 10, 0.01)
> data_dummy_corrected <- evmix::dbckden(xp, data_dummy, bw = 1, bcmethod = "simple")
> plot(data_dummy_corrected)


Question: How do I keep the density values mapped to the initial X values when using evmix::dbckden() to generate a boundary-corrected KDE?


Solution

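  • You just need to estimate both densities on the same grid; the two density vectors are then 'mapped' to the same x values. dbckden() returns only the y values, evaluated at the points you pass as its first argument, so those points (xp) are the x values.

    Setting the length of the grid vector to a power of 2 is convenient, because density() rounds n up to a power of 2 internally when n > 512 and then interpolates back onto the requested grid.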

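    library(evmix)  # provides dbckden()

    # Common evaluation grid; it starts below 0 to show the uncorrected KDE leaking past the boundary
    xp <- seq(from = -2, to = ceiling(max(data_dummy)), length.out = 1024)
    # Uncorrected KDE evaluated on exactly the same grid
    d <- density(data_dummy, from = xp[1], to = xp[length(xp)], n = length(xp), bw = 1)$y
    # Boundary-corrected KDE evaluated at the same x values
    d_cor <- dbckden(xp, data_dummy, bcmethod = "simple", bw = 1)
    results <- data.frame(x = xp, Density = d, corrected = d_cor)
    plot(Density ~ x, results, type = "l", col = "#FECD6C", ylim = c(0, 0.2))
    lines(corrected ~ x, results, col = "#6c9dfe", lty = 2)
    legend("topleft", c("KDE", "corrected KDE"), col = c("#FECD6C", "#6c9dfe"),
           lty = c(1, 2))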
    

    Result
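
If the goal is a density value for each original observation rather than for each grid point, one option is to interpolate the grid-based estimate at the data values. The following is only a sketch under stated assumptions: the names kde_grid, same_grid and dens_at_data are made up for illustration, the grid here starts at 0 rather than -2, and the fallback to the uncorrected density() values exists purely so the code still runs where evmix is not installed.

```r
set.seed(1)
data_dummy <- pmax(rnorm(1000, 5, 2), 0)  # dummy data, floored at 0

# Evaluation grid covering the full range of the data
xp <- seq(0, ceiling(max(data_dummy)), length.out = 1024)

# Uncorrected KDE; with matching from/to/n, density() returns this same grid
d <- density(data_dummy, bw = 1, from = min(xp), to = max(xp), n = length(xp))
same_grid <- isTRUE(all.equal(d$x, xp))

# Boundary-corrected KDE on the grid. ASSUMPTION: if evmix is not installed,
# the uncorrected values stand in as a placeholder so the sketch still runs.
y_cor <- if (requireNamespace("evmix", quietly = TRUE)) {
  evmix::dbckden(xp, data_dummy, bw = 1, bcmethod = "simple")
} else {
  d$y
}

# Keep x and y together, mirroring the x/y pairing of a density() object
kde_grid <- data.frame(x = xp, uncorrected = d$y, corrected = y_cor)

# Linear interpolation of the corrected estimate at each original observation
dens_at_data <- approx(kde_grid$x, kde_grid$corrected, xout = data_dummy)$y
```

Because the first argument of dbckden() is simply the set of points at which the density is evaluated, passing data_dummy itself in place of xp should give the values at the observations directly; interpolating from a grid mainly avoids re-evaluating the estimator when the data set is large.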