rlinear-regressionrasterr-raster

Linear regression between two raster images in R


I need a linear regression for calculating an empirical parameter. L1 is a raster image, format .tif. L2 is a raster image as well, calculated beforehand. Both images have the same number of columns and rows.

The formula is: L1 = a + b * L2 which translates in R as:

lm(L1 ~ L2)

In a second formula I later need a nd b.

I am now facing the problem, that both raster contain NA values and I not sure how to build the function for the linear regression. I am not that familiar with R so I am stuck at this problem which might be rather simple. I think I have to use calc, but not sure how.

Edit: So far I have this code:

s = stack(L1,L2)
fun = function(x) {if (is.na(x[1])) { NA } else {lm(x[1] ~ x[2])$coefficients[2]}}

However, it takes a really long time calculating and does not come to an result


Solution

  • You would use terra::app (raster::calc) if you wanted to do local regression, that is a separate regression for each grid cell (pixel) -- or use a specialized method for that: terra::regress. But that makes no sense in this case as you only have two rasters; and thus only one data point per grid cell.

    In your case, you appear to want a global regression. You can that like this:

    Example data

    library(terra)
    L1 <- rast(ncol=10, nrow=10, vals=runif(100), names="L1")
    L2 <- rast(ncol=10, nrow=10, vals=runif(100), names="L2")
    s <- c(L1, L2)
    

    Make a data frame and fit the model

    v <- data.frame(na.omit(values(s)))
    # this step may not be necessary
    names(v) <- c('L1', 'L2')
    m <- lm(L2 ~ L1, data=v)
    m
    

    if s is too large for that, you can do something like

    v <- spatSample(s, 100000, "regular")  
    v <- data.frame(na.omit(v))
    # etc.
    

    Now you can predict to another (or the same) raster

    p <- predict(s, m)
    residuals <- s$L2 - p