r geospatial geostatistics covariogram geor

R: number of pairs for each lag in a variogram


I am using the geoR package for spatial interpolation of rainfall. I should say that I am quite new to geostatistics. Thanks to some video tutorials on YouTube, I understood (well, I think so) the theory behind the variogram. As I understand it, the number of pairs should decrease with increasing lag distance. For example, if we consider a 100 m stretch (say, a 100 m cross-section of a river bed), the number of pairs for a 5 m lag is 20, the number of pairs for a 10 m lag is 10, and so on. But I am confused by the output of the variog function in the geoR package. An example is given below.

mydata
          X      Y        a
[1,] 415720 432795 2.551415
[2,] 415513 432834 2.553177
[3,] 415325 432740 2.824652
[4,] 415356 432847 2.751844
[5,] 415374 432858 2.194091
[6,] 415426 432774 2.598897
[7,] 415395 432811 2.699066
[8,] 415626 432762 2.916368

This is my dataset, where a is my variable (rainfall intensity) and X, Y are the coordinates of the points. The variogram calculation is shown below:

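# mydata is the matrix printed above: columns 1-2 are coordinates, column 3 is the variable
geodata <- as.geodata(mydata, coords.col = 1:2, data.col = 3)
variogram <- variog(geodata, coords = geodata$coords, data = geodata$data)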
variogram[1:3]
$u
[1]  46.01662 107.37212 138.04987 199.40537 291.43861 352.79411

$v
[1] 0.044636453 0.025991469 0.109742986 0.029081575 0.006289056 0.041963076

$n
[1] 3 8 3 3 3 2

where

u: a vector with distances.

v: a vector with estimated variogram values at distances given in u.

n: number of pairs in each bin
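The counts in n can be cross-checked without geoR by computing all pairwise distances and binning them by hand. The following is a minimal base R sketch, assuming the coordinates from the dataset above; the 50 m bin width is an arbitrary choice for illustration and does not reproduce the default bins used by variog, so the counts will not match $n exactly.

```r
# Coordinates copied from the dataset shown above (X and Y columns only).
coords <- cbind(
  X = c(415720, 415513, 415325, 415356, 415374, 415426, 415395, 415626),
  Y = c(432795, 432834, 432740, 432847, 432858, 432774, 432811, 432762)
)

# All pairwise Euclidean distances: 8 points give choose(8, 2) = 28 pairs.
d <- as.numeric(dist(coords))

# Illustrative equal-width bins of 50 m (an assumption, not geoR's default).
breaks <- seq(0, 400, by = 50)
pairs_per_bin <- table(cut(d, breaks = breaks, include.lowest = TRUE))

print(length(d))       # total number of point pairs
print(pairs_per_bin)   # number of pairs falling in each distance bin
```

Because the points are scattered irregularly in two dimensions, nothing forces these counts to decrease steadily with distance.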

According to this, the number of pairs (n) follows no clear pattern, whereas the corresponding lag distance (u) increases. I find this hard to understand. Can anyone explain what is happening? Also, any suggestions or advice to improve the variogram calculation for this application (spatial interpolation of rainfall intensity) would be highly appreciated, as I am new to geostatistics. Thanks in advance.


Solution

  • On a linear transect of 100 m with observations regularly spaced 5 m apart, if you have 20 pairs at a 5 m lag, you have 19 pairs at a 10 m lag, not 10: each additional 5 m of lag removes only one pair. Even this steady decrease does not hold for your data, because your points are irregularly distributed and spread over two dimensions. For irregularly distributed data, you often have very few point pairs at the very short distances. With only 8 observations there are just 28 point pairs in total, so the count in each bin is small and erratic. The advice for obtaining a better-looking variogram is to work with a larger dataset: geostatistics starts getting interesting with 30 observations, and fun with over 100 observations.
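The transect count can be verified with a short base R sketch. It assumes a hypothetical transect of 21 points at 0, 5, ..., 100 m, which is what gives 20 pairs at the 5 m lag.

```r
# Hypothetical regular transect: 21 points at 0, 5, ..., 100 m.
x <- seq(0, 100, by = 5)

# All pairwise separations along the transect.
d <- as.numeric(dist(x))

# Count how many pairs share each separation distance.
pairs_per_lag <- table(d)

# 20 pairs at 5 m, 19 at 10 m, ..., down to 1 pair at 100 m.
print(pairs_per_lag)
```

The count drops by one pair per lag step, which is the regular pattern that irregular two-dimensional data do not follow.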