rgeographic-distance

R distance matrix with non-zero values on diagonal (rdist.earth )


The entries on the diagonal of a spatial distance matrix should be zero, bc they represent the distance between each location and itself. But the rdist.earth() function from the fields R package sometimes give me non-zeros on the diagonal:

> # Set number of decimals of output display
> options(digits=8)
> # Some longitude, latitude data
> LLdat
     lon      lat
 1 -105.85878 43.65797
 2 -105.81812 43.57009
 3 -105.80796 43.57748
 > 
 > # Create distance matrix
 > library(fields)
 > distmat <- rdist.earth(LLdat,LLdat)
 > distmat
      1              2          3
1 0.0000000 6.410948951394 6.12184338
2 6.4109490 0.000059058368 0.72150586
3 6.1218434 0.721505863563 0.00000000

In the above distance matrix, the second entry on the diagonal is 0.000059058368, in miles (the default unit), whereas the other two are 0.0000000. First, why do the entries of the second column show more digits than the other two? And why is the entry on the second diagonal not zero to 8 decimals like the others? The discrepancy doesn't seem small enough to attribute to floating point rounding error.

Now compare the output of rdist.earth() to that of a different package, geosphere, and function distGeo(), which computes the distance between two points (not a full distance matrix). Here, we compute the distance between each point and itself. The output vector units are in meters:

> library(geosphere)
> distmat2 <- distGeo(LLdat,LLdat)
> distmat2
[1] 0 0 0

So with distGeo(), all three distances measures are in agreement and appropriately zero.

Is there something I'm missing? Or does this indicate a problem with rdist.earth()?


Solution

  • Unfortunately, it is a rounding error.

    If you look at the source code, you can replicate the issue:

    x1 <- LLdat
    
    R <- 3963.34
    coslat1 <- cos((x1[, 2] * pi)/180)
    sinlat1 <- sin((x1[, 2] * pi)/180)
    coslon1 <- cos((x1[, 1] * pi)/180)
    sinlon1 <- sin((x1[, 1] * pi)/180)
    
    pp <- cbind(coslat1 * coslon1, coslat1 * sinlon1, sinlat1) %*% 
        t(cbind(coslat1 * coslon1, coslat1 * sinlon1, sinlat1))
    return_val = (R * acos(ifelse(abs(pp) > 1, 1 * sign(pp), pp)))
    

    The function first calcs an intermediate matrix pp:

    print (pp)
    
                 [,1]         [,2]         [,3]
    [1,] 1.0000000000 0.9999986917 0.9999988071
    [2,] 0.9999986917 1.0000000000 0.9999999834
    [3,] 0.9999988071 0.9999999834 1.0000000000
    

    It seems like the diagonal is all the same. However:

    print(pp, digits=22)
                             [,1]                     [,2]                     [,3]
    [1,] 1.0000000000000000000000 0.9999986917465573110775 0.9999988070789928018556
    [2,] 0.9999986917465573110775 0.9999999999999998889777 0.9999999834298258782894
    [3,] 0.9999988070789928018556 0.9999999834298258782894 1.0000000000000000000000
    
    
    > acos(0.9999999999999998889777) * R
    [1] 5.905836821e-05
    > acos(1.0000000000000000000000) * R
    [1] 0