Tags: r, spatial, spatstat, qupath

Analysing exported .geojson annotations from QuPath with spatstat in R


I am new to coding, with a background in life sciences/biology.

I am aiming to analyse the co-localisation of leukaemia cells with blood vessels and adipocytes.

I have exported the cell locations as points with marks representing either CKIT positivity or DAPI-only positivity.

I have exported the cells from QuPath as point patterns, and the annotations marking blood vessels (endomucin-stained structures) and adipocytes (perilipin-stained structures) as .geojson files.

.geojson file of exported Qupath annotations

The window should be a square box from which I have sampled the tissue.

I have attached an example image showing how I obtained the points and annotations from QuPath. The endomucin- and perilipin-stained annotations are in white and green respectively, identified by a QuPath pixel classifier that I trained. The points are the cells.

Qupath cell points and annotations

I would like the window to be the square outer box, the points to form a multitype point pattern, and the perilipin and endomucin annotations to act as spatial covariates for the point pattern analysis. However, how do I best model these annotations? After importing the .geojson into R and converting it to sf format, st_is_valid reports that some of the polygon edges cross.

Should I convert this to a pixel image or binary mask? Can they remain as polygons and be converted to owin objects? (such as the Murchison gold data in Baddeley's wonderful spatial point pattern analysis book). I have not had much luck converting these polygons to owin objects.

The other key part of this is the annotations need to retain their class as either perilipin or endomucin stained... Perhaps this would be easier as pixel images with factor levels?

I have attempted to import the .geojson files and convert them to spatstat based windows with the following code:

library(sf)
library(spatstat)
library(spatstat.geom)

setwd("~/RStudio/ImageAnalysis")

pt_geojson <- "Veh Tibia - Proximal Tibia.geojson"
annotations_sf <- st_read(pt_geojson)

# Check resulting sf object

head(annotations_sf)

annotations_transform <- st_transform(annotations_sf, crs = 26717)

annotations_owin <- as.owin(annotations_transform)

plot(annotations_owin)

Unfortunately this just generates the following error:

Error in if (!all(st_dimension(W) == 2)) stop("as.owin.sfc needs polygonal geometries") : 
  missing value where TRUE/FALSE needed

I added the following code:

# Check for invalid geometries
invalid_geometries <- st_is_valid(annotations_sf, reason = TRUE)
print(invalid_geometries)

# Fix invalid geometries
annotations_sf <- st_make_valid(annotations_sf)

# Verify geometries are now valid
valid_geometries <- st_is_valid(annotations_sf, reason = TRUE)
print(valid_geometries)

This resulted in the following:

> # Check for invalid geometries
> invalid_geometries <- st_is_valid(annotations_sf, reason = TRUE)
Warning messages:
1: In st_is_longlat(x) :
  bounding box has potentially an invalid value range for longlat data
2: In st_is_longlat(x) :
  bounding box has potentially an invalid value range for longlat data
> print(invalid_geometries)
 [1] "Valid Geometry"          "Edge 2 crosses edge 4"  
 [3] "Edge 16 crosses edge 20" "Valid Geometry"         
 [5] "Valid Geometry"          "Edge 14 crosses edge 20"
 [7] "Valid Geometry"          "Valid Geometry"         
 [9] "Edge 0 crosses edge 8"   "Edge 16 crosses edge 18"
[11] "Edge 6 crosses edge 8"   "Valid Geometry"         
[13] "Valid Geometry"          "Valid Geometry"         
[15] "Edge 0 crosses edge 26"  "Edge 4 crosses edge 10" 
[17] "Valid Geometry"          "Valid Geometry"         
[19] "Edge 12 crosses edge 18" "Valid Geometry"         
[21] "Edge 0 crosses edge 10"  "Valid Geometry"         
[23] "Valid Geometry"          "Valid Geometry"         
[25] "Edge 10 crosses edge 12" "Valid Geometry"         
[27] "Valid Geometry"          "Valid Geometry"         
[29] "Valid Geometry"          "Edge 4 crosses edge 24" 
> 
> # Fix invalid geometries
> annotations_sf <- st_make_valid(annotations_sf)
Warning message:
In st_is_longlat(x) :
  bounding box has potentially an invalid value range for longlat data
> 
> # Verify geometries are now valid
> valid_geometries <- st_is_valid(annotations_sf, reason = TRUE)
> print(valid_geometries)
 [1] "Valid Geometry"          "Edge 2 crosses edge 4"  
 [3] "Edge 16 crosses edge 20" "Valid Geometry"         
 [5] "Valid Geometry"          "Edge 14 crosses edge 20"
 [7] "Valid Geometry"          "Valid Geometry"         
 [9] "Edge 0 crosses edge 8"   "Edge 16 crosses edge 18"
[11] "Edge 6 crosses edge 8"   "Valid Geometry"         
[13] "Valid Geometry"          "Valid Geometry"         
[15] "Edge 0 crosses edge 26"  "Edge 4 crosses edge 10" 
[17] "Valid Geometry"          "Valid Geometry"         
[19] "Edge 12 crosses edge 18" "Valid Geometry"         
[21] "Edge 0 crosses edge 10"  "Valid Geometry"         
[23] "Valid Geometry"          "Valid Geometry"         
[25] "Edge 10 crosses edge 12" "Valid Geometry"         
[27] "Valid Geometry"          "Valid Geometry"         
[29] "Valid Geometry"          "Edge 4 crosses edge 24"

And rerunning the plot afterwards generated this horribly merged plot of all the annotations.

Failed polygonal window plot

Any help would be very kindly appreciated! Thank you :))

Edit

I think I have found a solution, but I am unsure whether it will cause issues later on. I have come to realise that exporting the annotations as separate merged polygons saves a lot of trouble and seems to work. I have also extracted the bounding box from the original .geojson export, but in later attempts I will simplify this by exporting the bounding box from QuPath directly.

#Load required packages

library(sf)
library(spatstat)
library(spatstat.geom)

#Set working directory (CHANGE AS REQUIRED)

setwd("~/RStudio/ImageAnalysis")

#Import .geojson files, generated by Qupath

pt_geojson <- "Veh Tibia - Proximal Tibia.geojson"
pt_peri <- "pt_peri.geojson"
pt_endo <- "pt_endo.geojson"

#Generate sf data

ann_sf <- st_read(pt_geojson)
ann_peri <- st_read(pt_peri)
ann_endo <- st_read(pt_endo)

#Generate outer bounding box

bbox <- st_bbox(ann_sf)

W <- owin(xrange = c(bbox[["xmin"]], bbox[["xmax"]]),
          yrange = c(bbox[["ymin"]], bbox[["ymax"]]))

# Generate adipocyte and vascular annotations

peri <- ann_peri$geometry[[1]]
peri <- as.owin(peri)

endo <- ann_endo$geometry[[1]]
endo <- as.owin(endo)

# pt_ppp is the multitype point pattern of cells (a ppp object), created separately
com <- solist(W, endo, peri, pt_ppp)

plot(W, main = "Adipocytes + Vasculature")
plot(peri, col = "green", add = TRUE)
plot(endo, col = "red", add = TRUE)
plot(pt_ppp, cols = c("yellow3","skyblue1"), chars = c(15,16), 
     use.marks = TRUE, legend = TRUE, cex = 0.5, 
     main = "Proximal Tibia Cell Point Pattern", add = TRUE)
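
The pt_ppp object is not constructed in the code above. As a minimal sketch of how such a multitype pattern could be built, assuming the cell centroids are available as a data frame (the column names x, y and class, the coordinates, and the 100 x 100 window are all hypothetical stand-ins, not values from my data):

```r
library(spatstat.geom)

# Hypothetical cell table: in practice this would come from the QuPath
# detection export; the column names x, y and class are assumptions.
cells <- data.frame(
  x     = c(12.5, 40.0, 73.2, 88.1),
  y     = c(20.0, 65.5, 30.3, 91.7),
  class = c("CKIT", "DAPI", "DAPI", "CKIT")
)

# Stand-in for the square sampling window (W in the code above).
W <- owin(xrange = c(0, 100), yrange = c(0, 100))

# A factor-valued mark vector makes the pattern multitype.
pt_ppp <- ppp(cells$x, cells$y, window = W, marks = factor(cells$class))

is.multitype(pt_ppp)   # checks that the marks are a factor
table(marks(pt_ppp))   # number of cells of each type
```

Points falling outside the window are rejected by ppp() with a warning, so the window and the cell coordinates need to be in the same coordinate system.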

Solution

  • The first problem is that the coordinates in the GeoJSON file are not WGS84 longitude/latitude, although st_read() implicitly assumes they are (hence the st_is_longlat() warnings). sf therefore treats the QuPath image coordinates as positions on a sphere, and st_transform() then reprojects them to a planar coordinate system, which distorts all of the geometry. To avoid the wrong WGS84 assumption, set crs = NA in st_read(); the coordinates are then treated as planar and can be passed to spatstat directly. The following gives you a working example.

    library(sf)
    #> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1; sf_use_s2() is TRUE
    
    library(spatstat)
    #> Loading required package: spatstat.data
    #> Loading required package: spatstat.univar
    #> spatstat.univar 2.0-3.003
    #> Loading required package: spatstat.geom
    #> spatstat.geom 3.2-9.017
    #> Loading required package: spatstat.random
    #> spatstat.random 3.2-3.002
    #> Loading required package: spatstat.explore
    #> Loading required package: nlme
    #> spatstat.explore 3.2-7.005
    #> Loading required package: spatstat.model
    #> Loading required package: rpart
    #> spatstat.model 3.2-11.004
    #> Loading required package: spatstat.linnet
    #> spatstat.linnet 3.1-5.003
    #> 
    #> spatstat 3.0-8.004 
    #> For an introduction to spatstat, type 'beginner'
    
    pt_geojson <- "Veh Tibia - Proximal Tibia.geojson"
    annotations_sf <- st_read(pt_geojson, crs = NA)
    #> Reading layer `Veh Tibia - Proximal Tibia' from data source 
    #>   `/home/rubak/OneDrive/spatstat/experiments/QuPath/Veh Tibia - Proximal Tibia.geojson' 
    #>   using driver `GeoJSON'
    #> Simple feature collection with 30 features and 6 fields
    #> Geometry type: POLYGON
    #> Dimension:     XY
    #> Bounding box:  xmin: 31884 ymin: 3236 xmax: 32958 ymax: 4310
    #> CRS:           NA
    
    
    geo <- st_geometry(annotations_sf)
    windows <- lapply(geo, as.owin)
    df <- st_drop_geometry(annotations_sf)
    te <- tess(tiles=windows)
    marks(te) <- df
    
    te
    #> Tessellation
    #> Tiles are windows of general type
    #> 30 tiles (irregular windows)
    #> Tessellation has 6 columns of marks: 'id', 'objectType', 'classification', 
    #> 'isLocked', 'measurements' and 'name'
    #> window: rectangle = [31884, 32958] x [3236, 4310] units
    
    
    plot(te, do.col = FALSE, do.labels = TRUE, main = "")
    

    
    plot(tiles(te), main = "")
    

    Notice that the tiles are on different physical scales in the plot above. Tile number 12 doesn’t have a classification and it corresponds to the entire window of observation, so it would probably make more sense to leave it out.

    plot(tiles(te)[[12]], main = "")
    

    If we omit it we get the following:

    te2 <- tess(tiles=windows[-12])
    marks(te2) <- df[-12,]
    plot(te2, values = marks(te2)$classification, main = "")
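
Dropping the tile by its position (12) only works for this particular file. A minimal sketch of selecting tiles by their classification instead, and of merging the tiles of each class into a single window so the class label is retained. The small `windows` list and `df` data frame below are hypothetical stand-ins for the objects created above, and it is an assumption that the unclassified annotation shows up as NA in the classification column:

```r
library(spatstat.geom)

# Hypothetical stand-ins for `windows` and `df` from the code above.
windows <- list(
  owin(c(0, 1), c(0, 1)),   # endomucin region
  owin(c(2, 3), c(0, 1)),   # endomucin region
  owin(c(4, 5), c(0, 1)),   # perilipin region
  owin(c(0, 5), c(0, 1))    # whole sampling box, no classification
)
df <- data.frame(
  classification = c("Endomucin", "Endomucin", "Perilipin", NA)
)

# Keep only the annotations that carry a classification.
keep <- !is.na(df$classification)
te2  <- tess(tiles = windows[keep])
marks(te2) <- df[keep, , drop = FALSE]

# One merged window per class, so the class label is retained.
by_class <- lapply(
  split(windows[keep], df$classification[keep]),
  function(w) do.call(union.owin, w)
)
names(by_class)
```

Each element of by_class is an owin object, so it can be plotted or used in the same way as the merged polygons exported directly from QuPath.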
    

    Note: I did not spend time cleaning the data, so the classification labels are left as they were in the original file. Also, since the original annotations come from a pixel classifier, I think it would make sense to work with binary masks instead of polygons.
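
As a rough sketch of the pixel-based route, the polygons can be rasterised onto a common grid, either as a TRUE/FALSE image per structure or as one factor-valued image that keeps the class labels, and a distance-to-structure covariate can be derived as well. The window and the two rectangles below are hypothetical stand-ins for W, endo and peri, and the 128 x 128 grid is an arbitrary choice:

```r
library(spatstat.geom)

# Hypothetical stand-ins for W, endo and peri (not the real annotations).
W    <- owin(c(0, 100), c(0, 100))
endo <- owin(poly = list(x = c(10, 40, 40, 10), y = c(10, 10, 30, 30)))
peri <- owin(poly = list(x = c(60, 90, 90, 60), y = c(50, 50, 90, 90)))

# Common pixel grid covering the whole sampling window.
grid <- as.mask(W, dimyx = 128)

# TRUE/FALSE image: is the pixel centre inside an endomucin structure?
endo_im <- as.im(function(x, y) inside.owin(x, y, endo), W = grid)

# Factor-valued covariate: which kind of structure each location falls in.
region <- function(x, y) {
  lab <- rep("other", length(x))
  lab[inside.owin(x, y, peri)] <- "perilipin"
  lab[inside.owin(x, y, endo)] <- "endomucin"
  factor(lab, levels = c("other", "endomucin", "perilipin"))
}
region_im <- as.im(region, W = grid)

# Distance from any location to the nearest endomucin structure
# (zero inside the structure itself).
d_endo <- distfun(endo)
d_endo(0, 20)    # a point 10 units to the left of the rectangle
d_endo(20, 20)   # a point inside the rectangle
```

Pixel images and functions of (x, y) such as these can be supplied as spatial covariates to functions like rhohat() or ppm(); which form is more appropriate depends on the analysis.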