rncdf4r-stars

Crop netcdf files in R


I'm trying to crop a netcdf file with a polygon with stars package from daily netcdf data. I think I have managed to do it and could get this plot

enter image description here

with this script

library(tidyverse)
library(sf)
library(stars)

# Input nc file
nc.file <- "20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc"
# read nc data
nc.data <- read_ncdf(nc.file, var="analysed_sst")

# Read mask coordinates
coordenades.poligon <- read_csv("coordenades_poligon.csv")
colnames(coordenades.poligon) <- c("lon","lat")

# Build sf polygon to crop data
polygon <- coordenades.poligon %>%
  st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  summarise(geometry = st_combine(geometry)) %>%
  st_cast("POLYGON")

# Crop data
nc.stars.crop <- st_crop(nc.data,polygon)

# plot
ggplot() + geom_stars(data=nc.stars.crop) +
  coord_equal() + theme_void() +
  scale_x_discrete(expand=c(0,0))+
  scale_y_discrete(expand=c(0,0))

Now I would like to combine lon, lat and analysed_sst in a data frame. I managed to extract coordinates with

nc.stars.coords <- as.data.frame(st_coordinates(nc.stars.crop))

But can't find how to get the corresponding sst values to cbind with longitude and latitude. Maybe there are other solutions with ncdf4 package.

Thank you very much for your help

EDIT 1

Link to SST original data (nc file): SST data

EDIT 2 Added head of coordenades_poligons.csv. First columns are longitude and latitude points, third column is the area ID and fourth one denotes the season. These are just the coordinates of a single area filtered by ID and season.

12.5,44.5,Z1,S
2,44.5,Z1,S
0,41.5,Z1,S
4,40,Z1,S
9,40,Z1,S
9,42,Z1,S
0,41.5,Z2,S

Solution

  • I am making assumptions here, because this is not my area of expertise, but you are able to simply transform this into a dataset using the raster-package. This seems to be the way to go, also according to this author.

    raster::as.data.frame(nc.stars.crop, xy = TRUE)
    

    At least for me this worked. And then you could transform it back into a simple features object, if you are so inclined with

    raster::as.data.frame(nc.stars.crop, xy = TRUE) %>% 
    sf::st_as_sf(coords = c('lon','lat'))
    

    However, the transformation to lon/lat is not really exact, because it produces point data, whereas the original information is raster data. So there is clearly information that gets lost.

    sf::st_as_sf() seems to work out of the box for this, but I am not sure, because I have no way to validate the transformation of the original data. For me the following worked:

    read_ncdf('20220301120000-NCEI-L4_GHRSST-SSTblend-AVHRR_OI-GLOB-v02.0-fv02.1.nc', var="analysed_sst") %>%
      sf::st_as_sf()
    

    This creates polygons, the size of your initial raster tiles and seems to conserve all necessary information.

    Finally, here is a work-around to extracting exactly the data you were plotting. You can access the data that ggplot used, by assigning the ggplot to a variable and then accessing the data layer.

    p <- ggplot() + geom_stars(data=nc.stars.crop) +
    coord_equal() + theme_void() +
    scale_x_discrete(expand=c(0,0))+
    scale_y_discrete(expand=c(0,0))
    
    p$layers[[1]]$data