r netcdf4 satellite

Combining .nc files and extracting selected variables


I have a similar question to u/Ananas here: Sentinel3 OLCI (chl) Average of netcdf files on Python. I am running into similar problems, in that I cannot seem to extract the necessary information from the .nc files and then merge them to create a time series. In my case, I am trying to do this in R. My current code, which I followed and customised from here: https://www.youtube.com/watch?v=jWRszWCVWLc&t=1504s , returns an error:

Error in `[<-.data.frame`(`*tmp*`, variable, value = c(0, 0, 0, 0, 0,  : 
  replacement has 1927 rows, data has 2202561

Maybe I am going about it the wrong way from the start, and R's capabilities with .nc files are not suited for this? Any suggestions are welcome.

Here is my code

extract_variable_from_netcdf<- function(nc,variable){
  tryCatch(
    {
      result<-var.get.nc(nc,variable)
      return(result)
    },
    error=function(cond){
      message(paste(variable,"attribute not found"))
      message("Here is the original error message")
      message(cond)
    }
  )
}
extract_global_attribute_from_netcdf<- function(nc,global_attribute){
  tryCatch(
    {
      result<-att.get.nc(nc,"NC_GLOBAL",global_attribute)
      return(result)
    },
    error=function(cond){
      message(paste(global_attribute,"attribute not found"))
      message("Here is the original error message")
      message(cond)
    }
  )
}


folder<- "path to folder"
files<- list.files(folder, pattern= ".nc", full.names = TRUE)

variables<- c("conc_chl", "iop_bpart","lat", "lon") #variables I need to extract
global_attrs<- c("start_date", "stop_date")
headers<-c(global_attrs,variables)

df<-data.frame(matrix(ncol=length(headers), nrow=0))
colnames(df)<- headers
for(file in files) {
  nc<- open.nc(file)
  chl<- var.get.nc(nc, "conc_chl")
  num_chl<- length(chl)
  newdf<- data.frame(matrix(ncol=length(headers), nrow=num_chl))
  colnames(newdf)<- headers
  for (global_attribute in global_attrs) {
    newdf[global_attribute]<-extract_global_attribute_from_netcdf(nc,global_attribute)
  }

  for (variable in variables) {
    newdf[variable]<-extract_variable_from_netcdf(nc,variable)
  }

  df<-merge(df,newdf,all=TRUE)
}

Solution

  • The way I have used ".nc" files with satellite data in R is to read them in with the "raster" library as raster objects.

    library(raster)
    
    r <- raster("your_file.nc")
    plot(r) # quick plot to see if everything is as it should be
    

    I read in my time series with a loop, and I also used a function I found somewhere on this site to convert the raster into a sensible R data frame.

    Stack Overflow function to convert the loaded raster to a data frame:

    gplot_data <- function(x, maxpixels = 50000)  {
      x <- raster::sampleRegular(x, maxpixels, asRaster = TRUE)
      coords <- raster::xyFromCell(x, seq_len(raster::ncell(x)))
      ## Extract values
      dat <- utils::stack(as.data.frame(raster::getValues(x))) 
      names(dat) <- c('value', 'variable')
      
      dat <- dplyr::as_tibble(data.frame(coords, dat)) # as.tbl() is deprecated
      
      if (!is.null(levels(x))) {
        dat <- dplyr::left_join(dat, levels(x)[[1]], 
                                by = c("value" = "ID"))
      }
      dat
    }
    

    Read in one file at a time, convert it with the function, and return a data frame:

    # "\\.nc$" matches only names ending in ".nc" (a bare ".nc" is a regex where "." matches any character)
    files <- list.files(folder, pattern = "\\.nc$", full.names = TRUE)
    
    fun <- function(i) {
      # read in one file at a time
      r <- raster(files[i])
      
      # convert to normal data frame
      temp <- gplot_data(r)
      temp # output
    }
    # bind each iteration; seq_along() is safe when files is empty
    dat <- plyr::rbind.fill(lapply(seq_along(files), fun))
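
After binding, the rows no longer say which file they came from, which a time series needs. One possible way to keep that information is to tag each per-file data frame with its file name before binding. This is only a sketch in base R: the file names and values below are made up, and in practice each list element would be the output of gplot_data() for one file.

```r
# Hypothetical per-file results; in practice each element would come from
# gplot_data(raster(f)) for one file f
per_file <- list(
  "chl_2021-06-01.nc" = data.frame(x = c(1, 2), y = c(5, 5), value = c(0.4, 0.6)),
  "chl_2021-06-02.nc" = data.frame(x = c(1, 2), y = c(5, 5), value = c(0.5, 0.7))
)

# Add the source file name as a column so rows stay traceable after binding
tagged <- Map(
  function(d, f) data.frame(file = f, d, stringsAsFactors = FALSE),
  per_file, names(per_file)
)
dat <- do.call(rbind, tagged)
rownames(dat) <- NULL

# Mean value per file, i.e. one point per file in a simple time series
series <- aggregate(value ~ file, data = dat, FUN = mean)
print(series)
```

If the file names contain the acquisition date, it could be parsed out of the file column afterwards; otherwise the date would have to come from the file's metadata.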
    

    Here is a plot using ggplot2 and ggforce.

    library(ggplot2)

    ggplot() +
      geom_tile(data = dat,
                aes(x = x, y = y, fill = value))
    

    Alternatively, if you do not know the contents of your file, the following, from the "ncdf4" package, will help you inspect it. https://towardsdatascience.com/how-to-crack-open-netcdf-files-in-r-and-extract-data-as-time-series-24107b70dcd

    library(ncdf4)
    our_nc_data <- nc_open("/your_file.nc")
    
    print(our_nc_data)
    
    # look for the variable names and assign them to vectors that can be bound together in dataframes
    lat <- ncvar_get(our_nc_data, "lat") #names of latitude column
    lon <- ncvar_get(our_nc_data, "lon") #name of longitude column
    
    time <- ncvar_get(our_nc_data, "time") #the time was called time
    tunits <- ncatt_get(our_nc_data, "time", "units")# check units
    
    lswt_array <- ncvar_get(our_nc_data, "analysed_sst") #select the relevant variable, this is temperature named "analysed_sst"
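
From arrays like these, a long data frame with one row per grid cell can be built by pairing every coordinate combination with the flattened array. The numbers in the question's error fit this picture: 2202561 = 1927 × 1143, so the chlorophyll variable looks like a 2-D grid with 1927 as one of its dimensions, which is probably why the pieces cannot be dropped into data frame columns as they are. Below is a minimal base-R sketch using small synthetic stand-ins for the ncvar_get() results. The values, the [lon, lat] dimension order and the date are assumptions, so check print(our_nc_data) for the real layout; if lat and lon are themselves 2-D in your files, flatten them with as.vector() instead of using expand.grid().

```r
# Synthetic stand-ins for what ncvar_get() would return (assumed shapes)
lon <- c(10.0, 10.5, 11.0)   # 3 longitudes
lat <- c(55.0, 55.5)         # 2 latitudes
chl <- matrix(1:6, nrow = length(lon), ncol = length(lat))  # [lon, lat] grid

# expand.grid() varies its first argument fastest, which matches how
# as.vector() flattens a [lon, lat] matrix (column by column)
grid <- expand.grid(lon = lon, lat = lat)
long_df <- data.frame(grid, conc_chl = as.vector(chl))

# A single value per file (for example a date read from a global attribute)
# is recycled to every row; this date is made up
long_df$start_date <- as.Date("2021-06-01")

print(long_df)
```

Data frames built this way for each file have identical columns, so they can be stacked with rbind() to form the time series.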